I am trying to get the string value of each link (for example, Pennsylvania).
<li class="facetbox-shownrow ">
<a href="/bill/116th-congress/house-bill/9043/cosponsors?r=1&s=1&q={"search":["H.R.9043","H.R.9043"],"cosponsor-state":"Pennsylvania"}" title="include this search constraint" id="facetItemcosponsor-statePennsylvania">
Pennsylvania <span id="facetItemcosponsor-statePennsylvaniacount" class="count">[1]</span> </a>
</li>
But because of the title and id attributes, I don't quite understand how to do it. When I display my array, I get an empty result. Here is my code:
for link in links_array:
    main_url_link = base_url_link + link
    html_page_link = requests.get(main_url_link)
    soup_link = BeautifulSoup(html_page_link.text, 'html.parser')
    allData_link = soup_link.findAll('li',{'class':'facetbox-shownrow'})
    distric = [y.text_content() for y in allData_link]
    district_array.append(distric)
district_array
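Answer from a uj5u.com user:
Use .stripped_strings on each selected element to generate a list of its strings, then pick or slice the result. In this case, take the first element to get Pennsylvania: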
[list(x.stripped_strings)[0] for x in soup.find_all('li',{'class':'facetbox-shownrow'})]
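To see why the first element is the one to pick, it can help to look at everything .stripped_strings yields for a single row. The following is a minimal sketch using a shortened copy of the markup from the question; href="#" is a placeholder chosen for readability, not the real link.

```python
from bs4 import BeautifulSoup

# Shortened copy of one row from the question; href="#" is a placeholder.
html = """
<li class="facetbox-shownrow ">
<a href="#" title="include this search constraint" id="facetItemcosponsor-statePennsylvania">
Pennsylvania <span id="facetItemcosponsor-statePennsylvaniacount" class="count">[1]</span> </a>
</li>
"""

soup = BeautifulSoup(html, "html.parser")
row = soup.find("li", {"class": "facetbox-shownrow"})

# Every non-empty text fragment in the row, with surrounding whitespace removed.
strings = list(row.stripped_strings)
print(strings)   # ['Pennsylvania', '[1]']

# The state name is the first fragment; the count inside <span> is the second.
state = strings[0]
print(state)     # Pennsylvania

# get_text() joins the fragments instead of keeping them separate.
joined = row.get_text(strip=True)
print(joined)    # Pennsylvania[1]
```

Note that text_content(), used in the question, is a method of lxml elements; BeautifulSoup tags offer get_text() and .stripped_strings instead.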
Note: new code should use find_all(); findAll() still works, but it is very old syntax.
To get the href:
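As a rough illustration of the current spellings, the sketch below selects the same rows in three ways: find_all() with an attribute dict, find_all() with the class_ keyword, and a CSS selector via select(). The markup is a made-up two-row stand-in, not the real page.

```python
from bs4 import BeautifulSoup

# Made-up stand-in markup: one matching row and one row that should be ignored.
html = (
    '<ul>'
    '<li class="facetbox-shownrow "><a href="#">Pennsylvania</a></li>'
    '<li class="other"><a href="#">skip</a></li>'
    '</ul>'
)
soup = BeautifulSoup(html, "html.parser")

# Three ways to select <li> elements carrying the facetbox-shownrow class.
by_dict = soup.find_all("li", {"class": "facetbox-shownrow"})
by_kwarg = soup.find_all("li", class_="facetbox-shownrow")
by_css = soup.select("li.facetbox-shownrow")

names_dict = [li.get_text(strip=True) for li in by_dict]
names_kwarg = [li.get_text(strip=True) for li in by_kwarg]
names_css = [li.get_text(strip=True) for li in by_css]
print(names_dict, names_kwarg, names_css)
```

The trailing space in class="facetbox-shownrow " does not matter here, because BeautifulSoup treats class as a whitespace-separated list of values.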
[x.a['href'] for x in soup.find_all('li',{'class':'facetbox-shownrow'})]
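If both the name and the link are needed, the two comprehensions can be combined. This is only a sketch: BASE_URL is a hypothetical placeholder, the href is a shortened version of the one in the question, and urljoin from the standard library is used to turn the relative link into an absolute one.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical base URL, used only to illustrate resolving relative links.
BASE_URL = "https://example.com"

# Shortened version of the row from the question (query string abbreviated).
html = """
<li class="facetbox-shownrow ">
<a href="/bill/116th-congress/house-bill/9043/cosponsors?r=1" title="include this search constraint">
Pennsylvania <span class="count">[1]</span> </a>
</li>
"""

soup = BeautifulSoup(html, "html.parser")

pairs = []
for li in soup.find_all("li", {"class": "facetbox-shownrow"}):
    link = li.a
    if link is None or not link.has_attr("href"):
        continue  # skip rows without a usable link
    # First non-empty text fragment of the link, or None if it has no text.
    name = next(link.stripped_strings, None)
    pairs.append((name, urljoin(BASE_URL, link["href"])))

print(pairs)
```

Checking for a missing a tag or href avoids a TypeError or KeyError on rows that do not follow the expected layout.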
Example
With multiple li tags:
from bs4 import BeautifulSoup
html="""
<li class="facetbox-shownrow ">
<a href="/bill/116th-congress/house-bill/9043/cosponsors?r=1&s=1&q=%7B"search"%3A%5B"H.R.9043"%2C"H.R.9043"%5D%2C"cosponsor-state"%3A"Pennsylvania"}" title="include this search constraint" id="facetItemcosponsor-statePennsylvania">
Pennsylvania <span id="facetItemcosponsor-statePennsylvaniacount" class="count">[1]</span> </a>
</li>
<li class="facetbox-shownrow ">
<a href="/bill/116th-congress/house-bill/9043/cosponsors?r=1&s=1&q=%7B"search"%3A%5B"H.R.9043"%2C"H.R.9043"%5D%2C"cosponsor-state"%3A"Pennsylvania"}" title="include this search constraint" id="facetItemcosponsor-statePennsylvania">
Main <span id="facetItemcosponsor-statePennsylvaniacount" class="count">[1]</span> </a>
</li>
<li class="facetbox-shownrow ">
<a href="/bill/116th-congress/house-bill/9043/cosponsors?r=1&s=1&q=%7B"search"%3A%5B"H.R.9043"%2C"H.R.9043"%5D%2C"cosponsor-state"%3A"Pennsylvania"}" title="include this search constraint" id="facetItemcosponsor-statePennsylvania">
California <span id="facetItemcosponsor-statePennsylvaniacount" class="count">[1]</span> </a>
</li>
"""
soup=BeautifulSoup(html,"html.parser")
[list(x.stripped_strings)[0] for x in soup.find_all('li',{'class':'facetbox-shownrow'})]
Output
['Pennsylvania', 'Main', 'California']
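Applied to the loop from the question, the same idea might look like the sketch below. extract_states is a hypothetical helper name, and the pages list holds stand-in HTML strings so the example runs without network access; in the question's code each string would be html_page_link.text for one link in links_array.

```python
from bs4 import BeautifulSoup


def extract_states(html):
    """Return the first text fragment of every facet row in one page."""
    soup = BeautifulSoup(html, "html.parser")
    states = []
    for li in soup.find_all("li", {"class": "facetbox-shownrow"}):
        # next(..., None) avoids an IndexError on rows with no text at all.
        first = next(li.stripped_strings, None)
        if first is not None:
            states.append(first)
    return states


# Stand-in pages (assumed markup, modelled on the rows shown above).
pages = [
    '<ul><li class="facetbox-shownrow "><a href="#">Pennsylvania '
    '<span class="count">[1]</span></a></li></ul>',
    '<ul><li class="facetbox-shownrow "><a href="#">Main '
    '<span class="count">[1]</span></a></li>'
    '<li class="facetbox-shownrow "><a href="#">California '
    '<span class="count">[1]</span></a></li></ul>',
]

# One list of names per page, mirroring district_array in the question.
district_array = [extract_states(page) for page in pages]
print(district_array)  # [['Pennsylvania'], ['Main', 'California']]
```

Keeping the parsing in a small function separates it from the requests.get call, so it can be checked against saved HTML before running it on live pages.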
