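I want to extract the number that comes before "opiniones". I can find the span that contains the number, but I can't retrieve the number itself.
Code example: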
list_rest = []
for res_name, res_stats in zip(top_rest, top_rest_info):
    dataframe = {}
    dataframe["pos"] = res_name.find('a').contents[0]
    dataframe["name"] = res_name.find('a').contents[-1]
    # find() returns the whole <span> Tag, not the number inside it
    dataframe["number_of_reviews"] = res_stats.find("span", attrs={"class": "NoCoR"})
    list_rest.append(dataframe)
Output:
[{'pos': 'La Gourmesa',
'name': 'La Gourmesa',
'number_of_reviews': <span class="NoCoR">3<!-- --> opiniones</span>},
{'pos': '1',
'name': 'Parrilla Urbana División del Norte',
'number_of_reviews': <span class="NoCoR">486<!-- --> opiniones</span>},
{'pos': '2',
'name': 'La Mansion Marriott Reforma',
'number_of_reviews': <span class="NoCoR">730<!-- --> opiniones</span>},
{'pos': '3',
'name': 'Restaurante Condimento Emporio Reforma',
'number_of_reviews': <span class="NoCoR">283<!-- --> opiniones</span>},
{'pos': '4',
'name': "Porfirio's Coapa",
'number_of_reviews': <span class="NoCoR">468<!-- --> opiniones</span>}]
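How do I extract the number of reviews?
Answer from a uj5u.com user:
Here I use a small HTML sample to illustrate. You can use get_text() or the .text attribute to extract the text from the tag, split it on the space, and take the first field: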
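from bs4 import BeautifulSoup

# Every span carries the NoCoR class, as in the question's output
html = """<span class='NoCoR'>3<!-- --> opiniones</span>
<span class='NoCoR'>486<!-- --> opiniones</span>
<span class='NoCoR'>730<!-- --> opiniones</span>"""

soup = BeautifulSoup(html, "html.parser")
main_data = soup.find_all("span", attrs={"class": "NoCoR"})
for data in main_data:
    # get_text() gives e.g. "3 opiniones"; the first space-separated field is the number
    print(data.get_text().split(" ")[0])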
Output:
3
486
730
In your code, it should work like this:
dataframe["number_of_reviews"] = res_stats.find("span", attrs={"class": "NoCoR"}).get_text().split(" ")[0]
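The split-based line above returns the count as a string and raises an AttributeError if find() returns None for a listing without that span. As a minimal sketch, assuming you want an integer and want to tolerate missing spans, a small helper could do the conversion. The name parse_review_count is hypothetical, and the handling of "." or "," as thousands separators is an assumption about how larger counts might be displayed, not something shown in the question.

```python
import re
from typing import Optional


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Return the leading integer in text such as '486 opiniones', or None.

    Hypothetical helper: treats '.' and ',' inside the number as thousands
    separators (an assumption, not confirmed by the question's data).
    """
    if not text:
        return None
    # Match a leading run of digits, optionally containing '.' or ','.
    match = re.match(r"\s*(\d[\d.,]*)", text)
    if match is None:
        return None
    digits = re.sub(r"[.,]", "", match.group(1))
    return int(digits)


print(parse_review_count("486 opiniones"))
print(parse_review_count("no reviews yet"))
```

In the question's loop, this could be called as parse_review_count(span.get_text() if span else None), where span is the result of res_stats.find("span", attrs={"class": "NoCoR"}).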
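Answer from a uj5u.com user:
You are already using .contents in your code, so why not use it to get the number from the tag as well?
Solution
A tag's children are available in a list called .contents, so selecting the first element should solve your problem. Append .contents[0] to your line of code: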
res_stats.find("span", attrs={"class": "NoCoR"}).contents[0]
Example with a list of opinions
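from bs4 import BeautifulSoup

# Every span carries the NoCoR class so that the CSS selector matches all five
html = '''<span class='NoCoR'>3<!-- --> opiniones</span><span class='NoCoR'>486<!-- --> opiniones</span><span class='NoCoR'>730<!-- --> opiniones</span><span class='NoCoR'>283<!-- --> opiniones</span><span class='NoCoR'>468<!-- --> opiniones</span>'''

soup = BeautifulSoup(html, 'html.parser')
for opinion in soup.select('span.NoCoR'):
    # contents[0] is the text node that precedes the <!-- --> comment
    print(opinion.contents[0])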
Output
3
486
730
283
468
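The value that .contents[0] returns is a NavigableString, which prints like a number but is still text. A rough sketch of converting it to an int is shown here, assuming the markup always has the shape from the question (number, empty comment, then " opiniones"). The helper name first_child_as_int is hypothetical.

```python
from bs4 import BeautifulSoup


def first_child_as_int(span):
    """Convert the first child of a span (assumed to be the number) to int."""
    # NavigableString subclasses str, so int() accepts it directly.
    return int(span.contents[0])


html = "<span class='NoCoR'>486<!-- --> opiniones</span>"
span = BeautifulSoup(html, "html.parser").select_one("span.NoCoR")

# With html.parser the span has three children:
# the number, the empty comment, and the trailing text.
child_count = len(span.contents)
review_count = first_child_as_int(span)
print(child_count, review_count)
```

If the site ever drops the comment node or reorders the text, contents[0] would no longer be just the number, so this approach depends on the markup staying as it is.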
