How do I scrape text from an element that has multiple attributes?
<h2 class="_63-j _1rimQ" data-qa="heading">Popular Dishes</h2>
I tried this:
category = soup.find(name="h2", attrs={"class":"_63-j _1rimQ","data-qa":"heading"}).getText()
but it raises an error:
AttributeError: 'NoneType' object has no attribute 'getText'
I get the same error when I use this instead:
category = soup.find(name="h2",class_="_63-j _1rimQ")
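A uj5u.com user replied:
The content you want from that page is generated dynamically, so the HTML that BeautifulSoup parses does not contain it, and find() returns None. The page loads its data by sending a request to an API endpoint. Here is how you can call that endpoint directly with requests: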
import requests

link = 'https://cw-api.takeaway.com/api/v29/restaurant'
params = {
    'slug': 'c-pizza-c-kebab'
}
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36',
    'x-requested-with': 'XMLHttpRequest',
    'x-country-code': 'fr',
}

with requests.Session() as s:
    s.headers.update(headers)
    res = s.get(link, params=params)
    # The products are stored as a mapping of product id -> product details
    container = res.json()['menu']['products']
    for key, val in container.items():
        print(val['name'])
Output (truncated):
Kebab veau
Pot de kebabs
Pot de frites
Margherita
Bambino
Reine
Sicilienne
Végétarienne
Calzone soufflée jambon
Calzone soufflée bœuf haché
Pêcheur
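The loop above indexes straight into the JSON response, so it fails with a KeyError if the endpoint ever returns a different shape. A minimal sketch of a more defensive version follows, assuming the payload shape implied by the answer's code (`menu` -> `products` -> id -> `name`). The helper name `product_names`, the sample dict, and its ids `p1`/`p2` are hypothetical and stand in for `res.json()`.

```python
def product_names(payload):
    """Return the product names from a restaurant payload, or [] if absent."""
    # Assumed shape: {"menu": {"products": {<id>: {"name": <str>, ...}}}}
    products = payload.get("menu", {}).get("products", {})
    return [item["name"] for item in products.values() if "name" in item]


# Hypothetical sample standing in for res.json(); the ids are made up.
sample = {
    "menu": {
        "products": {
            "p1": {"name": "Kebab veau"},
            "p2": {"name": "Margherita"},
        }
    }
}

print(product_names(sample))
print(product_names({}))
```

With a live response you would call `product_names(res.json())` instead of passing the sample.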
A uj5u.com user replied:
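from bs4 import BeautifulSoup as bs
# The class attribute is restored from the question's markup; without it the lookup below finds nothing
html = """<h2 class="_63-j _1rimQ" data-qa="heading">Popular Dishes</h2>"""
soup = bs(html, 'html.parser')
soup.find('h2', class_='_63-j _1rimQ').getText()  # 'Popular Dishes'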
This works fine here. Perhaps the parser ('html.parser') makes the difference?
Tested with BeautifulSoup 4.10.0 and Python 3.10.2.
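Taken together, the two answers suggest the lookup itself is fine on static markup and only fails when the element is absent from the downloaded HTML. A small sketch of guarding against that case is below. The helper name `heading_text` and the empty `<div id='root'>` page are hypothetical, used only to imitate a page whose content is filled in later by JavaScript.

```python
from bs4 import BeautifulSoup


def heading_text(html):
    """Return the text of the first <h2 data-qa="heading">, or None if missing."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("h2", attrs={"data-qa": "heading"})
    # find() returns None when nothing matches; calling getText() on that
    # None is what raises the AttributeError shown in the question.
    return tag.get_text(strip=True) if tag is not None else None


static_html = '<h2 class="_63-j _1rimQ" data-qa="heading">Popular Dishes</h2>'
print(heading_text(static_html))
print(heading_text("<div id='root'></div>"))
```

Matching on `data-qa` alone avoids depending on the generated class names, which may change between site builds.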
