I am trying to fetch data for three states using the same URL format.
states = ['123', '124', '125']
urls = []
for state in states:
    url = f'www.something.com/geo={state}'
    urls.append(url)
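That gives me three separate URLs, each containing a different state ID.
However, when I process them with BeautifulSoup, the output only shows data from state 123.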
for url in urls:
    client = ScrapingBeeClient(api_key="API_KEY")
    response = client.get(url)
    doc = BeautifulSoup(response.text, 'html.parser')
I then extracted the columns I wanted with this:
listings = doc.select('.is-9-desktop')
rows = []
for listing in listings:
    row = {}
    try:
        row['name'] = listing.select_one('.result-title').text.strip()
    except:
        print("no name")
    try:
        row['add'] = listing.select_one('.address-text').text.strip()
    except:
        print("no add")
    try:
        row['mention'] = listing.select_one('.review-mention-block').text.strip()
    except:
        pass
    rows.append(row)
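But as mentioned, it only shows data for state 123. If anyone could tell me where I went wrong I would really appreciate it. Thanks!
Edit
I appended the parsed output for each URL to a list and was able to get data for all three states.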
doc = []
for url in urls:
    client = ScrapingBeeClient(api_key="API_KEY")
    response = client.get(url)
    docs = BeautifulSoup(response.text, 'html.parser')
    doc.append(docs)
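However, when I run it through BeautifulSoup, it produces this error message:
AttributeError: 'list' object has no attribute 'select'
Do I need to run it through another loop?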
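The error arises because `doc` is now a plain Python list, and `select` is a method of each BeautifulSoup object stored in it, not of the list itself. A minimal sketch of looping over the parsed documents follows. The inline HTML strings and the `pages` name are invented for illustration and stand in for `response.text` from each request, so nothing here touches the network.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the HTML returned for each state.
pages = [
    '<div class="is-9-desktop"><span class="result-title"> A </span></div>',
    '<div class="is-9-desktop"><span class="result-title"> B </span></div>',
    '<div class="is-9-desktop"><span class="result-title"> C </span></div>',
]

# One parsed document per page, collected in a list.
docs = [BeautifulSoup(html, 'html.parser') for html in pages]

names = []
for doc in docs:  # call select() on each document, not on the list
    for listing in doc.select('.is-9-desktop'):
        names.append(listing.select_one('.result-title').text.strip())

print(names)
```

So yes, a second loop over the list works, although parsing and extracting inside the same loop that fetches each URL avoids the intermediate list altogether.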
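uj5u.com user reply:
It doesn't need all those loops. Just iterate over the states and append each listing to rows.
Most importantly, rows = [] goes outside the for loops so that it does not overwrite itself.
Example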
from bs4 import BeautifulSoup
from scrapingbee import ScrapingBeeClient

states = ['123', '124', '125']
rows = []  # created once, so results from every state accumulate
for state in states:
    url = f'www.something.com/geo={state}'
    client = ScrapingBeeClient(api_key="API_KEY")
    response = client.get(url)
    doc = BeautifulSoup(response.text, 'html.parser')
    listings = doc.select('.is-9-desktop')
    for listing in listings:
        row = {}
        # select_one returns None when nothing matches, so .text raises AttributeError
        try:
            row['name'] = listing.select_one('.result-title').text.strip()
        except AttributeError:
            print("no name")
        try:
            row['add'] = listing.select_one('.address-text').text.strip()
        except AttributeError:
            print("no add")
        try:
            row['mention'] = listing.select_one('.review-mention-block').text.strip()
        except AttributeError:
            pass
        rows.append(row)
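The point about where rows is created can be seen in isolation. The sketch below uses an invented `fake_listings` dictionary in place of the scraped pages (an assumption for illustration only) and compares resetting the list inside the loop with creating it once before the loop.

```python
states = ['123', '124', '125']

# Hypothetical listings per state, standing in for the scraped results.
fake_listings = {
    '123': ['a', 'b'],
    '124': ['c'],
    '125': ['d', 'e'],
}

# List reset inside the loop: each pass discards the previous state's rows.
for state in states:
    rows_inside = []
    for name in fake_listings[state]:
        rows_inside.append({'state': state, 'name': name})

# List created once before the loop: every state's rows accumulate.
rows_outside = []
for state in states:
    for name in fake_listings[state]:
        rows_outside.append({'state': state, 'name': name})

print(len(rows_inside), len(rows_outside))
```

After the first loop only the rows from the final pass remain, while the second list holds the rows from all three states.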
uj5u.com user reply:
As mentioned in the comments, there are a few errors in your code. Try this modified version.
from bs4 import BeautifulSoup
from scrapingbee import ScrapingBeeClient

states = ['123', '124', '125']
urls = []
for state in states:
    url = f'www.something.com/geo={state}'
    urls.append(url)
rows = []  # created before the loop so every URL's rows are kept
for url in urls:
    client = ScrapingBeeClient(api_key="API_KEY")
    response = client.get(url)
    doc = BeautifulSoup(response.text, 'html.parser')
    # Extraction now happens inside the loop, once per page
    listings = doc.select('.is-9-desktop')
    for listing in listings:
        row = {}
        try:
            row['name'] = listing.select_one('.result-title').text.strip()
        except AttributeError:
            print("no name")
        try:
            row['add'] = listing.select_one('.address-text').text.strip()
        except AttributeError:
            print("no add")
        try:
            row['mention'] = listing.select_one('.review-mention-block').text.strip()
        except AttributeError:
            pass
        rows.append(row)
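The repeated try/except blocks could also be folded into a small helper. This is only a sketch: `text_or_none` is a hypothetical name that appears in neither answer, the HTML is invented sample data, and unlike the code above it stores None for a missing field instead of leaving the key out.

```python
from bs4 import BeautifulSoup


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


# Invented sample markup using the same class names as the question.
html = '''
<div class="is-9-desktop">
  <span class="result-title"> Cafe One </span>
  <span class="address-text"> 1 Main St </span>
</div>
<div class="is-9-desktop">
  <span class="result-title"> Cafe Two </span>
</div>
'''

doc = BeautifulSoup(html, 'html.parser')
rows = []
for listing in doc.select('.is-9-desktop'):
    rows.append({
        'name': text_or_none(listing, '.result-title'),
        'add': text_or_none(listing, '.address-text'),
        'mention': text_or_none(listing, '.review-mention-block'),
    })

print(rows)
```

Giving every row the same keys can make the result easier to load into a table later, at the cost of losing the "no name" and "no add" messages.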
