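I am scraping image data from https://www.turners.co.nz/Cars/Used-Cars-for-Sale and have run into a pagination problem: no matter which page number I set, only the first page's results come back. The code is as follows: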
# -*- coding:utf8 -*-
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36'}


def RequestWithPageno(pageno=1):
    form_data = {
        'sortorder': 0,
        'pagesize': 24,
        'pageno': pageno}
    # url = 'https://www.turners.co.nz/Cars/Used-Cars-for-Sale/?sortorder=0&pagesize=24&pageno='+str(pageno)
    # Note: data= puts form_data in the request body, not in the URL's query string
    response = requests.get(url, headers=headers, data=form_data)
    soup = BeautifulSoup(response.text, 'html.parser')
    divs = soup.select('#searchResultsContainer > div')
    for div in divs:
        # Printing only the first vehicle's info is enough
        goodnumber = div.get("data-goodnumber")
        print(goodnumber)
        break


# url = 'https://www.turners.co.nz/Cars/Used-Cars-for-Sale/?sortorder=0&pagesize=24&pageno=1'
url = 'https://www.turners.co.nz/Cars/Used-Cars-for-Sale'

if __name__ == "__main__":
    pageno = 1
    while True:
        print(pageno)
        RequestWithPageno(pageno=pageno)
        pageno += 1
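One thing worth checking in the code above: in requests, values passed through data= are sent as the request body, while params= is what gets encoded into the URL's query string. The commented-out URL carries sortorder, pagesize and pageno in the query string, so a GET that passes them through data= asks for the bare listing URL. Below is a minimal sketch, standard library only, that builds that query-string URL so it can be printed and compared. The names BASE_URL and build_page_url are made up for illustration, and whether the server actually honours a URL built this way has not been verified here.

```python
from urllib.parse import urlencode

# Listing URL taken from the post; the trailing slash matches the browser URLs.
BASE_URL = 'https://www.turners.co.nz/Cars/Used-Cars-for-Sale/'


def build_page_url(pageno, pagesize=24, sortorder=0, base_url=BASE_URL):
    """Return the listing URL with the paging values in the query string."""
    query = urlencode({
        'sortorder': sortorder,
        'pagesize': pagesize,
        'pageno': pageno,
    })
    return base_url + '?' + query


# Print the URL for page 2 so it can be compared with the browser's address bar.
print(build_page_url(2))
```

Passing params=form_data to requests.get instead of data=form_data would append the same query string to the URL.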
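While trying to solve the problem, I found the following:
In the browser, the first page's URL is https://www.turners.co.nz/Cars/Used-Cars-for-Sale/?sortorder=0&pagesize=24&pageno=1 and clicking the pagination links on the page navigates correctly; after paging, the URL is https://www.turners.co.nz/Cars/Used-Cars-for-Sale/?sortorder=0&pagesize=24&pageno=2
If I type https://www.turners.co.nz/Cars/Used-Cars-for-Sale/?sortorder=0&pagesize=24&pageno=2 directly into the browser's address bar, it automatically redirects to the first page.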
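The two observations above hint that the server may look at more than the URL, for example cookies set on the first visit or the Referer header. That is only a guess. The sketch below is one way to test it from Python: reuse a single session, send a Referer, and compare the pageno in the final URL with the one requested. LISTING_URL, pageno_of and fetch_page are hypothetical names, and the session argument is assumed to be a requests.Session that has already loaded page 1.

```python
from urllib.parse import urlparse, parse_qs

LISTING_URL = 'https://www.turners.co.nz/Cars/Used-Cars-for-Sale/'


def pageno_of(url):
    """Return the pageno query value of url as an int, or None if absent."""
    values = parse_qs(urlparse(url).query).get('pageno')
    return int(values[0]) if values else None


def fetch_page(session, pageno, headers=None):
    """Request one listing page and report where the server actually sent us.

    session is assumed to be a requests.Session that has already loaded
    page 1, so any cookies the site set are sent along (an assumption about
    why the browser's in-page navigation works).
    """
    request_headers = dict(headers or {})
    # Assumption: the site may check that the request came from the listing.
    request_headers['Referer'] = LISTING_URL
    response = session.get(
        LISTING_URL,
        params={'sortorder': 0, 'pagesize': 24, 'pageno': pageno},
        headers=request_headers,
    )
    # True only if the final URL still carries the page number we asked for.
    landed_on_requested = pageno_of(response.url) == pageno
    # response.history is non-empty when requests followed a redirect.
    redirected = bool(response.history)
    return response, landed_on_requested, redirected
```

To try it, create session = requests.Session(), call session.get(LISTING_URL, headers=headers) once, then call fetch_page(session, 2, headers). If the second returned value is False or the third is True, the server sent the request somewhere other than the page asked for, and the cause lies beyond the query string.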
Any pointers would be greatly appreciated!
Reply from a uj5u.com user:
How can this be solved without python+selenium?
