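The code runs fine and even loops through all the pages, but it does not stop at the last page. From page 15 onward it runs in an endless loop: page 15, page 16, page 15, page 16, and so on.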
import random
import urllib.request

import pandas as pd
from bs4 import BeautifulSoup as soup

data = []


def getdata(url):
    # Pick a random User-Agent for each request.
    user_agents = [
        "chrome/5.0 (Windows NT 6.0; Win64; x64",
        "chrome/5.0 (Windows NT 6.0; Win64; x32",
    ]
    user_agent = random.choice(user_agents)
    header_ = {'User-Agent': user_agent}
    req = urllib.request.Request(url, headers=header_)
    flipkart_html = urllib.request.urlopen(req).read()
    f_soup = soup(flipkart_html, 'html.parser')
    # The attribute selector is empty in the source; the attribute that
    # identifies a product card is not known, so it is left as posted.
    for e in f_soup.select('div[]'):
        try:
            asin = e.find('a', {'class': '_1fQZEK'})['href'].split('=')[1].split('&')[0]
        except (TypeError, KeyError, IndexError):
            asin = 'No ASIN Found'
        data.append({
            'ASIN': asin
        })
    return f_soup


def getnextpage(f_soup):
    try:
        # Takes the LAST pagination link on the page, whatever it is.
        page = f_soup.find_all('a', attrs={"class": '_1LKTO3'})[-1]['href']
        url = 'https://www.flipkart.com' + str(page)
    except (IndexError, KeyError):
        url = None
    return url


keywords = ['iphone']
for k in keywords:
    url = 'https://www.flipkart.com/search?q=' + k
    while True:
        geturl = getdata(url)
        url = getnextpage(geturl)
        if not url:
            break
        print(url)

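To terminate the loop, check whether the Next button exists, and stop if it does not. `getnextpage` always takes the last `_1LKTO3` link on the page. On the last page that link is most likely the Previous button, so the scraper goes back to page 15, then forward to page 16, and repeats. Because a link is always found, `url` is never `None` and the `break` is never reached.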
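A minimal sketch of that check, assuming the Previous and Next buttons share the class `_1LKTO3` and that the Next button's visible text is "Next" (neither is confirmed by the question, so inspect the page to verify). `pick_next_href` and `BASE_URL` are hypothetical names introduced here for illustration; `getnextpage` keeps the name and signature from the question.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.flipkart.com"


def pick_next_href(links, base_url=BASE_URL):
    """Return the absolute URL of the Next link, or None if there is none.

    `links` is an iterable of (text, href) pairs, one per pagination anchor.
    Assumption: the Next button's visible text is "Next".
    """
    for text, href in links:
        if href and text.strip().lower() == "next":
            return urljoin(base_url, href)
    return None


def getnextpage(f_soup):
    # Same class the question uses for the pagination links.
    anchors = f_soup.find_all("a", attrs={"class": "_1LKTO3"})
    # Match on the link text instead of taking the last anchor blindly.
    return pick_next_href((a.get_text(), a.get("href")) for a in anchors)
```

With this version, the last page yields `None`, so the existing `if not url: break` in the `while True` loop ends the crawl.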
