I want to scrape all the images of women's chiffon dresses from Taobao.
#http://s.taobao.com/list?spm=a217f.8051907.312344.11.4aca33080R6p5b&q=雪紡裙&cat=16&style=grid&seller_type=taobao&bcoffset=0&s=0
import urllib.request
import re
keywd="雪紡裙"
keyname=urllib.request.quote(keywd)
headers=("user-agent","Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36")
opener=urllib.request.build_opener()
opener.addheaders=[headers]
urllib.request.install_opener(opener)
for i in range(1,5):
    url="http://s.taobao.com/list?spm=a217f.8051907.312344.11.4aca33080R6p5b&q="+keyname+"&cat=16&style=grid&seller_type=taobao&bcoffset=0&s="+str(i*60-1)
    data=urllib.request.urlopen(url).read().decode("utf-8","ignore")
    # Capture everything after pic_url":"// up to the closing quote
    pat='pic_url":"//(.*?)"'
    result=re.compile(pat).findall(data)
    for j in range(0,len(result)):
        thisresult=result[j]
        thisurl="http://"+thisresult
        file="F:/22/img/"+str(i)+str(j)+".jpg"
        urllib.request.urlretrieve(thisurl,filename=file)
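The script runs without raising any error. Stepping through it, I found that print(result) just before the second loop shows an empty list, yet the regex looks correct to me. Could someone take a look and tell me what is causing this?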
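A quick way to narrow this down is to inspect what the server actually returned before blaming the regex. The sketch below is only an illustration: the helper names `extract_pic_urls` and `describe_response` and the sample strings are made up for this example, and it assumes (without confirming it for this site) that an empty list may simply mean the returned page does not contain a `pic_url` field at all.

```python
import re

# Same pattern as in the question: capture what follows pic_url":"//
PIC_PATTERN = re.compile(r'pic_url":"//(.*?)"')


def extract_pic_urls(page_text):
    """Return the image URLs found in page_text (empty list if none)."""
    return ["http://" + match for match in PIC_PATTERN.findall(page_text)]


def describe_response(page_text):
    """Summarise a response body so an empty result can be explained."""
    return {
        "length": len(page_text),                  # how much text came back
        "has_pic_url": "pic_url" in page_text,     # is the field present at all?
        "matches": len(extract_pic_urls(page_text)),
    }


# Hypothetical sample bodies, not real responses from the site.
sample_with_field = '{"pic_url":"//img.example.com/a.jpg","title":"x"}'
sample_without_field = "<html><title>login</title></html>"

print(describe_response(sample_with_field))
print(describe_response(sample_without_field))
```

If `has_pic_url` is False for the real `data`, the regex is not the problem: the page that came back is not the one seen in the browser, and printing the first few hundred characters of `data` should show what it is instead.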
Reply from a helpful uj5u.com user:
Try wrapping it in a try block so that errors can be skipped.
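One possible reading of this suggestion is to wrap each download in try/except so that a single failing image does not stop the loop. The following is a minimal sketch under that assumption; the function name `download_all`, its `fetch` parameter and the numbered file naming are hypothetical and not part of the original script. Note that this only guards the download step and would not by itself explain an empty result list.

```python
import urllib.error
import urllib.request


def download_all(urls, dest_dir, fetch=urllib.request.urlretrieve):
    """Try to download every URL, skipping the ones that fail.

    Returns (saved, failed): saved is a list of file paths,
    failed is a list of (url, exception) pairs.
    """
    saved, failed = [], []
    for index, url in enumerate(urls):
        target = f"{dest_dir}/{index}.jpg"
        try:
            fetch(url, filename=target)
        except (OSError, ValueError) as exc:
            # URLError and HTTPError are subclasses of OSError;
            # ValueError covers malformed URLs.
            failed.append((url, exc))
            continue
        saved.append(target)
    return saved, failed
```

Recording the failures instead of silently passing over them makes it possible to print `failed` afterwards and see which URLs were skipped and why.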
