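Taobao image crawler does not work and I am looking for the cause. The program raises no error, but the regex match comes back empty. Please help.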
# Taobao product image crawler
import urllib.request
import re
import random
keyname="連衣裙"  # search keyword ("dress")
key=urllib.request.quote(keyname)
uapools=[
    "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3314.0 Safari/537.36 SE 2.X MetaSr 1.0",
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; ) AppleWebKit/534.12 (KHTML, like Gecko) Maxthon/3.0 Safari/534.12"
]
def ua(uapools):
    # Pick a random User-Agent and attach it to a new opener
    thisua=random.choice(uapools)
    print(thisua)
    headers=("User-Agent",thisua)
    opener=urllib.request.build_opener()
    opener.addheaders=[headers]
    # Install the opener globally so urlopen() uses it
    urllib.request.install_opener(opener)
for i in range(1,101):
    # Each result page holds 44 items; "s" is the item offset
    url="https://s.taobao.com/search?q="+key+"&s="+str((i-1)*44)
    ua(uapools)
    data=urllib.request.urlopen(url).read().decode("utf-8","ignore")
    pat='pic_url":"//(.*?)"'
    imglist=re.compile(pat).findall(data)
    print(imglist)
    for j in range(0,len(imglist)):
        thisimg=imglist[j]
        thisimgurl="http://"+thisimg
        localfile="D:/Python練習/淘寶圖片/"+str(i)+str(j)+".jpg"
        urllib.request.urlretrieve(thisimgurl,filename=localfile)
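When the pattern returns an empty list without raising anything, one way to narrow it down is to check whether the text the pattern depends on appears in the downloaded page at all. The sketch below uses a hypothetical helper, `diagnose_page`, which is not part of the script above, and runs it on made-up sample strings rather than real Taobao responses.

```python
import re

def diagnose_page(data, pat, key="pic_url"):
    """Summarise why a pattern might find nothing in a downloaded page.

    Hypothetical helper for illustration only; `key` is the literal text
    that the pattern relies on.
    """
    return {
        "length": len(data),                        # size of the decoded page
        "key_count": data.count(key),               # occurrences of the key
        "match_count": len(re.findall(pat, data)),  # what the regex finds
    }

# Made-up sample texts (assumptions, not real Taobao output).
page_with_key = '{"pic_url":"//img.example.com/a.jpg","title":"x"}'
page_without_key = "<html><body>please log in</body></html>"
pat = 'pic_url":"//(.*?)"'

print(diagnose_page(page_with_key, pat))
print(diagnose_page(page_without_key, pat))
```

If `key_count` is 0 for the real `data`, the page simply does not contain the expected text, and changing the regex alone will not help. If `key_count` is positive while `match_count` is 0, the pattern is the place to look.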
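uj5u.com helpful user replied:
The regular expression is probably wrong.

uj5u.com helpful user replied:
Your data is not the same as the original data either.

uj5u.com helpful user replied:
There is a problem with the regular expression. Why are there three double quotes inside one pair of single quotes?

uj5u.com helpful user replied:
The regular expression is wrong, and the urllib library is awkward to use. I usually use requests.

uj5u.com helpful user replied:
I ran into the same problem.

uj5u.com helpful user replied:
I ran into the same problem.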
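On the quoting question raised in the replies: a single-quoted Python string may contain unescaped double quotes, so the pattern is at least a valid literal. The short check below is a sketch against a made-up fragment; the sample text and the `img.example.com` host are assumptions, and whether the live page contains text of this shape is a separate question.

```python
import re

# The pattern from the question: single quotes outside, double quotes inside.
pat = 'pic_url":"//(.*?)"'

# The same pattern written with escaped double quotes is an equal string.
same_pat = "pic_url\":\"//(.*?)\""
print(pat == same_pat)

# Made-up fragment shaped like the text the pattern expects.
sample = '"pic_url":"//img.example.com/1.jpg","price":"99","pic_url":"//img.example.com/2.jpg"'
found = re.findall(pat, sample)
print(found)
```

Here `found` holds the two host-and-path strings without the leading `//`, which is what the download loop in the question prepends `http://` to.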
