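A question about JS/AJAX dynamic rendering.
I ran into this problem while scraping Duitang (https://www.duitang.com/topics/#!hot-p3) and the job site 51job (https://search.51job.com/list/000000,000000,0000,00,9,99,ui,2,1.html).
Both sites are rendered dynamically with JavaScript (with JS disabled, neither shows any content). Using the same scraping code, Duitang prints page source that includes the rendered content, but the job site does not print the complete page. What is the reason (asynchronous vs. synchronous AJAX)?
How can I scrape the complete page from the job site? (I haven't learned selenium yet.)
The code is as follows: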
import urllib.error
import urllib.request

url = 'https://www.duitang.com/topics/#!hot-p3'
head = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
                  " AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/88.0.4324.96 Safari/537.36 Edg/88.0.705.50"
}
request = urllib.request.Request(url, headers=head)
html = ""
try:
    response = urllib.request.urlopen(request)
    html = response.read().decode('utf-8')
    print(html)
except urllib.error.URLError as e:
    # HTTPError carries a status code; a plain URLError only has a reason
    if hasattr(e, "code"):
        print(e.code)
    if hasattr(e, "reason"):
        print(e.reason)
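One detail that may matter when comparing the two sites: the part of a URL after `#` is a fragment, and HTTP clients do not send it to the server. So the request above most likely fetches the plain topics page, and the `#!hot-p3` part would only be interpreted by JavaScript running in a browser. A minimal sketch using the standard library's `urllib.parse.urldefrag` to show how the URL splits (the variable names `requested_url` and `fragment` are only illustrative):

```python
from urllib.parse import urldefrag

url = "https://www.duitang.com/topics/#!hot-p3"

# urldefrag splits a URL into the part sent to the server and the fragment.
requested_url, fragment = urldefrag(url)

print(requested_url)  # the URL the server actually sees
print(fragment)       # handled client-side only, never sent over HTTP
```

If the printed source for Duitang already contains content, that content is probably part of the base page's HTML rather than something selected by the fragment.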
uj5u.com helpful user replied:
This is the code for scraping the job site; it is almost unchanged:
import urllib.error
import urllib.request

url = 'https://search.51job.com/list/000000,000000,0000,00,9,99,ui,2,1.html'
head = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
                  " AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/88.0.4324.96 Safari/537.36 Edg/88.0.705.50"
}
request = urllib.request.Request(url, headers=head)
html = ""
try:
    response = urllib.request.urlopen(request)
    # this site is decoded as GBK rather than UTF-8
    html = response.read().decode('gbk')
    print(html)
except urllib.error.URLError as e:
    if hasattr(e, "code"):
        print(e.code)
    if hasattr(e, "reason"):
        print(e.reason)
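One possibility worth checking before reaching for selenium: some dynamically rendered pages ship their data as a JSON object assigned to a JavaScript variable inside a `<script>` tag, and the browser builds the visible page from it. If that is the case here, the data may already be in the `html` string even though the rendered listing is not. The sketch below is only an assumption about how such a page might look: the variable name `window.__DATA__`, the helper `extract_embedded_json`, and the sample HTML are all hypothetical, so inspect the printed source to see what the site really uses.

```python
import json
import re


def extract_embedded_json(html, var_name):
    """Return the JSON value assigned to var_name in the page, or None."""
    # Find "var_name =" and position ourselves right after the equals sign.
    match = re.search(re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever follows it.
        data, _ = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return data


# Hypothetical page fragment, used only to demonstrate the helper.
sample_html = '<script>window.__DATA__ = {"jobs": [{"title": "engineer"}]};</script>'
print(extract_embedded_json(sample_html, "window.__DATA__"))
```

If no such variable appears in the source, the data is probably fetched by a separate AJAX request, which can usually be found in the Network tab of the browser's developer tools and requested directly.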
uj5u.com helpful user replied:
Bump.
