Goal: use Scrapy to scrape the author and content of each joke on Qiushibaike (Scrapy only, please; no other libraries).
https://www.qiushibaike.com/text/
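I suspect the XPath on line 10 of my code is wrong: printing the result shows nothing, and no error is raised. Could someone please correct it?
Also, are the XPaths on lines 14 and 15 wrong as well?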
import scrapy


class QiubaiSpider(scrapy.Spider):
    name = 'qiubai'
    # allowed_domains = ['www.xxx.com']
    start_urls = ['https://www.qiushibaike.com/text/']

    def parse(self, response):
        div_list = response.xpath('//div[@class="col1 old-style-col1"]/div')
        print(div_list)
        for div in div_list:
            # author = div.xpath('./div[@class="author clearfix"]/a[2]/h2/text()')[0].extract()
            author = div.xpath('./div[1]/a[2]/h2/text()')[0].extract()
            content = div.xpath('./a[1]/div/span//text()').extract()
            print(author, content)
            break
