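I am trying to scrape stock prices from Google Finance using Scrapy. The code does not raise any errors, but the output file is empty.
The code is pasted below: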
import scrapy

bse_list = ['quote/ABB:NSE', 'quote/AEGISLOG:NSE', 'quote/AMARAJABAT:NSE',
            'quote/AMBALALSA:NSE', 'quote/HDFC:NSE', 'quote/ANDHRAPET:NSE',
            'quote/ANSALAPI:NSE']


class CrawlSpider(scrapy.Spider):
    name = 'crawl'
    allowed_domains = ['www.google.com/finance/']
    start_urls = ['https://google.com/finance/']

    def parse(self, response):
        for stock in bse_list:
            url_new = response.urljoin(stock)
            yield scrapy.Request(url_new, callback = self.parse_book)

    def parse_book(self, response):
        stock_name = response.xpath('//*[@]/text()').extract_first()
        current_price = response.xpath('//*[@]/text()').extract_first()
        stock_info = response.xpath('//*[@]/text()').extract()
        last_closing_price = stock_info[0]
        day_range = stock_info[1]
        year_range = stock_info[2]
        market_cap = stock_info[3]
        p_e_ratio = stock_inf[4]
        yield {
            "stock_name": stock_name,
            "current_price": current_price,
            "last_closing_price": last_closing_price,
            "day_range": day_range,
            "year_range": year_range,
            "market_cap": market_cap,
            "p_e_ratio": p_e_ratio
        }
Reply from a uj5u.com user:
The problem is in the stock info selection; the rest of the code works correctly.
import scrapy

bse_list = ['quote/ABB:NSE', 'quote/AEGISLOG:NSE', 'quote/AMARAJABAT:NSE',
            'quote/AMBALALSA:NSE', 'quote/HDFC:NSE', 'quote/ANDHRAPET:NSE',
            'quote/ANSALAPI:NSE']


class CrlSpider(scrapy.Spider):
    name = 'crl'
    start_urls = ['https://google.com/finance/']

    def parse(self, response):
        # Request the quote page of every stock in the list.
        for stock in bse_list:
            url_new = response.urljoin(stock)
            yield scrapy.Request(url_new, callback=self.parse_book)

    def parse_book(self, response):
        stock_name = response.xpath('//*[@]/text()').extract_first()
        current_price = response.xpath('//*[@]/text()').extract_first()
        # The stock info selection is disabled because it is the part that fails.
        # stock_info = response.xpath('//*[@]/text()').extract()
        # last_closing_price = stock_info[0]
        # day_range = stock_info[1]
        # year_range = stock_info[2]
        # market_cap = stock_info[3]
        # p_e_ratio = stock_info[4]
        yield {
            "stock_name": stock_name,
            "current_price": current_price,
            # "last_closing_price": last_closing_price,
            # "day_range": day_range,
            # "year_range": year_range,
            # "market_cap": market_cap,
            # "p_e_ratio": p_e_ratio
        }
Output:
{'stock_name': 'Ansal Properties and Infrastructure Ltd', 'current_price': '₹13.30'}
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/ANDHRAPET:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AMBALALSA:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AEGISLOG:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/ABB:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/HDFC:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AMARAJABAT:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/ANDHRAPET:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AMBALALSA:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AEGISLOG:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/ABB:NSE>
{'stock_name': 'ABB India Ltd', 'current_price': '₹2,139.00'}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/HDFC:NSE>
{'stock_name': 'Housing Development Finance Corp Ltd', 'current_price': '₹2,994.15'}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AMARAJABAT:NSE>
{'stock_name': 'Amara Raja Batteries Ltd', 'current_price': '₹685.40'}
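
One possible way to bring the stock info fields back without crashing the callback is to avoid indexing the extracted list directly. Two things can go wrong in the question's version: `stock_inf[4]` is a typo for `stock_info[4]` and raises a NameError, and any index past the end of the list raises an IndexError when the selector matches fewer values than expected. The sketch below is only an illustration under assumptions: `map_stock_info` and `has_data` are hypothetical helpers, not part of Scrapy, and the field order is assumed to be the one used in the question.

```python
# Hypothetical helpers (not part of Scrapy) for building the item safely.

# Assumed order of the values returned by the stock info selector,
# taken from the indexing in the question.
FIELD_NAMES = [
    "last_closing_price",
    "day_range",
    "year_range",
    "market_cap",
    "p_e_ratio",
]


def map_stock_info(stock_info, field_names=FIELD_NAMES):
    """Pair each field name with the value at the same position.

    Positions missing from stock_info are filled with None instead of
    raising IndexError; extra values beyond field_names are ignored.
    """
    padding = [None] * (len(field_names) - len(stock_info))
    return dict(zip(field_names, list(stock_info) + padding))


def has_data(item):
    """Return True when the page yielded at least a stock name."""
    return item.get("stock_name") is not None


if __name__ == "__main__":
    item = {"stock_name": "ABB India Ltd", "current_price": "2,139.00"}
    item.update(map_stock_info(["2,100.00", "2,090.00 - 2,150.00"]))
    print(item)
    print(has_data({"stock_name": None, "current_price": None}))
```

Inside `parse_book`, the item could then be built with `item.update(map_stock_info(stock_info))` and yielded only when `has_data(item)` is true, which would also skip the pages that produced `None` for both fields in the log above.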
