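I am trying to work out a regular expression that will extract the headlines from a stock news feed.
Here is my code so far; the regular expression is "<title.*?</":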
def yahoo_hl(ticker):
    import re, requests
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0"}
    xml = requests.get(f'https://feeds.finance.yahoo.com/rss/2.0/headline?s={ticker}', headers=headers).text
    news_headlines = re.findall(r'<title.*?</', xml, re.DOTALL)  # put your regular expression between the single quotes
    return news_headlines
When I run it, I get the following output, which contains the headlines but also the "<title>" at the start and the "</" at the end of each one:
['<title>Yahoo! Finance: TSLA News</',
'<title>Tesla Is About to Start Production at Its Berlin Gigafactory</',
'<title>Tesla CEO Elon Musk Wants the U.S. and the World to Pump More Oil</',
'<title>Tesla Gets Stronger With Oil Rising, Other EV Stocks Not So Much</',
'<title>What Is The Boring Company?</']
The goal is to remove the "<title>" and the "</" so that the output contains only the headlines, like this:
['Yahoo! Finance: TSLA News',
'Tesla Is About to Start Production at Its Berlin Gigafactory',
'Tesla CEO Elon Musk Wants the U.S. and the World to Pump More Oil',
'Tesla Gets Stronger With Oil Rising, Other EV Stocks Not So Much',
'What Is The Boring Company?']
Any help would be greatly appreciated. Thanks in advance.
Answer:
You can create a "capture group" in the regular expression:
import re, requests

def yahoo_hl(ticker):
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0"}
    xml = requests.get(f'https://feeds.finance.yahoo.com/rss/2.0/headline?s={ticker}', headers=headers).text
    # The parentheses capture only the text between <title> and </title
    news_headlines = re.findall(r'<title>(.*?)</title', xml, re.DOTALL)
    return news_headlines

print(*yahoo_hl('TSLA'), sep='\n')  # yahoo_hl('TSLA') is the list you want
Output:
Yahoo! Finance: TSLA News
Tesla Is About to Start Production at Its Berlin Gigafactory
Tesla CEO Elon Musk Wants the U.S. and the World to Pump More Oil
Tesla Gets Stronger With Oil Rising, Other EV Stocks Not So Much
What Is The Boring Company?
...
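If you want to check the pattern without calling the feed, a minimal sketch like the one below may help. The `sample_xml` string is a hypothetical stand-in for the real response, not actual feed data; only the regular expression is taken from the code above.

```python
import re

# Hypothetical RSS fragment standing in for the real feed response (an assumption).
sample_xml = """<rss><channel>
<title>Yahoo! Finance: TSLA News</title>
<item><title>What Is The Boring Company?</title></item>
</channel></rss>"""

# Same pattern as above: the parentheses form a capture group,
# so findall returns only the text matched inside them.
headlines = re.findall(r'<title>(.*?)</title', sample_xml, re.DOTALL)
print(headlines)
```

Because the "<title>" and "</title" parts sit outside the parentheses, they still have to match but are left out of the returned strings.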
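You can find the relevant information in the documentation for re.findall:
The result depends on the number of capturing groups in the pattern. If there are no groups, return a list of strings matching the whole pattern. If there is exactly one group, return a list of strings matching that group.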
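The effect of the group count can be seen in a small sketch. The `text` string and the variable names are made up for illustration; the third pattern goes one step beyond the quoted passage and shows that, with more than one group, `re.findall` returns a list of tuples.

```python
import re

# Made-up input for illustration only.
text = '<title>First</title><title>Second</title>'

# No groups: each result is the whole match, tags included.
no_group = re.findall(r'<title>.*?</title>', text)

# Exactly one group: each result is just the text captured by that group.
one_group = re.findall(r'<title>(.*?)</title>', text)

# Two groups: each result is a tuple with one entry per group.
two_groups = re.findall(r'<(title)>(.*?)</title>', text)

print(no_group)
print(one_group)
print(two_groups)
```

This is why the original pattern, which had no group, returned the tags along with the headlines.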
