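I need to use Beautiful Soup to parse the "published date" of articles on Medium. I managed to parse the author, title, and reading time in a loop, but for some reason the "published date" does not work for me.
Here is an example:
https://medium.com/interlay/archive/2020
So the output of the parsing would be Jun 18, 2020; Mar 5, 2020; Feb 23, 2020, etc.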
uj5u.com community member replied:
The date is in the <time> tag inside each article's <div>.
Select that <time> tag and print its text.
Here is the code.
import requests
from bs4 import BeautifulSoup
url = 'https://medium.com/interlay/archive/2020'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
t = [x.text.strip() for x in soup.find_all('time')]
print(t)
['Jun 18, 2020', 'Mar 4, 2020', 'Feb 23, 2020', 'Nov 30, 2020', 'Apr 15, 2020', 'Aug 21, 2020', 'Oct 27, 2020']
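The strings above come back in page order, not chronological order. If real date objects are wanted, a minimal sketch along these lines could convert and sort them. The names `raw_dates` and `parse_date` are illustrative only, and the `%b` format code assumes an English locale for the month abbreviations.

```python
from datetime import datetime

# Date strings as printed by the scraper above (copied from its output).
raw_dates = ['Jun 18, 2020', 'Mar 4, 2020', 'Feb 23, 2020', 'Nov 30, 2020',
             'Apr 15, 2020', 'Aug 21, 2020', 'Oct 27, 2020']

def parse_date(text):
    """Convert a string such as 'Jun 18, 2020' to a datetime.date.

    Hypothetical helper; assumes the 'Mon D, YYYY' layout shown above.
    """
    return datetime.strptime(text.strip(), "%b %d, %Y").date()

# Parse every string, then sort chronologically.
parsed = sorted(parse_date(d) for d in raw_dates)

# ISO 8601 strings are convenient for storage or CSV export.
iso = [d.isoformat() for d in parsed]
print(iso)
```

`strptime` raises `ValueError` on a string that does not match the format, so a layout change on the page would surface immediately rather than silently producing wrong dates.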
uj5u.com community member replied:
import requests
from bs4 import BeautifulSoup
url='https://medium.com/interlay/archive/2020'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
You can find the main_div tags by their class and loop over them to get the data from the time tag.
main_div=soup.find_all("div",class_="streamItem streamItem--postPreview js-streamItem")
for div in main_div:
    print(div.find("time").text)
Output:
Jun 18, 2020
Mar 4, 2020
Feb 23, 2020
Nov 30, 2020
Apr 15, 2020
Aug 21, 2020
Oct 27, 2020
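One caveat with the loop above: `div.find("time")` returns `None` when a preview has no `<time>` tag, and `.text` would then raise `AttributeError`. Also, passing a multi-class string to `class_` matches the class attribute exactly, so an extra class on the `<div>` would make it miss. A possible more defensive sketch is shown here. `SAMPLE_HTML` is a made-up stand-in for the page so the example runs offline, and `extract_dates` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Made-up markup imitating the archive page structure (an assumption,
# not a copy of the real page), so the sketch runs without network access.
SAMPLE_HTML = """
<div class="streamItem streamItem--postPreview js-streamItem">
  <h3>First post</h3><time>Jun 18, 2020</time>
</div>
<div class="streamItem streamItem--postPreview js-streamItem extra">
  <h3>Second post</h3><time> Mar 4, 2020 </time>
</div>
<div class="streamItem streamItem--postPreview js-streamItem">
  <h3>Post without a date</h3>
</div>
"""

def extract_dates(html):
    """Return one entry per post preview: the date text, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    dates = []
    # A CSS selector on a single class still matches when the div
    # carries additional classes.
    for div in soup.select("div.streamItem--postPreview"):
        time_tag = div.find("time")
        # Guard against previews that have no <time> tag.
        dates.append(time_tag.get_text(strip=True) if time_tag else None)
    return dates

dates = extract_dates(SAMPLE_HTML)
print(dates)
```

To run it against the live page, the same function could be called with `requests.get(url).text` instead of `SAMPLE_HTML`, assuming the page still uses these class names.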
