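I am trying to scrape data from the Premier League website: https://www.premierleague.com/clubs/4/club/stats?se=15
My problem is that when I request the URL above, I get the data served at this URL instead: https://www.premierleague.com/clubs/4/club/stats
So the data and the URL both change in the browser when I filter to a different season, but nothing seems to change when I fetch the page from code.
My code: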
from bs4 import BeautifulSoup
import requests
import numpy as np

ChelseaReq = requests.get("https://www.premierleague.com/clubs/4/club/stats?se=15")
ChelseaData = ChelseaReq.text
soup = BeautifulSoup(ChelseaData, "html.parser")

dataSet = np.array([])
dataSet1 = np.array([])
chelsea_db = {}

for stattext in soup.find_all("div", class_="normalStat"):
    chelsea_stat_numbers = stattext.span.text.split()[-1]
    chelsea_stat_numbers = chelsea_stat_numbers.replace(',', '')
    chelsea_stat_numbers = chelsea_stat_numbers.replace('%', '')
    dataSet = np.append(dataSet, float(chelsea_stat_numbers))
    chelsea_stat_attributes = ','.join(stattext.span.text.split()[0:-1])
    chelsea_stat_attributes = chelsea_stat_attributes.replace(',', ' ')
    dataSet1 = np.append(dataSet1, chelsea_stat_attributes)

for A, B in zip(dataSet1, dataSet):
    chelsea_db[A] = B

chelsea_db
This prints the all-time totals rather than the filtered data. How can I change it to return the filtered data?
For example:
current output =
'Goals': 1936.0,
'Goals per match': 1.71,
'Shots': 9954.0, ... etc
(after filtering the data on the website's filter button to a single season)
my goal =
'Goals': 36,
'Goals per match': 1.71,
'Shots': 160, ... etc
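Answer:
You don't get the filtered data because it is loaded by JavaScript through an XHR request; requests only downloads the initial HTML and does not run JavaScript, so the ?se=15 filter is never applied. However, you can send that XHR request yourself and get all the data you need as JSON, so you don't even need BeautifulSoup. Here is a code example: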
import requests
import json

headers = {
    'origin': 'https://www.premierleague.com',  # you get 403 Forbidden without this header
}
params = {
    "comps": 1,
    "compSeasons": 15  # season ID (the same value as se= in the page URL)
}

# Call the JSON endpoint that the page itself queries
chelsea_season_data = requests.get("https://footballapi.pulselive.com/football/stats/team/4",
                                   params=params, headers=headers)
data = json.loads(chelsea_season_data.text)

# Print a few selected stats for the chosen season
for stat in data['stats']:
    if stat['name'] == 'wins':
        print(f"Wins: {stat['value']}")
    elif stat['name'] == 'losses':
        print(f"Losses: {stat['value']}")
    elif stat['name'] == 'goals':
        print(f"Goals: {stat['value']}")
    elif stat['name'] == 'goals_conceded':
        print(f"Goals conceded: {stat['value']}")
    elif stat['name'] == 'clean_sheet':
        print(f"Clean sheets: {stat['value']}")
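If you want a dictionary like chelsea_db from the question rather than printed lines, the stats list can be folded into one. This is only a minimal sketch: it assumes each entry in data['stats'] has a 'name' and a 'value' key, as the loop above does. The helper name stats_to_dict and the sample payload are hypothetical, and the sample numbers are made up for illustration, not real API output.

```python
from typing import Any, Dict


def stats_to_dict(payload: Dict[str, Any]) -> Dict[str, float]:
    """Map each stat name to its numeric value (hypothetical helper)."""
    result: Dict[str, float] = {}
    for stat in payload.get("stats", []):
        # Assumption: entries look like {"name": ..., "value": ...}
        name = stat.get("name")
        value = stat.get("value")
        if name is None or value is None:
            continue  # skip entries that lack either field
        result[name] = float(value)
    return result


# Made-up sample shaped like the parsed JSON; not real data from the API.
sample_payload = {
    "stats": [
        {"name": "wins", "value": 12.0},
        {"name": "goals", "value": 36.0},
        {"name": "clean_sheet", "value": 9.0},
    ]
}

season_stats = stats_to_dict(sample_payload)
print(season_stats)
```

With the real response you would call stats_to_dict(data) on the parsed JSON from the request above. Note that the keys are the API's stat names (for example 'goals_conceded'), not the labels shown on the web page.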
