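I'm trying to scrape historical Bitcoin data from coinmarketcap.com to get the close price, volume, date, high, and low from the start of the year through September 30, 2021. I'm new to scraping with Python, and after hours of going through threads and videos I still can't figure out what my mistake is (or is there something on the site that I'm not detecting?). Here is my code: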
from bs4 import BeautifulSoup
import requests
import pandas as pd
closeList = []
volumeList = []
dateList = []
highList = []
lowList = []
website = 'https://coinmarketcap.com/currencies/bitcoin/historical-data/'
r = requests.get(website)
soup = BeautifulSoup(r.text, 'lxml')
tr = soup.find_all('tr')
FullData = []
for item in tr:
    closeList.append(item.find_all('td')[4].text)
    volumeList.append(item.find_all('td')[5].text)
    dateList.append(item.find('td',{'style':'text-align: left;'}).text)
    highList.append(item.find_all('td')[2].text)
    lowList.append(item.find_all('td')[3].text)
    FullData.append([closeList,volumeList,dateList,highList,lowList])
df_columns = ["close", "volume", "date", "high", "low"]
df = pd.DataFrame(FullData, columns = df_columns)
print(df)
But all I get is:
Empty DataFrame
Columns: [close, volume, date, high, low]
Index: []
The assignment requires me to scrape with BeautifulSoup and then export to CSV (which is obviously just df.to_csv). Can anyone help me? I'd really appreciate it.
A uj5u.com user replied:
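Actually, the data is loaded dynamically by JavaScript from an API call that returns a JSON response, so the table rows are not in the HTML that requests downloads. You can easily fetch the data from that API directly, as follows:
Code: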
import requests
import json
import pandas as pd
api_url= 'https://api.coinmarketcap.com/data-api/v3/cryptocurrency/historical?id=1&convertId=2781&timeStart=1632441600&timeEnd=1637712000'
r = requests.get(api_url)
data = []
for item in r.json()['data']['quotes']:
    close = item['quote']['close']
    volume = item['quote']['volume']
    date = item['quote']['timestamp']
    high = item['quote']['high']
    low = item['quote']['low']
    data.append([close, volume, date, high, low])
cols = ["close", "volume","date","high","low"]
df = pd.DataFrame(data, columns= cols)
print(df)
#df.to_csv('info.csv',index = False)
Output:
           close        volume                      date          high           low
0   42839.751696  4.283935e+10  2021-09-24T23:59:59.999Z  45080.491063  40936.557169
1   42716.593147  3.160472e+10  2021-09-25T23:59:59.999Z  42996.259704  41759.920425
2   43208.539105  3.066122e+10  2021-09-26T23:59:59.999Z  43919.300970  40848.461660
3   42235.731847  3.098003e+10  2021-09-27T23:59:59.999Z  44313.245882  42190.632576
4   41034.544665  3.021494e+10  2021-09-28T23:59:59.999Z  42775.146142  40931.662500
..           ...           ...                       ...           ...           ...
56  58119.576194  3.870241e+10  2021-11-19T23:59:59.999Z  58351.113266  55705.180685
57  59697.197134  3.062426e+10  2021-11-20T23:59:59.999Z  59859.880442  57469.725661
58  58730.476639  2.612345e+10  2021-11-21T23:59:59.999Z  60004.426383  58618.931432
59  56289.287323  3.503612e+10  2021-11-22T23:59:59.999Z  59266.358468  55679.840404
60  57569.074876  3.748580e+10  2021-11-23T23:59:59.999Z  57875.516397  55632.759912
[61 rows x 5 columns]
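Note that the timeStart and timeEnd values in the URL above are Unix timestamps for 2021-09-24 and 2021-11-24 (midnight UTC), which is why the output starts in late September rather than at the start of the year. A minimal sketch of how the range from the question (January 1 to September 30, 2021) might be requested instead is shown below. The helper names to_unix, build_url, and iso_to_date are made up for illustration, and id=1 and convertId=2781 are simply copied from the URL above. It is an assumption that this undocumented endpoint accepts a range this long in a single request; nothing in the thread confirms that.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint copied from the answer above; it is undocumented and may change.
BASE_URL = "https://api.coinmarketcap.com/data-api/v3/cryptocurrency/historical"


def to_unix(year, month, day):
    """Return the Unix timestamp (in seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def build_url(start, end, coin_id=1, convert_id=2781):
    """Build the request URL (hypothetical helper; defaults copied from the answer)."""
    params = {
        "id": coin_id,
        "convertId": convert_id,
        "timeStart": start,
        "timeEnd": end,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def iso_to_date(timestamp):
    """Turn a timestamp such as '2021-09-24T23:59:59.999Z' into '2021-09-24'."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.date().isoformat()


# Start of the year through the end of September 30, 2021
# (the end bound is midnight UTC on October 1).
api_url = build_url(to_unix(2021, 1, 1), to_unix(2021, 10, 1))
print(api_url)
```

The resulting api_url can be dropped into the answer's code in place of the hard-coded one, and iso_to_date can be applied to item['quote']['timestamp'] if only the calendar date is wanted in the date column before calling df.to_csv.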
