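How can I write code that processes data.csv so that, for each Date value, only the first row is kept and every other row with the same date is removed, as shown in the expected output? For example, three rows have the date 2021-10-31 13:17:00; the code should keep the first of them and ignore the rest. How can I achieve this?
Code: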
import pandas as pd
data = pd.read_csv('data.csv')
dates = pd.to_datetime(data['Date'].to_list())
Contents of data.csv:
Date,Symbol,Open,High,Low,Close,Volume
2021-10-31 13:16:00,BTCUSD,60568.0,60640.0,60568.0,60638.0,3.9771881707839967
2021-10-31 13:15:00,BTCUSD,60620.0,60633.0,60565.0,60568.0,1.3977284440628714
2021-10-31 13:17:00,BTCUSD,60638.0,60640.0,60636.0,60638.0,0.4357009185659157
2021-10-31 13:17:00,BTCUSD,60648.0,60650.0,60638.0,60638.0,0.42475009155665155
2021-10-31 13:17:00,BTCUSD,60638.0,60640.0,60636.0,60638.0,0.4564009185659157
2021-10-31 13:16:00,BTCUSD,60588.0,60620.0,60510.0,60618.0,3.9771881707839967
Expected output:
Date,Symbol,Open,High,Low,Close,Volume
2021-10-31 13:17:00,BTCUSD,60638.0,60640.0,60636.0,60638.0,0.4357009185659157
2021-10-31 13:16:00,BTCUSD,60568.0,60640.0,60568.0,60638.0,3.9771881707839967
2021-10-31 13:15:00,BTCUSD,60620.0,60633.0,60565.0,60568.0,1.3977284440628714
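Answer from a uj5u.com user:
First call sort_values, then drop_duplicates (which keeps the first occurrence of each Date by default). Here data is the DataFrame loaded in the question:
out = data.sort_values('Date', ascending=False).drop_duplicates('Date')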
Out[44]:
                  Date  Symbol     Open     High      Low    Close    Volume
2  2021-10-31 13:17:00  BTCUSD  60638.0  60640.0  60636.0  60638.0  0.435701
0  2021-10-31 13:16:00  BTCUSD  60568.0  60640.0  60568.0  60638.0  3.977188
1  2021-10-31 13:15:00  BTCUSD  60620.0  60633.0  60565.0  60568.0  1.397728
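For anyone who wants to try this without the file on disk, here is a minimal self-contained sketch. It is a variation on the answer above rather than the answerer's exact code: it drops duplicates first and sorts afterwards, so which row survives for each Date depends only on file order and not on how the sort treats ties. The names CSV_TEXT and deduped are made up for illustration, and the rows are copied inline from the question's data.csv.

```python
import io

import pandas as pd

# Rows copied from the question's data.csv, inlined so the sketch runs standalone.
CSV_TEXT = """Date,Symbol,Open,High,Low,Close,Volume
2021-10-31 13:16:00,BTCUSD,60568.0,60640.0,60568.0,60638.0,3.9771881707839967
2021-10-31 13:15:00,BTCUSD,60620.0,60633.0,60565.0,60568.0,1.3977284440628714
2021-10-31 13:17:00,BTCUSD,60638.0,60640.0,60636.0,60638.0,0.4357009185659157
2021-10-31 13:17:00,BTCUSD,60648.0,60650.0,60638.0,60638.0,0.42475009155665155
2021-10-31 13:17:00,BTCUSD,60638.0,60640.0,60636.0,60638.0,0.4564009185659157
2021-10-31 13:16:00,BTCUSD,60588.0,60620.0,60510.0,60618.0,3.9771881707839967
"""

# parse_dates turns the Date column into datetime64 values while reading.
data = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["Date"])

# Keep only the first row seen for each Date (keep="first" is the default,
# spelled out here for clarity).
deduped = data.drop_duplicates(subset="Date", keep="first")

# Order the surviving rows newest first, as in the expected output.
out = deduped.sort_values("Date", ascending=False)

# Print in CSV form without the index column.
print(out.to_csv(index=False))
```

If the result should go back to disk, out.to_csv('some_file.csv', index=False) writes it in the same layout; the file name here is only a placeholder.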
