我有多個帶有 column = 的資料框['Date', 'HE', 'Prices']。這些資料框具有不同的開始日期和共同的結束日期。我想將我的資料幀 acc 修剪為所有資料幀中的最大開始日期。
import pandas as pd
import numpy as np
dateRange1 = pd.date_range('01/01/2020 04:00', '12/31/2020 23:00', freq='1H')
dateRange2 = pd.date_range('01/09/2020 08:00', '12/31/2020 23:00', freq='1H')
dateRange3 = pd.date_range('01/15/2020 11:00', '12/31/2020 23:00', freq='1H')
dic1 = {
'df1': {
'Date': dateRange1.date,
'HE': dateRange1.hour,
'Data': np.random.randint(1,100,len(dateRange1))
},
'df2': {
'Date': dateRange2.date,
'HE': dateRange2.hour,
'Data': np.random.randint(1,100,len(dateRange2))
},
'df3': {
'Date': dateRange3.date,
'HE': dateRange3.hour,
'Data': np.random.randint(1,100,len(dateRange3))
}
}
dfList = {}
for i in dic1.keys():
dfList[i] = pd.DataFrame(dic1[i])
startDate = max([dic1[i]['Date'][0] for i in dic1.keys()]) #this will give us the max
start date which is '01/15/2020 11:00'
finalDateRange = pd.date_range(startDate ' 00:00', '31/12/2020 23:00', freq='1H')
newDFList = {}
for i in dfList.keys():
newDF = dfList[i].iloc[len(finalDateRange):, :] #Here I am trying to trim that dataframe
but the results are not as expected
print(newDF)
輸出:
Date HE Data
8424 2020-12-17 0 36
8425 2020-12-17 1 81
... ... .. ...
8782 2020-12-31 22 66
8783 2020-12-31 23 87
[360 rows x 3 columns]
Date HE Data
8424 2020-12-27 0 30
8425 2020-12-27 1 84
... ... .. ...
8542 2020-12-31 22 17
8543 2020-12-31 23 21
[120 rows x 3 columns]
Date HE Data
8424 2020-12-31 0 76
8425 2020-12-31 1 38
... ... .. ..
8446 2020-12-31 22 75
8447 2020-12-31 23 29
預期的輸出應該是這樣的,新的開始日期應該是最大開始日期的第二天('01/15/2020 11:00'):
Date HE Data
0 2020-01-16 0 61
1 2020-01-16 1 47
... ... .. ...
8422 2020-12-31 22 22
8423 2020-12-31 23 80
[8424 rows x 3 columns]
Date HE Data
0 2020-01-16 0 43
1 2020-01-16 1 22
... ... .. ...
8422 2020-12-31 22 71
8423 2020-12-31 23 63
[8424 rows x 3 columns]
Date HE Data
0 2020-01-16 0 92
1 2020-01-16 1 46
... ... .. ...
8422 2020-12-31 22 77
8423 2020-12-31 23 46
[8424 rows x 3 columns]
uj5u.com熱心網友回復:
我想你想要:
newDFList = {}
# use `items()` iterator for both keys and values
for i,d in dfList.items():
# notice the minus sign
newDFList[i] = d.iloc[-len(finalDateRange):]
或者您可以(更好地)使用布爾索引:
newDFList = {}
for i,d in dfList.items():
# only keep dates >= startDate
newDFList[i] = d.loc[d['Date'] > startDate]
獎金使用上飛字典迭代:
newDFList = {i: d.loc[d['Date'] > startDate]
for i, d in dfList.items()
}
然后,for d in newDFList.values(): print(d)給你:
Date HE Data
356 2020-01-16 0 90
357 2020-01-16 1 91
358 2020-01-16 2 75
359 2020-01-16 3 68
360 2020-01-16 4 74
... ... .. ...
8775 2020-12-31 19 67
8776 2020-12-31 20 83
8777 2020-12-31 21 55
8778 2020-12-31 22 70
8779 2020-12-31 23 61
[8424 rows x 3 columns]
Date HE Data
160 2020-01-16 0 59
161 2020-01-16 1 40
162 2020-01-16 2 98
163 2020-01-16 3 67
164 2020-01-16 4 78
... ... .. ...
8579 2020-12-31 19 76
8580 2020-12-31 20 11
8581 2020-12-31 21 88
8582 2020-12-31 22 79
8583 2020-12-31 23 82
[8424 rows x 3 columns]
Date HE Data
13 2020-01-16 0 52
14 2020-01-16 1 69
15 2020-01-16 2 44
16 2020-01-16 3 75
17 2020-01-16 4 67
... ... .. ...
8432 2020-12-31 19 35
8433 2020-12-31 20 39
8434 2020-12-31 21 56
8435 2020-12-31 22 83
8436 2020-12-31 23 60
[8424 rows x 3 columns]
轉載請註明出處,本文鏈接:https://www.uj5u.com/houduan/343283.html
上一篇:如何撰寫僅匹配DateTime列的分鐘欄位的Django查詢?
下一篇:SQL-在日期桶中放置值
