I have a df like this:
DATE PP
0 2011-12-20 07:00:00 0.0
1 2011-12-20 08:00:00 0.0
2 2011-12-20 09:00:00 2.0
3 2011-12-20 10:00:00 0.0
4 2011-12-20 11:00:00 0.0
5 2011-12-20 12:00:00 0.0
6 2011-12-20 13:00:00 0.0
7 2011-12-20 14:00:00 5.0
8 2011-12-20 15:00:00 0.0
9 2011-12-20 16:00:00 0.0
10 2011-12-20 17:00:00 2.0
11 2011-12-20 18:00:00 0.0
12 2011-12-20 19:00:00 0.0
13 2011-12-20 20:00:00 1.0
14 2011-12-20 21:00:00 0.0
15 2011-12-20 22:00:00 0.0
16 2011-12-20 23:00:00 0.0
17 2011-12-21 00:00:00 0.0
18 2011-12-21 01:00:00 3.0
19 2011-12-21 02:00:00 0.0
20 2011-12-21 03:00:00 0.0
21 2011-12-21 04:00:00 0.0
22 2011-12-21 05:00:00 0.0
23 2011-12-21 06:00:00 5.0
24 2011-12-21 07:00:00 0.0
... .... ... ...
75609 2020-08-05 16:00:00 0.0
75610 2020-08-05 19:00:00 0.0
[75614 rows x 2 columns]
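I want the cumulative value of the PP column between two specific hours on consecutive dates: the sum from 07:00:00 of each day to 07:00:00 of the next day. For example, I want the cumulative PP from 2011-12-20 07:00:00 to 2011-12-21 07:00:00:
Expected result: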
DATE CUMULATIVE VALUES PP
0 2011-12-20 18
1 2011-12-21 5
2 2011-12-22 10
etc... etc... ...
I tried this:
df['DAY'] = df['DATE'].dt.strftime('%d')
cumulatives=pd.DataFrame(df.groupby(['DAY'])['PP'].sum())
But this only gives the sum per calendar day, not the sum for each 07:00:00 to 07:00:00 window.
Data:
{'DATE': ['2011-12-20 07:00:00', '2011-12-20 08:00:00', '2011-12-20 09:00:00',
'2011-12-20 10:00:00', '2011-12-20 11:00:00', '2011-12-20 12:00:00',
'2011-12-20 13:00:00', '2011-12-20 14:00:00', '2011-12-20 15:00:00',
'2011-12-20 16:00:00', '2011-12-20 17:00:00', '2011-12-20 18:00:00',
'2011-12-20 19:00:00', '2011-12-20 20:00:00', '2011-12-20 21:00:00',
'2011-12-20 22:00:00', '2011-12-20 23:00:00', '2011-12-21 00:00:00',
'2011-12-21 01:00:00', '2011-12-21 02:00:00', '2011-12-21 03:00:00',
'2011-12-21 04:00:00', '2011-12-21 05:00:00', '2011-12-21 06:00:00',
'2011-12-21 07:00:00', '2020-08-05 16:00:00', '2020-08-05 19:00:00'],
'PP': [0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]}
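Reply from a uj5u.com user:
One way is to subtract 7 hours from the dates, so that 07:00:00 maps to 00:00:00 and each 07:00:00 to 07:00:00 window falls within a single calendar date; then groupby.sum gives the desired output: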
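import pandas as pd
df['DATE'] = pd.to_datetime(df['DATE'])  # parse the strings as datetimes
# shift back 7 hours so 07:00 maps to 00:00, then sum PP per shifted date
out = df.groupby(df['DATE'].sub(pd.to_timedelta('7h')).dt.date)['PP'].sum().reset_index(name='SUM')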
Output:
DATE SUM
0 2011-12-20 18.0
1 2011-12-21 0.0
2 2020-08-05 0.0
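For anyone who wants to run the idea end to end, here is a minimal self-contained sketch of the shift-then-group approach on the sample rows from the question. The names `hourly`, `late`, `pp` and `window_day` are only illustrative and do not come from the original posts, and the sample is rebuilt with `pd.date_range` rather than copied string by string.

```python
import pandas as pd

# Rebuild the sample from the question: 25 hourly rows starting at
# 2011-12-20 07:00:00, followed by the two 2020-08-05 rows.
hourly = pd.date_range("2011-12-20 07:00:00", periods=25,
                       freq=pd.Timedelta(hours=1))
late = pd.to_datetime(["2020-08-05 16:00:00", "2020-08-05 19:00:00"])
pp = [0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0,
      0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]
df = pd.DataFrame({"DATE": hourly.append(late), "PP": pp})

# Shift every timestamp back by 7 hours: 07:00 becomes 00:00 of the same
# date, and 06:00 of the next day becomes 23:00 of the current date.
window_day = (df["DATE"] - pd.Timedelta(hours=7)).dt.date

# Sum PP per shifted date; each group covers 07:00 up to (not including)
# 07:00 of the following day.
out = df.groupby(window_day)["PP"].sum().reset_index(name="SUM")
print(out)
```

In this sketch a reading stamped exactly 07:00:00 is counted in the window that starts at that instant, not in the one that ends there. The sample is unaffected because those readings are 0.0, but it may matter for other data.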
