Computing per-month counter sums for a DataFrame in Python

I have the following DataFrame:

cluster_ID	counter_1	counter_2	date
0	1	0	2021-01-02 10:00:00
0	1	2	2021-01-03 12:00:24
0	0	1	2021-01-04 09:10:30
0	2	1	2021-02-15 08:10:21
0	1	1	2021-03-04 14:23:43
1	2	0	2020-12-30 13:16:45
1	2	3	2021-01-07 12:13:23
1	1	2	2021-03-06 07:28:23
2	1	1	2021-01-10 14:24:23
2	1	0	2021-01-15 17:23:35
2	0	1	2021-01-20 13:28:13
2	1	2	2021-02-11 11:23:15
3	3	2	2021-04-13 21:14:19
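
For anyone who wants to reproduce the example, the table above can be rebuilt as a pandas DataFrame roughly as follows. This is only a sketch: the names `rows` and `df` are chosen for illustration, and the column names `counter_1`, `counter_2` and `date` are taken from how the question text refers to them.

```python
import pandas as pd

# Sample rows copied from the table above:
# (cluster_ID, counter_1, counter_2, date)
rows = [
    (0, 1, 0, "2021-01-02 10:00:00"),
    (0, 1, 2, "2021-01-03 12:00:24"),
    (0, 0, 1, "2021-01-04 09:10:30"),
    (0, 2, 1, "2021-02-15 08:10:21"),
    (0, 1, 1, "2021-03-04 14:23:43"),
    (1, 2, 0, "2020-12-30 13:16:45"),
    (1, 2, 3, "2021-01-07 12:13:23"),
    (1, 1, 2, "2021-03-06 07:28:23"),
    (2, 1, 1, "2021-01-10 14:24:23"),
    (2, 1, 0, "2021-01-15 17:23:35"),
    (2, 0, 1, "2021-01-20 13:28:13"),
    (2, 1, 2, "2021-02-11 11:23:15"),
    (3, 3, 2, "2021-04-13 21:14:19"),
]

df = pd.DataFrame(rows, columns=["cluster_ID", "counter_1", "counter_2", "date"])

# The question says the dates are timestamps, so parse the strings accordingly.
df["date"] = pd.to_datetime(df["date"])
```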

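I want to define a function that produces a new DataFrame with two new columns for every month present in the date column, built from the counter_1 and counter_2 values. For each cluster_ID group, counter_1 and counter_2 should each be summed per month. If a cluster has no rows for a given month, the result table should contain 0. The values in the date column are Python timestamps.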

Example of the resulting DataFrame:

cluster_ID	counter_1_2020-12	counter_1_2021-01	counter_2_2021-01	counter_1_2021-02	counter_2_2021-02	counter_1_2021-03	counter_2_2021-03	counter_1_2021-04	counter_2_2021-04
0	0	2	3	2	1	1	1	0	0
1	2	2	3	0	0	1	2	0	0
2	0	2	2	1	2	0	0	0	0
3	0	0	0	0	0	0	0	3	2

I hope you can help me solve this problem. I appreciate your help.

Reply from a uj5u.com user:

You can convert the datetimes to YYYY-MM strings, pivot with DataFrame.pivot_table using aggfunc='sum', sort the columns by date, and then flatten the resulting MultiIndex with map:

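import pandas as pd

# Make sure the date column holds datetimes
df['date'] = pd.to_datetime(df['date'])

# Replace each date with its YYYY-MM string, sum both counters per cluster and month,
# then order the columns by month first and counter name second
df1 = (df.assign(date=df['date'].dt.strftime('%Y-%m'))
         .pivot_table(index='cluster_ID', columns='date', fill_value=0, aggfunc='sum')
         .sort_index(level=[1, 0], axis=1))
# Flatten the (counter, month) MultiIndex into names like counter_1_2021-01
df1.columns = df1.columns.map(lambda x: f'{x[0]}_{x[1]}')
df1 = df1.reset_index()
print(df1)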
   cluster_ID  counter_1_2020-12  counter_2_2020-12  counter_1_2021-01  \
0           0                  0                  0                  2   
1           1                  2                  0                  2   
2           2                  0                  0                  2   
3           3                  0                  0                  0   

   counter_2_2021-01  counter_1_2021-02  counter_2_2021-02  counter_1_2021-03  \
0                  3                  2                  1                  1   
1                  3                  0                  0                  1   
2                  2                  1                  2                  0   
3                  0                  0                  0                  0   

   counter_2_2021-03  counter_1_2021-04  counter_2_2021-04  
0                  1                  0                  0  
1                  2                  0                  0  
2                  0                  0                  0  
3                  0                  3                  2
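
Since the question asks for a function, the approach above could be wrapped up roughly as follows. This is a sketch under assumptions rather than a definitive implementation: the function name `monthly_counter_sums` and its parameters `id_col`, `date_col` and `value_cols` are hypothetical, and the sample rows are copied from the question. Passing `values=` explicitly is a small design choice so that any extra columns in the input are not summed by accident.

```python
import pandas as pd


def monthly_counter_sums(df, id_col="cluster_ID", date_col="date",
                         value_cols=("counter_1", "counter_2")):
    """Sum each counter per id and calendar month, one column per counter and month.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Reduce every timestamp to its YYYY-MM string so rows group by month.
    months = pd.to_datetime(df[date_col]).dt.strftime("%Y-%m")
    wide = (
        df.assign(**{date_col: months})
          .pivot_table(index=id_col, columns=date_col, values=list(value_cols),
                       aggfunc="sum", fill_value=0)
          # Columns are (counter, month) pairs: order by month, then counter name.
          .sort_index(level=[1, 0], axis=1)
    )
    # Flatten the two-level column index into names like counter_1_2021-01.
    wide.columns = [f"{name}_{month}" for name, month in wide.columns]
    return wide.reset_index()


# Sample data copied from the question.
sample = pd.DataFrame(
    [
        (0, 1, 0, "2021-01-02 10:00:00"),
        (0, 1, 2, "2021-01-03 12:00:24"),
        (0, 0, 1, "2021-01-04 09:10:30"),
        (0, 2, 1, "2021-02-15 08:10:21"),
        (0, 1, 1, "2021-03-04 14:23:43"),
        (1, 2, 0, "2020-12-30 13:16:45"),
        (1, 2, 3, "2021-01-07 12:13:23"),
        (1, 1, 2, "2021-03-06 07:28:23"),
        (2, 1, 1, "2021-01-10 14:24:23"),
        (2, 1, 0, "2021-01-15 17:23:35"),
        (2, 0, 1, "2021-01-20 13:28:13"),
        (2, 1, 2, "2021-02-11 11:23:15"),
        (3, 3, 2, "2021-04-13 21:14:19"),
    ],
    columns=["cluster_ID", "counter_1", "counter_2", "date"],
)

result = monthly_counter_sums(sample)
print(result.to_string(index=False))
```

Like the output above, this yields a counter_2_2020-12 column, which the example table in the question does not list; drop that column afterwards if the question's exact layout is required.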

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/451457.html

Tags: Python, pandas, DataFrame, date
