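I'm working with customer consumption data. Sometimes a customer has no recorded consumption for a month or longer, so the first consumption after such a gap needs to be spread evenly across those months. Example: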
import pandas as pd

df = pd.DataFrame({'customerId':[1,1,1,1,1,1,1,2,2,2,2,2,2,2],
'month':['2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01','2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01'],
'consumption':[100,130,0,0,400,140,105,500,0,0,0,0,0,3300]})
bfill() just repeats the same value instead of the average (value / (number of nulls + 1)). Expected values:
'c':[100,130,133,133,133,140,105,500,550,550,550,550,550,550]
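To illustrate the problem described above, here is a minimal sketch on a shortened series (the values are a slice of customer 1's data; the names `s` and `filled` are only for illustration). It assumes the zeros are first masked to NaN so that bfill() has something to fill.

```python
import pandas as pd

# A slice of customer 1's consumption: two empty months before the 400.
s = pd.Series([130, 0, 0, 400, 140])

# Treat zeros as missing, then back-fill from the next non-missing value.
filled = s.mask(s.eq(0)).bfill()

# bfill copies 400 into each gap month; it does not divide it by 3.
print(filled.tolist())  # [130.0, 400.0, 400.0, 400.0, 140.0]
```

The gap months receive the full 400 each, so the customer's total is inflated, whereas the desired result is 400 / 3 for each of the three months.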
Answer from a uj5u.com user:
You can try something like this:
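import pandas as pd

df = pd.DataFrame({'customerId':[1,1,1,1,1,1,1,2,2,2,2,2,2,2],
'month':['2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01','2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01'],
'consumption':[100,130,0,0,400,140,105,500,0,0,0,0,0,3300]})
# Count nonzero rows from the bottom up, so each run of zeros shares a label with the next nonzero row
df['grp'] = df['consumption'].ne(0)[::-1].cumsum()
# The mean within each (customerId, grp) group spreads the nonzero value over its preceding zero months
df['c'] = df.groupby(['customerId', 'grp'])['consumption'].transform('mean')
df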
Output:
    customerId       month  consumption  grp           c
0            1  2021-10-01          100    7  100.000000
1            1  2021-11-01          130    6  130.000000
2            1  2021-12-01            0    5  133.333333
3            1  2022-01-01            0    5  133.333333
4            1  2022-02-01          400    5  133.333333
5            1  2022-03-01          140    4  140.000000
6            1  2022-04-01          105    3  105.000000
7            2  2021-10-01          500    2  500.000000
8            2  2021-11-01            0    1  550.000000
9            2  2021-12-01            0    1  550.000000
10           2  2022-01-01            0    1  550.000000
11           2  2022-02-01            0    1  550.000000
12           2  2022-03-01            0    1  550.000000
13           2  2022-04-01         3300    1  550.000000
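Details:
- Create a group key by checking for nonzero values and taking a cumsum in reverse order, so that each run of zeros is grouped with the next nonzero value.
- Group by customerId and that key, then transform with 'mean' to spread the nonzero value evenly across the zero months before it.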
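The two steps above can be traced on a smaller frame. The sketch below is a variant, not the answer's exact code: it restarts the reversed counter for each customer (an assumption made here so the labels are easier to read), and the sample values and the name `rev` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'customerId':  [1, 1, 1, 1, 2, 2, 2],
    'consumption': [130, 0, 0, 400, 500, 0, 600],
})

# Step 1: walk the rows bottom-up and count nonzero rows per customer.
# A run of zeros gets the same count as the nonzero row that follows it.
rev = df.iloc[::-1]
df['grp'] = rev['consumption'].ne(0).astype(int).groupby(rev['customerId']).cumsum()

# Step 2: the group mean divides each nonzero value by the size of its run.
df['c'] = df.groupby(['customerId', 'grp'])['consumption'].transform('mean')

print(df['grp'].tolist())  # [2, 1, 1, 1, 2, 1, 1]
print(df)
```

Because every group contains exactly one nonzero value, the per-customer total of `c` matches the total of `consumption`, which is a quick sanity check. Trailing zeros with no later consumption form a group of their own and stay at 0.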
Tags: python-3.x pandas dataframe
