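Hello, I want to compute an Average column of Days for each Name group at each Time. The average for each group at each time should be based only on the earlier rows in that group whose Days value is greater than 0. Any feedback would be greatly appreciated.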
Name Time Days Average
John 2021-12-02 0 0
John 2021-12-03 2 0
John 2021-12-05 9 2
John 2021-12-07 0 5.5
John 2021-12-10 10 5.5
Larry 2021-12-02 20 0
Jim 2021-12-09 20 0
Jim 2021-12-10 20 20
Jim 2021-12-12 40 20
Jim 2021-12-12 0 26.6
Juli 2021-11-09 0 0
Juli 2021-11-10 0 0
Juli 2021-11-12 40 0
Juli 2021-11-18 0 40
Juli 2021-11-12 0 40
Juli 2021-11-18 2 40
Juli 2021-11-19 0 21
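A helpful uj5u.com user replied:
First replace 0 with missing values, then use GroupBy.transform with a lambda function that applies Series.expanding with mean followed by Series.shift, and finally replace the NaNs with 0 using Series.fillna: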
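import numpy as np

# Treat 0 as missing so it is excluded from the mean, then take the
# per-Name expanding mean shifted by one row (earlier rows only).
df['Avg'] = (df.assign(Days=df['Days'].replace(0, np.nan))
               .groupby('Name')['Days']
               .transform(lambda x: x.expanding().mean().shift())
               .fillna(0))
print(df)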
Name Time Days Average Avg
0 John 2021-12-02 0 0.0 0.000000
1 John 2021-12-03 2 0.0 0.000000
2 John 2021-12-05 9 2.0 2.000000
3 John 2021-12-07 0 5.5 5.500000
4 John 2021-12-10 10 5.5 5.500000
5 Larry 2021-12-02 20 0.0 0.000000
6 Jim 2021-12-09 20 0.0 0.000000
7 Jim 2021-12-10 20 20.0 20.000000
8 Jim 2021-12-12 40 20.0 20.000000
9 Jim 2021-12-12 0 26.6 26.666667
10 Juli 2021-11-09 0 0.0 0.000000
11 Juli 2021-11-10 0 0.0 0.000000
12 Juli 2021-11-12 40 0.0 0.000000
13 Juli 2021-11-18 0 40.0 40.000000
14 Juli 2021-11-12 0 40.0 40.000000
15 Juli 2021-11-18 2 40.0 40.000000
16 Juli 2021-11-19 0 21.0 21.000000
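To make the chain above easier to follow, here is a minimal sketch that runs the same steps one at a time. It assumes a small hypothetical sample (only the John and Jim rows from the question) and uses intermediate names (`days`, `running`, `previous`) that are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: only the John and Jim rows from the question.
df = pd.DataFrame({
    "Name": ["John"] * 5 + ["Jim"] * 4,
    "Time": ["2021-12-02", "2021-12-03", "2021-12-05", "2021-12-07", "2021-12-10",
             "2021-12-09", "2021-12-10", "2021-12-12", "2021-12-12"],
    "Days": [0, 2, 9, 0, 10, 20, 20, 40, 0],
})

# Step 1: turn zeros into NaN so the mean skips them.
days = df["Days"].replace(0, np.nan)

# Step 2: expanding (cumulative) mean within each Name; NaN values are ignored.
running = days.groupby(df["Name"]).transform(lambda s: s.expanding().mean())

# Step 3: shift by one row within each Name so a row only sees earlier rows.
previous = running.groupby(df["Name"]).shift()

# Step 4: rows with no earlier positive value get 0.
df["Avg"] = previous.fillna(0)
print(df)
```

The shift is what keeps the current row's own Days value out of its average, which is why John's row with Days 9 shows 2 and not 5.5.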
A helpful uj5u.com user replied:
First make sure the "Days" column is numeric. Then:
df.loc[df.Days>0].groupby(["Name","Time"]).mean()
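As a rough illustration of this second suggestion, the sketch below assumes Days may arrive as strings and converts it with `pd.to_numeric` before filtering. The sample rows are hypothetical. Note that this yields one mean per (Name, Time) pair over the positive rows, not a running mean over earlier rows.

```python
import pandas as pd

# Hypothetical sample where Days was read in as strings.
df = pd.DataFrame({
    "Name": ["Jim", "Jim", "Jim", "Jim", "Jim"],
    "Time": ["2021-12-09", "2021-12-10", "2021-12-12", "2021-12-12", "2021-12-12"],
    "Days": ["20", "20", "40", "0", "10"],
})

# Make sure Days is numeric; values that cannot be parsed become NaN.
df["Days"] = pd.to_numeric(df["Days"], errors="coerce")

# Keep only rows with Days > 0, then average Days per (Name, Time) pair.
result = df.loc[df["Days"] > 0].groupby(["Name", "Time"])["Days"].mean()
print(result)
```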
