I have a DataFrame like this:
import pandas as pd

df = pd.DataFrame({
    'ref1': [1, 1, 3, 7, 7],
    'ref2': [1, 2, 1, 1, 2],
    'value': [1, 2, 3, 5, 6],
})
df
   ref1  ref2  value
0     1     1      1
1     1     2      2
2     3     1      3
3     7     1      5
4     7     2      6
I want to add a column new_value and get this:
   ref1  ref2  value  new_value  my_comment
0     1     1      1        NaN  no prev ref1
1     1     2      2        NaN  no prev ref1
2     3     1      3        1.0
3     7     1      5        3.0
4     7     2      6        NaN  no same ref2 @ ref1==3
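following this rule:
new_value is the value from the row with the same ref2 and the previous ref1 (treating the ref1 values as an ordered list, [1,3,7]); otherwise it is NaN.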
A uj5u.com user replied:
Given that 1, 3, 7 are in order, you can pivot, shift, and stack back to get the shifted values, then merge:
df.merge(
    # pivot to ref1 x ref2, shift rows down by one ref1, then stack back to long form
    df.pivot(index='ref1', columns='ref2', values='value')
      .shift().stack().reset_index(name='new_value'),
    on=['ref1','ref2'], how='left'
)
Output:
   ref1  ref2  value  new_value
0     1     1      1        NaN
1     1     2      2        NaN
2     3     1      3        1.0
3     7     1      5        3.0
4     7     2      6        NaN
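
To see why the chain works, it may help to run the same steps one at a time. The following is only a sketch of the answer's approach broken into stages; the intermediate names wide, shifted, long_form and result are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'ref1': [1, 1, 3, 7, 7],
    'ref2': [1, 2, 1, 1, 2],
    'value': [1, 2, 3, 5, 6],
})

# Step 1: one row per ref1 (sorted), one column per ref2.
# Pairs that do not exist, such as (3, 2), become NaN.
wide = df.pivot(index='ref1', columns='ref2', values='value')

# Step 2: move every row down by one position, so the row for each ref1
# now holds the values that belonged to the previous ref1.
shifted = wide.shift()

# Step 3: go back to long form, one row per (ref1, ref2) pair.
long_form = shifted.stack().reset_index(name='new_value')

# Step 4: attach the shifted values to the original rows.
# how='left' keeps every original row; unmatched pairs get NaN.
result = df.merge(long_form, on=['ref1', 'ref2'], how='left')
print(result)
```

Printing wide and shifted side by side shows the one-row offset that produces new_value.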
Note that pivot will fail if there are duplicate (ref1, ref2) pairs. In that case, you want to enumerate the pairs:
df.merge(
    # number repeated (ref1, ref2) pairs so the pivot index stays unique
    df.assign(enum=df.groupby(['ref1','ref2']).cumcount())
      .pivot(index=['enum','ref1'], columns='ref2', values='value')
      .shift().stack()
      .reset_index(level='enum', drop=True)
      .reset_index(name='new_value'),
    on=['ref1','ref2'], how='left'
)
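
As a possible alternative that avoids pivot altogether, the rule can be written as a lookup: map each ref1 to the ref1 just before it in sorted order, then self-merge on that key together with ref2. This is a sketch, not the answer's method. The names order, prev_ref1, lookup and result are hypothetical, and keeping the last value when a (ref1, ref2) pair repeats is an assumption, since the question does not say how duplicates should be resolved.

```python
import pandas as pd

df = pd.DataFrame({
    'ref1': [1, 1, 3, 7, 7],
    'ref2': [1, 2, 1, 1, 2],
    'value': [1, 2, 3, 5, 6],
})

# Sorted distinct ref1 values, here [1, 3, 7].
order = sorted(df['ref1'].unique())

# Map each ref1 to its predecessor; the smallest ref1 has none.
prev_ref1 = dict(zip(order[1:], order[:-1]))

# Lookup table keyed by (prev_ref1, ref2).
# Assumption: if a (ref1, ref2) pair repeats, keep only its last value.
lookup = (
    df.drop_duplicates(['ref1', 'ref2'], keep='last')
      .rename(columns={'ref1': 'prev_ref1', 'value': 'new_value'})
)

result = (
    df.assign(prev_ref1=df['ref1'].map(prev_ref1))  # NaN where no predecessor exists
      .merge(lookup, on=['prev_ref1', 'ref2'], how='left')
      .drop(columns='prev_ref1')
)
print(result)
```

Because the lookup table has at most one row per key, the left merge keeps exactly one output row per input row.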
