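I am trying to compute the difference between consecutive rows, but I have run into a problem I don't know how to solve. Here is my sample code: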
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Place': ['Hanoi','Hanoi','Hanoi','Hanoi','Hanoi','Hochiminh','Hochiminh','Hochiminh','Hochiminh','Hochiminh'],
'Date': ['2022-04-01','2022-04-02','2022-04-03','2022-04-04','2022-04-05','2022-04-01','2022-04-02','2022-04-03','2022-04-04','2022-04-05'],
'Number': [0,2,4,6,8,12,17,20,26,28]})
This is the output:
       Place        Date  Number
0      Hanoi  2022-04-01       0
1      Hanoi  2022-04-02       2
2      Hanoi  2022-04-03       4
3      Hanoi  2022-04-04       6
4      Hanoi  2022-04-05       8
5  Hochiminh  2022-04-01      12
6  Hochiminh  2022-04-02      17
7  Hochiminh  2022-04-03      20
8  Hochiminh  2022-04-04      26
9  Hochiminh  2022-04-05      28
Then I used diff to compute the differences between the numbers:
df['diff'] = df['Number'].diff()
Its output:
       Place        Date  Number  diff
0      Hanoi  2022-04-01       0   NaN
1      Hanoi  2022-04-02       2   2.0
2      Hanoi  2022-04-03       4   2.0
3      Hanoi  2022-04-04       6   2.0
4      Hanoi  2022-04-05       8   2.0
5  Hochiminh  2022-04-01      12   4.0
6  Hochiminh  2022-04-02      17   5.0
7  Hochiminh  2022-04-03      20   3.0
8  Hochiminh  2022-04-04      26   6.0
9  Hochiminh  2022-04-05      28   2.0
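As you can see, in row 5 the diff column is computed by subtracting the Number in row 4 from the Number in row 5, but that is not what I want. I want row 5 to be reset to NaN or 0, because I want to compute the differences separately for each place.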
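To make the boundary issue concrete, here is a minimal sketch that rebuilds the same Place and Number columns. The variable names plain, shifted, and boundary are illustrative only. It shows that a plain diff is just each value minus the previous row's value, with no awareness of Place:

```python
import pandas as pd

df = pd.DataFrame({
    'Place': ['Hanoi'] * 5 + ['Hochiminh'] * 5,
    'Number': [0, 2, 4, 6, 8, 12, 17, 20, 26, 28],
})

# Series.diff() subtracts the previous row, whatever group it belongs to.
plain = df['Number'].diff()

# The same calculation written out by hand with shift().
shifted = df['Number'] - df['Number'].shift(1)

# Row 5 is the first Hochiminh row, yet it is compared with the last Hanoi row.
boundary = plain.iloc[5]  # 12 - 8

print(plain.equals(shifted))
print(boundary)
```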
This is my expected output:
       Place        Date  Number  diff
0      Hanoi  2022-04-01       0   NaN
1      Hanoi  2022-04-02       2   2.0
2      Hanoi  2022-04-03       4   2.0
3      Hanoi  2022-04-04       6   2.0
4      Hanoi  2022-04-05       8   2.0
5  Hochiminh  2022-04-01      12   0.0
6  Hochiminh  2022-04-02      17   5.0
7  Hochiminh  2022-04-03      20   3.0
8  Hochiminh  2022-04-04      26   6.0
9  Hochiminh  2022-04-05      28   2.0
Answer from a uj5u.com user:
df['diff'] = df.groupby('Place')['Number'].diff().fillna(0)
df
       Place        Date  Number  diff
0      Hanoi  2022-04-01       0   0.0
1      Hanoi  2022-04-02       2   2.0
2      Hanoi  2022-04-03       4   2.0
3      Hanoi  2022-04-04       6   2.0
4      Hanoi  2022-04-05       8   2.0
5  Hochiminh  2022-04-01      12   0.0
6  Hochiminh  2022-04-02      17   5.0
7  Hochiminh  2022-04-03      20   3.0
8  Hochiminh  2022-04-04      26   6.0
9  Hochiminh  2022-04-05      28   2.0
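Note that fillna(0) replaces the NaN in row 0 as well as the one in row 5. The sketch below keeps both variants side by side so either can be picked. The column names diff_nan and diff_zero are made up for illustration, and the sort step is an assumption: the sample data is already in date order within each place, but real data may not be.

```python
import pandas as pd

df = pd.DataFrame({
    'Place': ['Hanoi'] * 5 + ['Hochiminh'] * 5,
    'Date': ['2022-04-01', '2022-04-02', '2022-04-03',
             '2022-04-04', '2022-04-05'] * 2,
    'Number': [0, 2, 4, 6, 8, 12, 17, 20, 26, 28],
})

# Assumption: rows should be in date order within each place before differencing.
df = df.sort_values(['Place', 'Date']).reset_index(drop=True)

# diff() inside each group: the first row of every Place gets NaN.
df['diff_nan'] = df.groupby('Place')['Number'].diff()

# Same values, with the group-leading NaN entries replaced by 0.
df['diff_zero'] = df['diff_nan'].fillna(0)

print(df)
```

The expected output in the question mixes the two (NaN in row 0, 0.0 in row 5), so neither column reproduces it exactly. diff_nan marks every group start as missing, while diff_zero treats every group start as zero change.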
