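Suppose I have a DataFrame in which one column contains values that occur multiple times, forming groups (column A in the snippet). I want to create a new column that holds 1 for the first entry in each group that has an 'x' in column C, and 0 for all other entries. I managed the first part, but I could not find a good way to include the condition on the 'x' values. Is there a good way to do this?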
import pandas as pd
df = pd.DataFrame(
{
"A": ["0", "0", "1", "2", "2", "2"], # data to group by
"B": ["a", "b", "c", "d", "e", "f"], # some other irrelevant data to be preserved
"C": ["y", "x", "y", "x", "y", "x"], # only consider the 'x'
}
)
target = pd.DataFrame(
{
"A": ["0", "0", "1", "2", "2", "2"],
"B": ["a", "b", "c", "d", "e", "f"],
"C": ["y", "x", "y", "x", "y", "x"],
"D": [ 0, 1, 0, 1, 0, 0] # 1 for the first entry per group of 'A' where 'C' == 'x', else 0
}
)
# following partial solution doesn't account for filtering by 'x' in 'C'
df['D'] = df.groupby('A')['C'].transform(lambda x: [1 if i == 0 else 0 for i in range(len(x))])
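Answer:
In your case, slice the rows where C == 'x', call drop_duplicates on A to keep only the first such row per group, and assign the result back.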
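# Mark the first 'x' row of each group in A with 1; index alignment leaves NaN in every other row
df['D'] = df.loc[df.C == 'x'].drop_duplicates('A').assign(D=1)['D']
# Replace NaN with 0; assign the result back rather than calling fillna(inplace=True) on the column selection, which newer pandas versions warn about
df['D'] = df['D'].fillna(0)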
df
Out[149]:
   A  B  C    D
0  0  a  y  0.0
1  0  b  x  1.0
2  1  c  y  0.0
3  2  d  x  1.0
4  2  e  y  0.0
5  2  f  x  0.0
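The result above holds floats because of the intermediate NaN values, whereas the target frame in the question uses integers. As a possible alternative, here is a minimal sketch that builds D from a boolean mask and a per-group running count of 'x' rows, so no NaN step is needed. The helper names `is_x` and `x_rank` are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["0", "0", "1", "2", "2", "2"],  # data to group by
        "B": ["a", "b", "c", "d", "e", "f"],  # other data to be preserved
        "C": ["y", "x", "y", "x", "y", "x"],  # only rows with 'x' are considered
    }
)

# True where column C holds 'x'
is_x = df["C"].eq("x")

# Running count of 'x' rows within each group of A
# (the first 'x' row in a group gets the count 1)
x_rank = is_x.astype(int).groupby(df["A"]).cumsum()

# A row is flagged only if it is an 'x' row and the first one in its group
df["D"] = (is_x & x_rank.eq(1)).astype(int)

print(df)
```

With this approach D is an integer column from the start, and groups that contain no 'x' at all (such as group "1" here) simply stay at 0.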
