Given a DataFrame df as follows:
import numpy as np
import pandas as pd
data = [[1, 'A1', 'A1'], [2, 'A2', 'B2', 1, 1], [3, 'B3', 'B3', 3, 2], [4, None, None]]
df = pd.DataFrame(data, columns=['id', 'v1','v2','v3','v4'])
print(df)
Out:
   id    v1    v2   v3   v4
0   1    A1    A1  NaN  NaN
1   2    A2    B2  1.0  1.0
2   3    B3    B3  3.0  2.0
3   4  None  None  NaN  NaN
Suppose I need to check whether several pairs of columns hold the same values:
col_pair = {'v1': 'v2', 'v3': 'v4'}
How can I do this using col_pair (or some other approach) instead of repeating np.where for every pair, as below? Thanks.
df['v1_v2'] = np.where(df['v1'] == df['v2'], 1, 0)
df['v3_v4'] = np.where(df['v3'] == df['v4'], 1, 0)
Expected result:
   id    v1    v2   v3   v4  v1_v2  v3_v4
0   1    A1    A1  NaN  NaN      1    NaN
1   2    A2    B2  1.0  1.0      0      1
2   3    B3    B3  3.0  2.0      1      0
3   4  None  None  NaN  NaN    NaN    NaN
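Answer from a uj5u.com user:
You also need to test whether both values in each pair are missing, using DataFrame.isna together with DataFrame.all, and pass the conditions to numpy.select: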
for k, v in col_pair.items():
    # None when both values are missing, 1 when they are equal, otherwise 0
    df[f'{k}_{v}'] = np.select([df[[k, v]].isna().all(axis=1),
                                df[k] == df[v]], [None, 1], default=0)
Out:
   id    v1    v2   v3   v4  v1_v2  v3_v4
0   1    A1    A1  NaN  NaN      1   None
1   2    A2    B2  1.0  1.0      0      1
2   3    B3    B3  3.0  2.0      1      0
3   4  None  None  NaN  NaN   None   None
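Because numpy.select mixes None with integers, the new columns above end up with object dtype. If a numeric column is preferred, one possible variation is sketched below. It is not part of the original answer: the helper name compare_pairs is hypothetical, and it assumes pandas' nullable Int64 dtype is acceptable, so missing results show as <NA> rather than None or NaN.

```python
import pandas as pd

data = [[1, 'A1', 'A1'], [2, 'A2', 'B2', 1, 1], [3, 'B3', 'B3', 3, 2], [4, None, None]]
df = pd.DataFrame(data, columns=['id', 'v1', 'v2', 'v3', 'v4'])
col_pair = {'v1': 'v2', 'v3': 'v4'}


def compare_pairs(frame, pairs):
    """Return a copy of frame with one 0/1/<NA> flag column per column pair.

    Hypothetical helper for illustration only.
    """
    out = frame.copy()
    for left, right in pairs.items():
        # True only on rows where both columns of the pair are missing
        both_missing = out[[left, right]].isna().all(axis=1)
        # Element-wise equality, converted to nullable integers (True -> 1, False -> 0)
        flag = (out[left] == out[right]).astype('Int64')
        # Replace the flag with <NA> where both values are missing
        out[f'{left}_{right}'] = flag.mask(both_missing)
    return out


result = compare_pairs(df, col_pair)
print(result)
```

As in the numpy.select version, a row where only one of the two values is missing gets 0, since the values are not equal and not both missing. The input df is left unchanged because the helper works on a copy.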
