import pandas as pd
d1 = {'id': ['a','b','c'], 'ref': ['apple','orange','banana']}
df1 = pd.DataFrame(d1)
d2 = {'id': ['a','b','d'], 'ref': ['apple','orange','banana']}
df2 = pd.DataFrame(d2)
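I want to check whether the (id, ref) column pairs from df1 exist in df2, and I want to create a boolean column in df2 that records the result.
Expected output: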
d3 = {'id': ['a','b','d'], 'ref': ['apple','orange','banana'], 'check':[True,True,False]}
df2 = pd.DataFrame(d3)
I have tried the following, as well as a simple assignment with isin:
df2['check'] = df2[['id','ref']].isin(df1[['id','ref']].values.ravel()).any(axis=1)
df2['check'] = df2.apply(lambda x: x.isin(df1.stack())).any(axis=1)
How can I do this without a merge?
Answer from a uj5u.com user:
I'm not sure why you want to avoid merge, but you can use isin with tuples:
df2['check'] = df2[['id','ref']].apply(tuple, axis=1)\
.isin(df1[['id','ref']].apply(tuple, axis=1))
Output:
  id     ref  check
0  a   apple   True
1  b  orange   True
2  d  banana  False
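The tuple comparison above can also be written with `pd.MultiIndex`, which avoids the row-wise `apply`. The following is only a sketch under the same sample data; the names `cols`, `left`, and `right` are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['a', 'b', 'c'], 'ref': ['apple', 'orange', 'banana']})
df2 = pd.DataFrame({'id': ['a', 'b', 'd'], 'ref': ['apple', 'orange', 'banana']})

cols = ['id', 'ref']  # illustrative name: the columns that form the pair

# Build one MultiIndex per frame so each row becomes an (id, ref) tuple.
left = pd.MultiIndex.from_frame(df2[cols])
right = pd.MultiIndex.from_frame(df1[cols])

# MultiIndex.isin compares whole tuples, so the pairing between id and ref is kept.
df2['check'] = left.isin(right)
print(df2)
```

Because whole tuples are compared, the row `('d', 'banana')` is not matched even though `'banana'` appears in df1, which is the behaviour the flattened `isin` attempts in the question lose.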
Answer from a uj5u.com user:
I think this is what you are looking for:
d1 = {'id': ['a','b','c'], 'ref': ['apple','orange','banana']}
df1 = pd.DataFrame(d1)
d2 = {'id': ['a','b','d'], 'ref': ['apple','orange','banana']}
df2 = pd.DataFrame(d2)
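# Note: each column is tested independently here, so this does not guarantee
# that id and ref match on the same row of df2.
result = df1.loc[df1.id.isin(df2.id) & df1.ref.isin(df2.ref)]
That said, a merge will almost certainly be more efficient: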
#create a compound key with id ref
df1["key"] = df1.apply(lambda row: f'{row["id"]}_{row["ref"]}', axis=1)
df2["key"] = df2.apply(lambda row: f'{row["id"]}_{row["ref"]}', axis=1)
#merge df2 on df1 on compound key
df3 = df1.merge(df2, on="key")
#locate the matched keys in df1
result = df1.set_index("id").loc[df3.id_x]
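If a merge is acceptable after all, the boolean column from the question can be produced directly with the `indicator` option of `DataFrame.merge`, without building a string key. This is a sketch rather than the answerer's method; `cols` and `merged` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['a', 'b', 'c'], 'ref': ['apple', 'orange', 'banana']})
df2 = pd.DataFrame({'id': ['a', 'b', 'd'], 'ref': ['apple', 'orange', 'banana']})

cols = ['id', 'ref']  # illustrative name: the columns that form the pair

# Left-merge df2 against the unique pairs of df1. Dropping duplicates on the
# right side keeps the result at one row per row of df2, in df2's order.
merged = df2[cols].merge(
    df1[cols].drop_duplicates(), on=cols, how='left', indicator=True
)

# '_merge' is 'both' when the pair exists in df1 and 'left_only' otherwise.
# to_numpy() assigns by position, so df2's own index does not matter.
df2['check'] = (merged['_merge'] == 'both').to_numpy()
print(df2)
```

Merging on both columns at once also sidesteps a weakness of the `"id_ref"` string key: two different pairs can yield the same key if the values themselves contain underscores.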
