我有兩個都包含坐標的資料框。其中一個,df1,具有更好的解析度(帶小數)的坐標,我想加入它,df2它的解析度不太好:
import pandas as pd
df1 = pd.DataFrame({'x': [1.1, 2.2, 3.3],
'y': [2.3, 3.3, 4.1],
'val': [10,11,12]})
df2 = pd.DataFrame({'x': [1,2,3,5.5],
'y': [2,3,4,5.6]})
df1['x_org']=df1['x']
df1['y_org']=df1['y']
df1[['x','y']] = df1[['x','y']].round()
df1 = pd.merge(df1, df2, how='left', on=['x','y'])
df1.drop({'x','y'}, axis=1)
# rename...
上面的代碼正是我想要的,但它有點麻煩。有沒有更簡單的方法來實作這一點?
uj5u.com熱心網友回復:
采用:
df1.merge(df2,
how='left',
left_on=[df1['x'].round(), df1['y'].round()],
right_on=['x','y'],
suffixes=('','_')).drop(['x_','y_'], axis=1)
也可以洗掉以_動態結尾的列:
df = df1.merge(df2,
how='left',
left_on=[df1['x'].round(), df1['y'].round()],
right_on=['x','y'],
suffixes=('','_')).filter(regex='.*[^_]$')
print (df)
x y val
0 1.1 2.3 10
1 2.2 3.3 11
2 3.3 4.1 12
df = df1.merge(df2,
how='left',
left_on=[df1['x'].round(), df1['y'].round()],
right_on=['x','y'],
suffixes=('','_end')).filter(regex='.*(?<!_end)$')
print (df)
x y val
0 1.1 2.3 10
1 2.2 3.3 11
2 3.3 4.1 12
要么:
df = (df1.set_index(['x','y'], drop=False).rename(lambda x: round(x))
.merge(df2.set_index(['x','y']),
left_index=True,
right_index=True,
how='left').reset_index(drop=True))
print (df)
x y val
0 1.1 2.3 10
1 2.2 3.3 11
2 3.3 4.1 12
uj5u.com熱心網友回復:
IIUC,您可以將四舍五入的值作為連接鍵傳遞:
pd.merge(df1.rename(columns={'x': 'x_org', 'y': 'y_org'}),
df2,
how='left',
left_on=[df1['x'].round(), df1['x'].round()],
right_on=['x', 'y'])#.drop({'x','y'}, axis=1) # if x/y are unwanted
輸出:
x_org y_org val x y
0 1.1 2.3 10 1.0 1.0
1 2.2 3.3 11 2.0 2.0
2 3.3 4.1 12 3.0 3.0
轉載請註明出處,本文鏈接:https://www.uj5u.com/caozuo/442200.html
上一篇:Doctrine水合關系錯誤,但直接應用SQL后的結果集是正確的
下一篇:根據表中的多列過濾SQL中的資料
