我想問一下,如果在 SQL 中我可以像 JOIN ON CASE WHEN 那樣做,有沒有辦法在 Pandas 中做到這一點?
disease = [
{"City":"CH","Case_Recorded":5300,"Recovered":2839,"Deaths":2461},
{"City":"NY","Case_Recorded":1311,"Recovered":521,"Deaths":790},
{"City":"TX","Case_Recorded":1991,"Recovered":1413,"Deaths":578},
{"City":"AT","Case_Recorded":3381,"Recovered":3112,"Deaths":269},
{"City":"TX","Case_Recorded":3991,"Recovered":2810,"Deaths":1311},
{"City":"LA","Case_Recorded":2647,"Recovered":2344,"Deaths":303},
{"City":"LA","Case_Recorded":4410,"Recovered":3344,"Deaths":1066}
]
region = {"North": ["AT"], "West":["TX","LA"]}
所以我有2個虛擬字典,我已經將它轉換為資料框,首先是帶有案例的城市的名稱,我試圖找出這些城市屬于哪個地區。
Region|City
North|AT
West|TX
West|LA
None|NY
None|CH
所以我在 SQL 中的想法是在 case when 上使用 left,如果在加入 North 區域時結果為 null,然后加入 West 區域。但是,如果某個國家/地區有 15 或 30 個地區,我認為這會是個問題
uj5u.com熱心網友回復:
利用:
#get City without duplicates
df1 = pd.DataFrame(disease)[['City']].drop_duplicates()
#create DataFrame from region dictionary
region = {"North": ["AT"], "West":["TX","LA"]}
df2 = pd.DataFrame([(k, x) for k, v in region.items() for x in v],
columns=['Region','City'])
#append not matched cities to df2
out = pd.concat([df2, df1[~df1['City'].isin(df2['City'])]])
print (out)
Region City
0 North AT
1 West TX
2 West LA
0 NaN CH
1 NaN NY
如果順序不重要:
out = df2.merge(df1, how = 'right')
print (out)
Region City
0 NaN CH
1 NaN NY
2 West TX
3 North AT
4 West LA
uj5u.com熱心網友回復:
對不起,我不太確定你的預期結果是什么,你能再表達一下嗎?如果您的預期結果只是獲得城市的區域,則不需要有條件加入?例如:您可以將城市區域表轉換為每行每區域每城市,并直接與主 df 連接
disease = [
{"City":"CH","Case_Recorded":5300,"Recovered":2839,"Deaths":2461},
{"City":"NY","Case_Recorded":1311,"Recovered":521,"Deaths":790},
{"City":"TX","Case_Recorded":1991,"Recovered":1413,"Deaths":578},
{"City":"AT","Case_Recorded":3381,"Recovered":3112,"Deaths":269},
{"City":"TX","Case_Recorded":3991,"Recovered":2810,"Deaths":1311},
{"City":"LA","Case_Recorded":2647,"Recovered":2344,"Deaths":303},
{"City":"LA","Case_Recorded":4410,"Recovered":3344,"Deaths":1066}
]
region = [
{'City':'AT','Region':"North"},
{'City':'TX','Region':"West"},
{'City':'LA','Region':"West"}
]
df = pd.DataFrame(disease)
df_reg = pd.DataFrame(region)
df.merge( df_reg , on = 'City' , how = 'left' )
轉載請註明出處,本文鏈接:https://www.uj5u.com/yidong/525426.html
標籤:Python熊猫
