我想為我的df_cau2['continent']. 首先是我的 r 2 df:
country_continent
Continent
Country
Afghanistan Asia
Albania Europe
Algeria Africa
American Samoa Oceania
and
df_cau2
date home_team away_team home_score away_score tournament city country neutral
0 1872-11-30 Scotland England 0 0 Friendly Glasgow Scotland False
1 1873-03-08 England Scotland 4 2 Friendly London England False
2 1874-03-07 Scotland England 2 1 Friendly Glasgow Scotland False
創建新列continent,我使用 apply for df_cau2 像這樣:
def same_continent(home,away):
if country_continent.loc[home].Continent == country_continent.loc[away].Continent:
return country_continent.loc[home].Continent
return 'None'
df_cau2['continent']=df_cau2.apply(lambda x: same_continent(x['home_team'],x['away_team']),axis=1)
df_cau2.head()
使用 39480 行 df_cau2,此代碼運行速度太慢,如何更改我的代碼以使其運行得更快?我正在考慮使用np.select,但在這種情況下我不知道如何使用它。
這是我想要的結果:
date home_team away_team home_score away_score tournament city country neutral continent
7611 1970-09-11 Iran Turkey 1 1 Friendly Teheran Iran False None
31221 2009-03-11 Nepal Pakistan 1 0 Friendly Kathmandu Nepal False Asia
32716 2010-11-17 Colombia Peru 1 1 Friendly Bogotá Colombia False South America
謝謝
uj5u.com熱心網友回復:
continentIIUC,只有當home_team和away_team列位于同一大陸時,您才想設定列:
home_continent = df1['home_team'].map(df2.squeeze())
away_continent = df1['away_team'].map(df2.squeeze())
m = home_continent == away_continent
df1.loc[m, 'continent'] = home_continent.loc[m]
print(df1)
# Output
home_team away_team continent
0 Canada England NaN
1 France Spain Europe
2 China Japan Asia
設定 MRE
df1 = pd.DataFrame({'home_team': ['Canada', 'France', 'China'],
'away_team': ['England', 'Spain', 'Japan']})
print(df1)
df2 = pd.DataFrame({'Country': ['Canada', 'China', 'England',
'France', 'Japan', 'Spain'],
'Continent': ['North America', 'Asia', 'Europe',
'Europe', 'Asia', 'Europe']}).set_index('Country')
print(df2)
# Output df1
home_team away_team
0 Canada England
1 France Spain
2 China Japan
# Output df2
Continent
Country
Canada North America
China Asia
England Europe
France Europe
Japan Asia
Spain Europe
uj5u.com熱心網友回復:
考慮merge兩次大陸查找資料框以創建主大陸和客大陸列。并且由于您將擁有兩個大陸,因此有條件地分配新的共享大陸列numpy.where:
df_cau2 = (
df.cau2.merge(
country_continent.reset_index(),
left_on = "home_team",
right_on = "Country",
how = "left"
).merge(
country_continent.reset_index(),
left_on = "away_team",
right_on = "Country",
how = "left",
suffixes = ["_home", "_away"]
)
)
df_cau2["shared_continent"] = np.where(
df_cau2["Continent_home"].eq(df_cau2["Continent_away"]),
df_cau2["Continent_home"],
np.nan
)
轉載請註明出處,本文鏈接:https://www.uj5u.com/gongcheng/412354.html
標籤:
