Python pandas: merge/map with multiple matching values (XLOOKUP-style)

I have a dataframe of actor names:

df1

actor_id    actor_name
1           Brad Pitt
2           Nicole Kidman
3           Matthew Goode
4           Uma Thurman
5           Ethan Hawke

And another dataframe of the movies those actors appear in:

df2

actor_id    actor_movie                      movie_revenue_m
1           Once Upon a Time in Hollywood    150
2           The Others                       50
2           Moulin Rouge                     200
3           Stoker                           75
4           Kill Bill                        125
5           Gattaca                          85

I want to merge the two dataframes so that each actor is shown with their movie title and movie revenue, so I used the merge function:

df3 = df1.merge(df2, on='actor_id', how='left')

df3

actor_id    actor_name       actor_movie                      movie_revenue
1           Brad Pitt        Once Upon a Time in Hollywood    150
2           Nicole Kidman    The Others                       50
2           Nicole Kidman    Moulin Rouge                     200
3           Matthew Goode    Stoker                           75
4           Uma Thurman      Kill Bill                        125
5           Ethan Hawke      Gattaca                          85

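But this pulls in every movie, so Nicole Kidman is duplicated. I only want to show one movie per actor. How can I merge the dataframes without "duplicating" my list of actors?

How would I merge in the movie title that comes first alphabetically?

How would I merge in the title of the movie with the highest revenue?

Thanks!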

Reply from a uj5u.com user:

One approach is to go ahead with the merge and then filter the result set.
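
To make the starting point reproducible, here is a minimal sketch that rebuilds the two frames from the question and shows where the extra row comes from. The revenue column is named movie_revenue here, which is an assumption taken from the merged output (the df2 listing labels it movie_revenue_m).

```python
import pandas as pd

# Frames rebuilt from the question's listings.
df1 = pd.DataFrame({
    "actor_id": [1, 2, 3, 4, 5],
    "actor_name": ["Brad Pitt", "Nicole Kidman", "Matthew Goode",
                   "Uma Thurman", "Ethan Hawke"],
})
df2 = pd.DataFrame({
    "actor_id": [1, 2, 2, 3, 4, 5],
    "actor_movie": ["Once Upon a Time in Hollywood", "The Others",
                    "Moulin Rouge", "Stoker", "Kill Bill", "Gattaca"],
    # Assumed column name; the question's df2 header shows movie_revenue_m.
    "movie_revenue": [150, 50, 200, 75, 125, 85],
})

# A left merge keeps every df1 row and emits one output row per match in df2.
df3 = df1.merge(df2, on="actor_id", how="left")

# Any actor_id with more than one row after the merge has been "duplicated".
rows_per_actor = df3.groupby("actor_id").size()
duplicated_ids = rows_per_actor[rows_per_actor > 1].index.tolist()

print(len(df1), len(df3))
print(duplicated_ids)
```

The merge itself is behaving as designed: actor_id 2 matches two rows in df2, so the fix is to decide which of those rows to keep.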

First movie title in alphabetical order

# sort by actor name and movie title, then keep the first row for each actor
df3.sort_values(['actor_name', 'actor_movie']).groupby('actor_id', as_index=False).first()

    actor_id    actor_name       actor_movie                      movie_revenue
0   1           Brad Pitt        Once Upon a Time in Hollywood    150
1   2           Nicole Kidman    Moulin Rouge                     200
2   3           Matthew Goode    Stoker                           75
3   4           Uma Thurman      Kill Bill                        125
4   5           Ethan Hawke      Gattaca                          85
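
An alternative, which is not what the answer above does, is to reduce the movie table to one row per actor before merging, so the merge can never add rows. This is only a sketch: the frames are rebuilt from the question, the movie_revenue column name is an assumption, and first_alpha is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({
    "actor_id": [1, 2, 3, 4, 5],
    "actor_name": ["Brad Pitt", "Nicole Kidman", "Matthew Goode",
                   "Uma Thurman", "Ethan Hawke"],
})
df2 = pd.DataFrame({
    "actor_id": [1, 2, 2, 3, 4, 5],
    "actor_movie": ["Once Upon a Time in Hollywood", "The Others",
                    "Moulin Rouge", "Stoker", "Kill Bill", "Gattaca"],
    "movie_revenue": [150, 50, 200, 75, 125, 85],  # assumed column name
})

# Sort so the alphabetically first title per actor comes first,
# then keep only that first row for each actor_id.
first_alpha = (
    df2.sort_values(["actor_id", "actor_movie"])
       .drop_duplicates("actor_id", keep="first")
)

# actor_id is now unique on both sides; validate="one_to_one" asks pandas
# to raise a MergeError if that assumption is ever violated.
result = df1.merge(first_alpha, on="actor_id", how="left",
                   validate="one_to_one")
print(result)
```

Note that sort_values compares strings by code point, so uppercase letters sort before lowercase ones. If titles are mixed-case, passing key=lambda s: s.str.lower() to sort_values is one way to get a case-insensitive order.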

Title of the movie with the highest revenue

# sort by actor name (ascending) and revenue (descending), then keep the first row for each actor
df3.sort_values(['actor_name', 'movie_revenue'], ascending=[True, False]).groupby('actor_id', as_index=False).first()

    actor_id    actor_name       actor_movie                      movie_revenue
0   1           Brad Pitt        Once Upon a Time in Hollywood    150
1   2           Nicole Kidman    Moulin Rouge                     200
2   3           Matthew Goode    Stoker                           75
3   4           Uma Thurman      Kill Bill                        125
4   5           Ethan Hawke      Gattaca                          85
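
The highest-revenue case can also be handled without sorting, by asking each group for the index label of its maximum. The sketch below is a swapped-in technique (groupby with idxmax), not the answer's method. It again assumes the movie_revenue column name, and top_movie is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({
    "actor_id": [1, 2, 3, 4, 5],
    "actor_name": ["Brad Pitt", "Nicole Kidman", "Matthew Goode",
                   "Uma Thurman", "Ethan Hawke"],
})
df2 = pd.DataFrame({
    "actor_id": [1, 2, 2, 3, 4, 5],
    "actor_movie": ["Once Upon a Time in Hollywood", "The Others",
                    "Moulin Rouge", "Stoker", "Kill Bill", "Gattaca"],
    "movie_revenue": [150, 50, 200, 75, 125, 85],  # assumed column name
})

# For each actor, find the index label of the row with the largest revenue.
top_idx = df2.groupby("actor_id")["movie_revenue"].idxmax()

# Select those rows: exactly one movie per actor that appears in df2.
top_movie = df2.loc[top_idx]

# Left merge keeps every actor in df1 without adding rows.
result = df1.merge(top_movie, on="actor_id", how="left",
                   validate="one_to_one")
print(result)
```

With a left merge, an actor in df1 who has no rows in df2 would still appear, with NaN in the movie columns. When two movies tie for the top revenue, idxmax returns the first one it encounters.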

Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/527136.html

Tags: Python, pandas, merge, lookup
