Pandas merge does not work when values match exactly

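Below are my code and dataframes. stats_df is much larger. I'm not sure whether it matters, but the column values are exactly as they appear in the actual files. Even though both DataFrames have the same PlayerID value "20000852", I can't merge the two DataFrames without losing "Alex Len".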

import pandas as pd

stats_df = pd.read_csv('stats_todate.csv')
matchup_df = pd.read_csv('matchup.csv')

new_df = pd.merge(stats_df, matchup_df[['PlayerID','Matchup','Started','GameStatus']])

I also tried:

stats_df['PlayerID'] = stats_df['PlayerID'].astype(str)
matchup_df['PlayerID'] = matchup_df['PlayerID'].astype(str)
stats_df['PlayerID'] = stats_df['PlayerID'].str.strip()
matchup_df['PlayerID'] = matchup_df['PlayerID'].str.strip()

Any ideas?

Here are my two dataframes:

DF1:

PlayerID  SeasonType  Season  Name                 Team  Position
20001713  1           2018    A.J. Hammons         MIA   C
20002725  2           2022    A.J. Lawson          ATL   SG
20002038  2           2021    ?‰lie Okobo          BKN   PG
20002742  2           2022    Aamir Simms          NY    PF
20000518  3           2018    Aaron Brooks         MIN   PG
20000681  1           2022    Aaron Gordon         DEN   PF
20001395  1           2018    Aaron Harrison       DAL   SG
20002680  1           2022    Aaron Henry          PHI   SF
20002005  1           2022    Aaron Holiday        PHO   PG
20001981  3           2018    Aaron Jackson        HOU   PF
20002539  1           2022    Aaron Nesmith        BOS   SF
20002714  1           2022    Aaron Wiggins        OKC   SG
20001721  1           2022    Abdel Nader          PHO   SF
20002251  2           2020    Abdul Gaddy          OKC   PG
20002458  1           2021    Adam Mokoka          CHI   SG
20002619  1           2022    Ade Murkey           SAC   PF
20002311  1           2022    Admiral Schofield    ORL   PF
20000783  1           2018    Adreian Payne        ORL   PF
20002510  1           2022    Ahmad Caver          IND   PG
20002498  2           2020    Ahmed Hill           CHA   PG
20000603  1           2022    Al Horford           BOS   PF
20000750  3           2018    Al Jefferson         IND   C
20001645  1           2019    Alan Williams        BKN   PF
20000837  1           2022    Alec Burks           NY    SG
20001882  1           2018    Alec Peters          PHO   PF
20002850  1           2022    Aleem Ford           ORL   SF
20002542  1           2022    Aleksej Poku??evski  OKC   PF
20002301  3           2021    Alen Smailagic       GS    PF
20001763  1           2019    Alex Abrines         OKC   SG
20001801  1           2022    Alex Caruso          CHI   SG
20000852  1           2022    Alex Len             SAC   C

DF2:

PlayerID  Name                 Date      Started  Opponent  GameStatus  Matchup
20000681  Aaron Gordon         4/1/2022  1        MIN                   16
20002005  Aaron Holiday        4/1/2022  0        MEM                   21
20002539  Aaron Nesmith        4/1/2022  0        IND                   13
20002714  Aaron Wiggins        4/1/2022  1        DET                   14
20002311  Admiral Schofield    4/1/2022  0        TOR                   10
20000603  Al Horford           4/1/2022  1        IND                   13
20002542  Aleksej Poku??evski  4/1/2022  1        DET                   14
20000852  Alex Len             4/1/2022  1        HOU                   22

A helpful uj5u.com user replied:

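You need to specify the column to merge on with the on keyword argument:

new_df = pd.merge(stats_df, matchup_df[['PlayerID','Matchup','Started','GameStatus']], on=['PlayerID'])

Otherwise pandas merges on every column the two DataFrames share, so a row is dropped whenever any shared column besides PlayerID (for instance Started or GameStatus, if stats_df also has them) holds a different value in the two frames.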
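To make the difference concrete, here is a minimal sketch using two tiny made-up frames. The Started values on the stats side are invented for illustration (the question does not show them), and the names stats and matchup are stand-ins for stats_df and matchup_df.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames.
# The Started values in `stats` are assumed, not taken from the question.
stats = pd.DataFrame({
    "PlayerID": [20000681, 20000852],
    "Name": ["Aaron Gordon", "Alex Len"],
    "Started": [1, 0],
})
matchup = pd.DataFrame({
    "PlayerID": [20000681, 20000852],
    "Started": [1, 1],
    "Matchup": [16, 22],
})

# No `on`: pandas joins on every shared column (PlayerID and Started),
# so Alex Len is dropped because his Started values differ.
default_merge = pd.merge(stats, matchup)

# Explicit key: both players survive; the overlapping Started column
# is kept twice, with the default _x / _y suffixes.
keyed_merge = pd.merge(stats, matchup, on="PlayerID")

print(default_merge["Name"].tolist())
print(keyed_merge.columns.tolist())
```

If the duplicated Started_x / Started_y columns are unwanted, one option is to drop the overlapping column from one frame before merging.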

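Here is the explanation from the pandas documentation:

on: label or list. Column or index level names to join on. These must be found in both DataFrames. If on is None and not merging on indexes, then this defaults to the intersection of the columns in both DataFrames.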
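One way to see which columns that default intersection contains, and which rows fail to match, is sketched below. The frames are again small hypothetical stand-ins with invented Started values; indicator=True is a standard merge option that adds a _merge column describing where each row came from.

```python
import pandas as pd

# Hypothetical stand-ins for stats_df and matchup_df (values assumed).
stats = pd.DataFrame({
    "PlayerID": [20000681, 20000852],
    "Name": ["Aaron Gordon", "Alex Len"],
    "Started": [1, 0],
})
matchup = pd.DataFrame({
    "PlayerID": [20000681, 20000852],
    "Started": [1, 1],
    "Matchup": [16, 22],
})

# Columns that a merge without `on` would use as join keys.
shared = stats.columns.intersection(matchup.columns).tolist()

# A left merge with indicator=True keeps every stats row and labels
# rows that found no partner in matchup as "left_only".
check = stats.merge(matchup, how="left", indicator=True)
unmatched = check.loc[check["_merge"] == "left_only", "Name"].tolist()

print(shared)
print(unmatched)
```

Running the same check on the real frames should show whether the lost row is caused by extra shared columns rather than by the PlayerID values themselves.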

Please credit the source when reposting. Link to this article: https://www.uj5u.com/shujuku/454535.html
