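Below are my code and dataframes. stats_df is much larger. I'm not sure whether it matters, but the column values are exactly as they appear in the actual files. Even though both DFs contain the same PlayerID value "20000852", I can't merge the two DFs without losing "Alex Len".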
import pandas as pd
stats_df = pd.read_csv('stats_todate.csv')
matchup_df = pd.read_csv('matchup.csv')
new_df = pd.merge(stats_df, matchup_df[['PlayerID','Matchup','Started','GameStatus']])
I also tried:
stats_df['PlayerID'] = stats_df['PlayerID'].astype(str)
matchup_df['PlayerID'] = matchup_df['PlayerID'].astype(str)
stats_df['PlayerID'] = stats_df['PlayerID'].str.strip()
matchup_df['PlayerID'] = matchup_df['PlayerID'].str.strip()
Any ideas?
Here are my two dataframes:
DF1:
PlayerID SeasonType Season Name Team Position
20001713 1 2018 A.J. Hammons MIA C
20002725 2 2022 A.J. Lawson ATL SG
20002038 2 2021 ?‰lie Okobo BKN PG
20002742 2 2022 Aamir Simms NY PF
20000518 3 2018 Aaron Brooks MIN PG
20000681 1 2022 Aaron Gordon DEN PF
20001395 1 2018 Aaron Harrison DAL SG
20002680 1 2022 Aaron Henry PHI SF
20002005 1 2022 Aaron Holiday PHO PG
20001981 3 2018 Aaron Jackson HOU PF
20002539 1 2022 Aaron Nesmith BOS SF
20002714 1 2022 Aaron Wiggins OKC SG
20001721 1 2022 Abdel Nader PHO SF
20002251 2 2020 Abdul Gaddy OKC PG
20002458 1 2021 Adam Mokoka CHI SG
20002619 1 2022 Ade Murkey SAC PF
20002311 1 2022 Admiral Schofield ORL PF
20000783 1 2018 Adreian Payne ORL PF
20002510 1 2022 Ahmad Caver IND PG
20002498 2 2020 Ahmed Hill CHA PG
20000603 1 2022 Al Horford BOS PF
20000750 3 2018 Al Jefferson IND C
20001645 1 2019 Alan Williams BKN PF
20000837 1 2022 Alec Burks NY SG
20001882 1 2018 Alec Peters PHO PF
20002850 1 2022 Aleem Ford ORL SF
20002542 1 2022 Aleksej Poku??evski OKC PF
20002301 3 2021 Alen Smailagic GS PF
20001763 1 2019 Alex Abrines OKC SG
20001801 1 2022 Alex Caruso CHI SG
20000852 1 2022 Alex Len SAC C
DF2:
PlayerID Name Date Started Opponent GameStatus Matchup
20000681 Aaron Gordon 4/1/2022 1 MIN 16
20002005 Aaron Holiday 4/1/2022 0 MEM 21
20002539 Aaron Nesmith 4/1/2022 0 IND 13
20002714 Aaron Wiggins 4/1/2022 1 DET 14
20002311 Admiral Schofield 4/1/2022 0 TOR 10
20000603 Al Horford 4/1/2022 1 IND 13
20002542 Aleksej Poku??evski 4/1/2022 1 DET 14
20000852 Alex Len 4/1/2022 1 HOU 22
Answer from a uj5u.com user:
You need to specify the column to merge on with the on keyword argument:
new_df = pd.merge(stats_df, matchup_df[['PlayerID','Matchup','Started','GameStatus']], on=['PlayerID'])
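Otherwise the merge uses every column the two dataframes share, so a row is dropped whenever any of those shared columns differs, even if PlayerID matches.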
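To make that concrete, here is a minimal sketch. The two small frames are made up for illustration, and it assumes the real stats_df also has a Started column (the question only says stats_df is much larger, so this is a guess at the cause, not a confirmed one).

```python
import pandas as pd

# Hypothetical miniature frames; the real stats_df has many more columns.
stats_df = pd.DataFrame({
    "PlayerID": [20000681, 20000852],
    "Name": ["Aaron Gordon", "Alex Len"],
    "Started": [1, 0],  # assumed extra shared column; differs for Alex Len
})
matchup_df = pd.DataFrame({
    "PlayerID": [20000681, 20000852],
    "Started": [1, 1],
    "Matchup": [16, 22],
})

# No `on`: pandas joins on every shared column (PlayerID AND Started),
# so Alex Len is dropped because his Started values disagree.
implicit = pd.merge(stats_df, matchup_df)

# Explicit key: only PlayerID has to match. The clashing Started columns
# are kept side by side as Started_x and Started_y.
explicit = pd.merge(stats_df, matchup_df, on="PlayerID")

print(implicit["Name"].tolist())
print(explicit["Name"].tolist())
```

With on given, the overlapping non-key columns get the default _x and _y suffixes, so you may want to drop or rename one of them afterwards.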
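Here is the explanation from the pandas documentation:
on: label or list. Column or index level names to join on. These must be found in both DataFrames. If on is None and not merging on indexes, then this defaults to the intersection of the columns in both DataFrames.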
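If you want to check this on your own data, one possible approach is to list the column intersection yourself and then run a left merge with indicator=True to see which rows found no partner. The frames below are again hypothetical stand-ins for the real files, and the Started column in stats_df is an assumption.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files.
stats_df = pd.DataFrame({
    "PlayerID": [20000681, 20000852, 20001713],
    "Name": ["Aaron Gordon", "Alex Len", "A.J. Hammons"],
    "Started": [1, 0, 0],  # assumed to exist in the real stats_df
})
matchup_df = pd.DataFrame({
    "PlayerID": [20000681, 20000852],
    "Started": [1, 1],
    "Matchup": [16, 22],
})

# Columns pandas would join on if `on` were omitted.
shared = stats_df.columns.intersection(matchup_df.columns).tolist()
print(shared)

# Left merge on the key only; the _merge column records where each row came from.
checked = pd.merge(
    stats_df,
    matchup_df[["PlayerID", "Matchup"]],
    on="PlayerID",
    how="left",
    indicator=True,
)

# Rows present in stats_df that have no match in matchup_df.
unmatched = checked.loc[checked["_merge"] == "left_only", "Name"].tolist()
print(unmatched)
```

If shared contains anything besides PlayerID, that is most likely why rows disappear in the original merge.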
Please credit the source when reposting. Link to this article: https://www.uj5u.com/shujuku/454535.html
