I have a dataset of matches with the home team's result for each match:
match_date home away home_result
2021-11-22 team1 team2 Win
2021-11-22 team3 team4 Win
2021-11-23 team1 team8 Lose
2021-11-23 team6 team7 Win
2021-11-25 team1 team2 Win
2021-11-25 team3 team8 Lose
2021-11-25 team1 team5 Lose
2021-11-25 team6 team5 Win
2021-11-28 team3 team1 Lose
2021-11-29 team1 team5 Win
2021-11-29 team6 team9 Win
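I want to create a new column that holds, for each home team, its results from earlier home matches before the current one. For example, team1 on 2021-11-22 (no previous matches), on 2021-11-23 (previous match: team1 Win), on 2021-11-25 (previous matches: team1 Win, Lose), and on 2021-11-29 (previous matches: team1 Win, Lose, Win, Lose). This is the expected column: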
match_date home away home_result home_team_previous_results
2021-11-22 team1 team2 Win NaN
2021-11-22 team3 team4 Win NaN
2021-11-23 team1 team8 Lose [("Win","2021-11-22")]
2021-11-23 team6 team7 Win NaN
2021-11-25 team1 team2 Win [("Win","2021-11-22"), ("Lose","2021-11-23")]
2021-11-25 team3 team8 Lose [("Win","2021-11-22")]
2021-11-25 team1 team5 Lose [("Win","2021-11-22"), ("Lose","2021-11-23"), ("Win","2021-11-25")]
2021-11-25 team6 team5 Win [("Win","2021-11-23")]
2021-11-28 team3 team1 Lose [("Win","2021-11-22"), ("Lose","2021-11-25")]
2021-11-29 team1 team5 Win [("Win","2021-11-22"), ("Lose","2021-11-23"), ("Win","2021-11-25"), ("Lose","2021-11-25")]
2021-11-29 team6 team9 Win [("Win","2021-11-23"), ("Win","2021-11-25")]
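To reproduce the problem, the sample data above can be loaded into a DataFrame roughly as follows. This is only a sketch: the variable name `raw` and the choice of `read_csv` with a whitespace separator are assumptions for illustration, and `match_date` is deliberately left as plain strings so it prints the same way as in the tables.

```python
import io

import pandas as pd

# Sample rows copied from the question (whitespace-separated).
raw = """match_date home away home_result
2021-11-22 team1 team2 Win
2021-11-22 team3 team4 Win
2021-11-23 team1 team8 Lose
2021-11-23 team6 team7 Win
2021-11-25 team1 team2 Win
2021-11-25 team3 team8 Lose
2021-11-25 team1 team5 Lose
2021-11-25 team6 team5 Win
2021-11-28 team3 team1 Lose
2021-11-29 team1 team5 Win
2021-11-29 team6 team9 Win
"""

# sep=r"\s+" splits on any run of whitespace; dates stay as strings.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(df.shape)
```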
A uj5u.com user replied:
A rough solution:
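import numpy as np
import pandas as pd

# For each home team, row i collects the (home_result, match_date) pairs
# of that team's earlier home matches; an empty list becomes NaN.
df['home_team_previous_results'] = (
    df.groupby('home')
    .apply(
        lambda x: pd.Series(
            [
                [
                    tuple([row[col] for col in ['home_result', 'match_date']])
                    for _, row in x.iloc[0:i].iterrows()
                ] or np.nan
                for i in range(len(x))
            ],
            index=x.index)
    ).droplevel(0)
)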
Output:
>>> df
match_date home away home_result home_team_previous_results
0 2021-11-22 team1 team2 Win NaN
1 2021-11-22 team3 team4 Win NaN
2 2021-11-23 team1 team8 Lose [(Win, 2021-11-22)]
3 2021-11-23 team6 team7 Win NaN
4 2021-11-25 team1 team2 Win [(Win, 2021-11-22), (Lose, 2021-11-23)]
5 2021-11-25 team3 team8 Lose [(Win, 2021-11-22)]
6 2021-11-25 team1 team5 Lose [(Win, 2021-11-22), (Lose, 2021-11-23), (Win, ...
7 2021-11-25 team6 team5 Win [(Win, 2021-11-23)]
8 2021-11-28 team3 team1 Lose [(Win, 2021-11-22), (Lose, 2021-11-25)]
9 2021-11-29 team1 team5 Win [(Win, 2021-11-22), (Lose, 2021-11-23), (Win, ...
10 2021-11-29 team6 team9 Win [(Win, 2021-11-23), (Win, 2021-11-25)]
One-line version:
df['home_team_previous_results'] = df.groupby('home').apply(lambda x: pd.Series([[tuple([row[col] for col in ['home_result', 'match_date']]) for _, row in x.iloc[0:i].iterrows()] or np.nan for i in range(len(x))], index=x.index)).droplevel(0)
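The same per-team idea may be easier to follow when the lambda is pulled out into a named helper. The sketch below is not the answer's code: `previous_results` is a hypothetical helper name, the five-row frame is just the first rows of the question's sample, and it assumes the rows are already in chronological order. It loops over the groups and concatenates the pieces instead of calling `groupby(...).apply(...)`.

```python
import numpy as np
import pandas as pd


def previous_results(group: pd.DataFrame) -> pd.Series:
    """For one team's rows, list the (result, date) pairs of all earlier rows."""
    pairs = list(zip(group["home_result"], group["match_date"]))
    # Row i sees pairs[:i]; an empty history becomes NaN, as in the expected output.
    history = [pairs[:i] or np.nan for i in range(len(pairs))]
    return pd.Series(history, index=group.index, dtype=object)


# First five rows of the question's sample data.
df = pd.DataFrame({
    "match_date": ["2021-11-22", "2021-11-22", "2021-11-23", "2021-11-23", "2021-11-25"],
    "home": ["team1", "team3", "team1", "team6", "team1"],
    "away": ["team2", "team4", "team8", "team7", "team2"],
    "home_result": ["Win", "Win", "Lose", "Win", "Win"],
})

# One Series per team, each keeping the original row labels, then aligned back by index.
parts = [previous_results(group) for _, group in df.groupby("home")]
df["home_team_previous_results"] = pd.concat(parts)
print(df)
```

Because each piece keeps the original row labels, the assignment lines up with `df` without needing `droplevel`.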
A uj5u.com user replied:
Unfortunately, I don't believe pandas supports an efficient solution for this.
import pandas as pd

assert isinstance(df.index, pd.RangeIndex)  # This solution assumes a RangeIndex
df['home_team_previous_results'] = pd.Series(dtype=object)
team_frames = dict(list(df.groupby('home')))
for i, row in df.iterrows():
    # Label-based slice: this team's home matches that appear before row i
    previous = team_frames[row['home']].loc[:i-1, ['home_result', 'match_date']]
    records = list(previous.to_records(index=False)) or float('nan')
    df.at[i, 'home_team_previous_results'] = records
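If per-row slicing becomes too slow, one possible alternative is a single pass that keeps a running history per team in a plain dictionary. This is a sketch under stated assumptions, not something taken from the answers above: it assumes the rows are already sorted chronologically, the names `history` and `column` are made up for illustration, and the five-row frame is just the first rows of the question's sample.

```python
from collections import defaultdict

import numpy as np
import pandas as pd

# First five rows of the question's sample data.
df = pd.DataFrame({
    "match_date": ["2021-11-22", "2021-11-22", "2021-11-23", "2021-11-23", "2021-11-25"],
    "home": ["team1", "team3", "team1", "team6", "team1"],
    "away": ["team2", "team4", "team8", "team7", "team2"],
    "home_result": ["Win", "Win", "Lose", "Win", "Win"],
})

history = defaultdict(list)  # team -> [(result, date), ...] seen so far
column = []
for home, result, date in zip(df["home"], df["home_result"], df["match_date"]):
    seen = history[home]
    # Copy the list so later appends do not leak into earlier rows.
    column.append(list(seen) if seen else np.nan)
    seen.append((result, date))

df["home_team_previous_results"] = pd.Series(column, index=df.index, dtype=object)
print(df)
```

Each row is visited once, and the approach does not depend on the index being a `RangeIndex`. The total size of the stored lists still grows with the square of the number of matches per team, since every row keeps its own copy of the history.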
