I am working with a pandas DataFrame. In the data below, each team has points for every month (1-6) of every year (2020, 2019, 2018).
month team points2020 points2019 points2018
1 team1 50 10 5
2 team1 20 40 2
3 team1 12 14 17
4 team1 8 9 3
5 team1 2 3 1
6 team1 30 18 60
1 team2 8 9 10
2 team2 40 70 30
3 team2 25 19 34
4 team2 88 70 1
5 team2 23 45 5
6 team2 55 77 90
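What I want to show is, for each month, only the team with the lowest points in each year.
So, for example, based on the data above, for month "1" of "points2020" I only want team2 returned in the "team" column, because team2 has the lowest "points2020" value.
For month "1" of points2019 I only want team2 returned in the team column, because team2 has the lowest "points2019" value, and so on.
How would I achieve this?
Example of the desired output: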
month year team points
1 2020 team2 8
2 2020 team1 20
3 2020 team1 12
4 2020 team1 8
5 2020 team1 2
6 2020 team1 30
1 2019 team2 9
2 2019 team1 40
3 2019 team1 14
4 2019 team1 9
5 2019 team1 3
6 2019 team1 18
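For anyone who wants to run the answers that follow, here is a minimal sketch of the sample data. The values are copied from the table in the question; the variable name `df` is an assumption chosen to match what the answers use.

```python
import pandas as pd

# Sample data copied from the question's table (wide format: one points column per year).
df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6] * 2,
    "team": ["team1"] * 6 + ["team2"] * 6,
    "points2020": [50, 20, 12, 8, 2, 30, 8, 40, 25, 88, 23, 55],
    "points2019": [10, 40, 14, 9, 3, 18, 9, 70, 19, 70, 45, 77],
    "points2018": [5, 2, 17, 3, 1, 60, 10, 30, 34, 1, 5, 90],
})
print(df)
```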
uj5u.com user reply:
Set the team column as the index, group by month, then use idxmin to extract the team (the index label) with the lowest points:
out = df.set_index('team').groupby('month', as_index=False).idxmin()
print(out)
# Output
month points2020 points2019 points2018
0 1 team2 team2 team1
1 2 team1 team1 team1
2 3 team1 team1 team1
3 4 team1 team1 team2
4 5 team1 team1 team1
5 6 team1 team1 team1
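The result above is still wide (one column per year), while the question asks for one row per month and year. A possible sketch of that extra step, assuming the sample data from the question and using melt to reshape the idxmin result; the names `out` and `long` are illustrative only:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6] * 2,
    "team": ["team1"] * 6 + ["team2"] * 6,
    "points2020": [50, 20, 12, 8, 2, 30, 8, 40, 25, 88, 23, 55],
    "points2019": [10, 40, 14, 9, 3, 18, 9, 70, 19, 70, 45, 77],
    "points2018": [5, 2, 17, 3, 1, 60, 10, 30, 34, 1, 5, 90],
})

# Team with the lowest points per month, one column per year (wide).
out = df.set_index("team").groupby("month").idxmin().reset_index()

# Reshape to long format: one row per (month, year) holding the winning team.
long = out.melt(id_vars="month", var_name="year", value_name="team")

# Turn "points2020" into the integer 2020.
long["year"] = long["year"].str.replace("points", "", regex=False).astype(int)
print(long)
```

This keeps only the team name; the points value itself would have to be looked up separately.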
uj5u.com user reply:
Using df.melt to turn the columns into rows and then finding the rows with the minimum value after a groupby worked for me:
First, convert the points columns into rows (this creates the "year" and "points" columns):
>> df = df.melt(id_vars=["month", "team"], var_name="year", value_name="points")
>> print(df.head())
month team year points
0 1 team1 points2020 50
1 2 team1 points2020 20
2 3 team1 points2020 12
3 4 team1 points2020 8
4 5 team1 points2020 2
For each month and year, find the row with the lowest points:
>> df = df.loc[df.groupby(["month", "year"]).points.idxmin()]
Sort the values so they match the expected output:
>> print(df.sort_values(["year", "month"]))
month team year points
24 1 team1 points2018 5
25 2 team1 points2018 2
26 3 team1 points2018 17
33 4 team2 points2018 1
28 5 team1 points2018 1
29 6 team1 points2018 60
18 1 team2 points2019 9
13 2 team1 points2019 40
14 3 team1 points2019 14
15 4 team1 points2019 9
16 5 team1 points2019 3
17 6 team1 points2019 18
6 1 team2 points2020 8
1 2 team1 points2020 20
2 3 team1 points2020 12
3 4 team1 points2020 8
4 5 team1 points2020 2
5 6 team1 points2020 30
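The melt steps above can be put together into one runnable sketch. It assumes the sample data from the question, and additionally strips the "points" prefix so that year is numeric and sorts newest year first, as in the desired output. The names `long` and `result` are illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6] * 2,
    "team": ["team1"] * 6 + ["team2"] * 6,
    "points2020": [50, 20, 12, 8, 2, 30, 8, 40, 25, 88, 23, 55],
    "points2019": [10, 40, 14, 9, 3, 18, 9, 70, 19, 70, 45, 77],
    "points2018": [5, 2, 17, 3, 1, 60, 10, 30, 34, 1, 5, 90],
})

# Wide to long: one row per (month, team, year).
long = df.melt(id_vars=["month", "team"], var_name="year", value_name="points")

# "points2020" -> 2020
long["year"] = long["year"].str.extract(r"(\d+)", expand=False).astype(int)

# Keep the row with the lowest points for each (month, year).
result = (
    long.loc[long.groupby(["month", "year"])["points"].idxmin()]
    .sort_values(["year", "month"], ascending=[False, True])
    .reset_index(drop=True)[["month", "year", "team", "points"]]
)
print(result)
```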
uj5u.com user reply:
Try this:
s = df.set_index(['month','team']).stack().rename_axis(['month','team','year'])
(s.loc[s.groupby(level=[0,2]).idxmin()]
.sort_index(level=[2,0],ascending=[0,1])
.reset_index(name='points')
.assign(year = lambda x: x['year'].str.extract(r'(\d+)',expand=False)))
Output:
month team year points
0 1 team2 2020 8
1 2 team1 2020 20
2 3 team1 2020 12
3 4 team1 2020 8
4 5 team1 2020 2
5 6 team1 2020 30
6 1 team2 2019 9
7 2 team1 2019 40
8 3 team1 2019 14
9 4 team1 2019 9
10 5 team1 2019 3
11 6 team1 2019 18
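One thing worth knowing about idxmin is that it returns only the first occurrence when several rows share the minimum. If tied teams should all be kept, a possible sketch is to compare each row against its group minimum with transform. The small frame below is hypothetical (it is not the question's data) and exists only to show a tie.

```python
import pandas as pd

# Hypothetical long-format data with a tie in month 1 (not from the question).
long = pd.DataFrame({
    "month": [1, 1, 2, 2],
    "year": [2020, 2020, 2020, 2020],
    "team": ["team1", "team2", "team1", "team2"],
    "points": [8, 8, 20, 40],
})

# Broadcast each group's minimum back onto its rows.
group_min = long.groupby(["month", "year"])["points"].transform("min")

# Keep every row that equals its group's minimum, so ties survive.
tied = long[long["points"] == group_min]
print(tied)
```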
uj5u.com user reply:
You need to convert from wide to long before computing the groupby aggregation:
(
pd.wide_to_long(df, stubnames="points", i=["month", "team"], j="year")
.reset_index()
.groupby(["month", "year"], as_index=False, sort=False)
.agg(points=("points", "min"))
)
month year points
0 1 2020 8
1 1 2019 9
2 1 2018 5
3 2 2020 20
4 2 2019 40
5 2 2018 2
6 3 2020 12
7 3 2019 14
8 3 2018 17
9 4 2020 8
10 4 2019 9
11 4 2018 1
12 5 2020 2
13 5 2019 3
14 5 2018 1
15 6 2020 30
16 6 2019 18
17 6 2018 60
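The aggregation above returns the minimum points but drops the team column that the question asks for. A possible variation, assuming the sample data from the question, keeps the whole row by selecting with idxmin instead of aggregating with min; `long` and `result` are illustrative names.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6] * 2,
    "team": ["team1"] * 6 + ["team2"] * 6,
    "points2020": [50, 20, 12, 8, 2, 30, 8, 40, 25, 88, 23, 55],
    "points2019": [10, 40, 14, 9, 3, 18, 9, 70, 19, 70, 45, 77],
    "points2018": [5, 2, 17, 3, 1, 60, 10, 30, 34, 1, 5, 90],
})

# Wide to long; wide_to_long parses the numeric suffix into an integer "year".
long = pd.wide_to_long(
    df, stubnames="points", i=["month", "team"], j="year"
).reset_index()

# Select the full row holding the lowest points per (month, year), so team is kept.
result = long.loc[
    long.groupby(["month", "year"], sort=False)["points"].idxmin()
].reset_index(drop=True)
print(result)
```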
Another option is to do the groupby first and then convert to long format (there are fewer rows to reshape that way):
temp = df.groupby("month").min()
temp = temp.set_index('team', append = True)
temp.columns = temp.columns.str.split(r"(\d+)", expand = True).droplevel(-1)
temp.columns.names = [None, 'year']
temp.stack().reset_index()
month team year points
0 1 team1 2018 5
1 1 team1 2019 9
2 1 team1 2020 8
3 2 team1 2018 2
4 2 team1 2019 40
5 2 team1 2020 20
6 3 team1 2018 17
7 3 team1 2019 14
8 3 team1 2020 12
9 4 team1 2018 1
10 4 team1 2019 9
11 4 team1 2020 8
12 5 team1 2018 1
13 5 team1 2019 3
14 5 team1 2020 2
15 6 team1 2018 60
16 6 team1 2019 18
17 6 team1 2020 30
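Note that team is team1 in every row of this output. groupby("month").min() reduces each column independently, so the team column holds the alphabetically smallest team name in the group rather than the team that owns the minimum points. A small sketch of the difference, assuming the sample data from the question and looking only at points2020:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6] * 2,
    "team": ["team1"] * 6 + ["team2"] * 6,
    "points2020": [50, 20, 12, 8, 2, 30, 8, 40, 25, 88, 23, 55],
    "points2019": [10, 40, 14, 9, 3, 18, 9, 70, 19, 70, 45, 77],
    "points2018": [5, 2, 17, 3, 1, 60, 10, 30, 34, 1, 5, 90],
})

# Column-wise minimum: "team" and "points2020" are minimised separately.
per_column_min = df.groupby("month").min()
print(per_column_min.loc[1, ["team", "points2020"]])

# Row-wise selection: the team that actually has the lowest points2020 per month.
actual = df.loc[
    df.groupby("month")["points2020"].idxmin(), ["month", "team", "points2020"]
].set_index("month")
print(actual.loc[1])
```

For month 1 the first lookup pairs team1 with 8 points, while the second shows that the 8 points belong to team2.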
The steps above can be abstracted with pivot_longer from pyjanitor:
# pip install pyjanitor
import pandas as pd
import janitor
(df
.groupby("month", as_index=False)
.min()
.pivot_longer(index = ["month", "team"],
names_to = (".value", "year"),
names_pattern = r"(\D+)(\d+)")
)
month team year points
0 1 team1 2020 8
1 2 team1 2020 20
2 3 team1 2020 12
3 4 team1 2020 8
4 5 team1 2020 2
5 6 team1 2020 30
6 1 team1 2019 9
7 2 team1 2019 40
8 3 team1 2019 14
9 4 team1 2019 9
10 5 team1 2019 3
11 6 team1 2019 18
12 1 team1 2018 5
13 2 team1 2018 2
14 3 team1 2018 17
15 4 team1 2018 1
16 5 team1 2018 1
17 6 team1 2018 60
