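I am learning data analysis in Python with Pandas.
I have a DataFrame of game sales that looks like this:
(The data is made up and only serves to illustrate the question.)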
Name                Year  Publisher  Total Sales
GTA V               2013  Rockstar        133000
Super Mario Bros    1985  Nintendo        430500
GTA VI              2025  Rockstar         86000
RDR 3               2025  Rockstar        129030
Super Mario Sister  1985  Nintendo        308900
Super Mario End     2000  Nintendo        112100
I then dropped the Name column and grouped by publisher with the following commands:
df.drop(columns='Name', inplace=True)
df.groupby(['Publisher','Year','Total Sales']).sum().reset_index()
The DataFrame now looks like this:
Publisher  Year  Total Sales
Nintendo   1985       308900
Nintendo   1985       430500
Nintendo   2000       112100
Rockstar   2013       133000
Rockstar   2025       129030
Rockstar   2025        86000
That is fine, but I want to sum the total sales for the same publisher in the same year.
I would like the DataFrame to look like this:
Publisher  Year  Total Sales
Nintendo   1985       739400
Nintendo   2000       112100
Rockstar   2013       133000
Rockstar   2025       215030
Is there a way to do this?
Here is the code for my df:
import pandas as pd

# Note: Year and Total Sales are stored as strings here
data = {'Name': ['GTA V', 'Super Mario Bros', 'GTA VI', 'RDR 3', 'Super Mario Sister', 'Super Mario End'],
        'Year': ['2013', '1985', '2025', '2025', '1985', '2000'],
        'Publisher': ['Rockstar', 'Nintendo', 'Rockstar', 'Rockstar', 'Nintendo', 'Nintendo'],
        'Total Sales': ['133000', '430500', '86000', '129030', '308900', '112100']}
df = pd.DataFrame(data)
df
Reply from a uj5u.com user:
Use pivot_table:
>>> df.pivot_table('Total Sales', ['Year', 'Publisher'], aggfunc='sum').reset_index()
Year Publisher Total Sales
0 1985 Nintendo 739400
1 2000 Nintendo 112100
2 2013 Rockstar 133000
3 2025 Rockstar 215030
Note: if the Total Sales column contains strings, convert it to int (or float) first:
>>> df.astype({'Total Sales': int}).pivot_table(...)
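The conversion matters because the sample data in the question stores Total Sales as strings, and strings cannot be added as numbers. As a minimal sketch, the elided call above could be filled in as follows; the variable names `converted` and `result` are only illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': ['2013', '1985', '2025', '2025', '1985', '2000'],
    'Publisher': ['Rockstar', 'Nintendo', 'Rockstar', 'Rockstar', 'Nintendo', 'Nintendo'],
    'Total Sales': ['133000', '430500', '86000', '129030', '308900', '112100'],
})

# Convert the string column to integers; astype returns a new DataFrame
converted = df.astype({'Total Sales': int})

# Sum Total Sales for each (Year, Publisher) pair, then flatten the index
result = converted.pivot_table(
    values='Total Sales',
    index=['Year', 'Publisher'],
    aggfunc='sum',
).reset_index()

print(result)
```

Year is left as a string here, which is fine for grouping; it only affects how the rows are sorted.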
Reply from a uj5u.com user:
import pandas as pd
data = {'Name':['GTA V','Super Mario Bros','GTA VI','RDR 3','Super Mario Sister','Super Mario End'],'Year':['2013','1985','2025','2025','1985','2000'],
'Publisher':['Rockstar','Nintendo','Rockstar','Rockstar','Nintendo','Nintendo'],'Total Sales':['133000','430500','86000','129030','308900','112100']}
df = pd.DataFrame(data)
df['Total Sales'] = df['Total Sales'].astype(int)
df.groupby(['Year', 'Publisher'])['Total Sales'].agg('sum').reset_index()
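A possible variation on the groupby above, sketched here as an assumption rather than as part of the original answer: passing `as_index=False` keeps the grouping keys as ordinary columns, so `reset_index()` is not needed, and listing Publisher first gives the column order asked for in the question. The name `totals` is illustrative.

```python
import pandas as pd

data = {
    'Name': ['GTA V', 'Super Mario Bros', 'GTA VI', 'RDR 3', 'Super Mario Sister', 'Super Mario End'],
    'Year': ['2013', '1985', '2025', '2025', '1985', '2000'],
    'Publisher': ['Rockstar', 'Nintendo', 'Rockstar', 'Rockstar', 'Nintendo', 'Nintendo'],
    'Total Sales': ['133000', '430500', '86000', '129030', '308900', '112100'],
}
df = pd.DataFrame(data)

# The sales figures arrive as strings, so make them numeric before summing
df['Total Sales'] = pd.to_numeric(df['Total Sales'])

# as_index=False returns Publisher and Year as regular columns
totals = df.groupby(['Publisher', 'Year'], as_index=False)['Total Sales'].sum()

print(totals)
```

Selecting `['Total Sales']` before aggregating also means the Name column does not have to be dropped first.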
Reply from a uj5u.com user:
Here is one way:
df.drop(columns='Name', inplace=True)
df['Total Sales'] = pd.to_numeric(df['Total Sales'])
df2 = df.groupby(['Publisher','Year']).sum().reset_index()
df2
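Compared with the code in the question, this approach converts Total Sales to numbers and leaves it out of the grouping keys. The second point is what makes the rows collapse: when Total Sales is itself a key, every distinct sales value forms its own group, so nothing is left to add up. A small sketch that checks this by counting groups (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Year': ['2013', '1985', '2025', '2025', '1985', '2000'],
    'Publisher': ['Rockstar', 'Nintendo', 'Rockstar', 'Rockstar', 'Nintendo', 'Nintendo'],
    'Total Sales': ['133000', '430500', '86000', '129030', '308900', '112100'],
})
df['Total Sales'] = pd.to_numeric(df['Total Sales'])

# Grouping on the value column as well: one group per distinct row
groups_with_sales = df.groupby(['Publisher', 'Year', 'Total Sales']).ngroups

# Grouping only on the keys: rows sharing publisher and year fall together
groups_without_sales = df.groupby(['Publisher', 'Year']).ngroups

print(groups_with_sales, groups_without_sales)  # prints: 6 4
```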
