I have to load data from a SQL table into Pandas and display the output. The data is a sales table:
cust prod day month year state quant
0 Bloom Pepsi 2 12 2017 NY 4232
1 Knuth Bread 23 5 2017 NJ 4167
2 Emily Pepsi 22 1 201 CT 4404
3 Emily Fruits 11 1 2010 NJ 4369
4 Helen Milk 7 11 2016 CT 210
I have to convert it into the average sales per customer for each state in 2017:
CUST AVG_NY AVG_CT AVG_NJ
Bloom 28923 3241 1873
Sam 4239 872 142
Here is my code:
import pandas as pd
import psycopg2 as pg
engine = pg.connect("dbname='postgres' user='postgres' host='127.0.0.1' port='8800' password='sh'")
df = pd.read_sql('select * from sales', con=engine)
df.drop("prod", axis=1, inplace=True)
df.drop("day", axis=1, inplace=True)
df.drop("month", axis=1, inplace=True)
df_main = df.loc[df.year == 2017]
#df.drop(df[df['state'] != 'NY'].index, inplace=True)
df2 = df_main.loc[df_main.state == 'NY']
df2.drop("year",axis=1,inplace=True)
NY = df2.groupby(['cust']).mean()
df3 = df_main.loc[df_main.state == 'CT']
df3.drop("year",axis=1,inplace=True)
CT = df3.groupby(['cust']).mean()
df4 = df_main.loc[df_main.state == 'NJ']
df4.drop("year",axis=1,inplace=True)
NJ = df4.groupby(['cust']).mean()
NY = NY.join(CT,how='left',lsuffix = 'NY', rsuffix = '_right')
NY = NY.join(NJ,how='left',lsuffix = 'NY', rsuffix = '_right')
print(NY)
This gives me output like:
quantNY quant_right quant
cust
Bloom 3201.500000 3261.0 2277.000000
Emily 2698.666667 1432.0 1826.666667
Helen 4909.000000 2485.5 2352.166667
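I've spotted one issue: I can rename the columns to match the output I need, but I'm not sure whether the following two lines are the correct way to join the dataframes: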
NY = NY.join(CT,how='left',lsuffix = 'NY', rsuffix = '_right')
NY = NY.join(NJ,how='left',lsuffix = 'NY', rsuffix = '_right')
Is there a better way to do this with Pandas?
A uj5u.com user replied:
Use pivot_table:
df.pivot_table(index=['year', 'cust'], columns='state',
               values='quant', aggfunc='mean').add_prefix('AVG_')
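The one-liner above keeps every year in the index. To get closer to the layout asked for in the question (one row per customer, 2017 only, columns in the order AVG_NY, AVG_CT, AVG_NJ), something like the following sketch may help. The sample rows and the helper name avg_by_state are made up for illustration and are not from the original post; with real data you would pass in the dataframe returned by pd.read_sql instead.

```python
import pandas as pd

# Hypothetical sample rows shaped like the sales table in the question.
sales = pd.DataFrame({
    "cust":  ["Bloom", "Bloom", "Bloom", "Knuth", "Emily", "Helen"],
    "prod":  ["Pepsi", "Milk", "Bread", "Bread", "Pepsi", "Milk"],
    "day":   [2, 5, 9, 23, 22, 7],
    "month": [12, 3, 7, 5, 1, 11],
    "year":  [2017, 2017, 2017, 2017, 2017, 2016],
    "state": ["NY", "NY", "CT", "NJ", "CT", "CT"],
    "quant": [4232, 1000, 300, 4167, 4404, 210],
})

def avg_by_state(df, year=2017, states=("NY", "CT", "NJ")):
    """Average quant per customer and state for one year (illustrative helper)."""
    # Keep only the requested year before pivoting.
    in_year = df[df["year"] == year]
    # One row per customer, one column per state, cells hold the mean quant.
    table = in_year.pivot_table(index="cust", columns="state",
                                values="quant", aggfunc="mean")
    # Fix the column order (states with no rows become all-NaN columns),
    # then prefix the state names with AVG_.
    table = table.reindex(columns=list(states)).add_prefix("AVG_")
    table.columns.name = None
    # Turn the customer index back into an ordinary column named CUST.
    return table.reset_index().rename(columns={"cust": "CUST"})

result = avg_by_state(sales)
print(result)
```

Customers with no sales in a given state get NaN in that column, which matches what the left joins in the question would produce. Unlike the left joins, though, a customer with no NY sales in 2017 still appears in the result.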
Tags: Python pandas PostgreSQL dataframe
