Suppose we have a data frame like the one below:
channel store units
Offline Bournemouth 62
Offline Kettering 90
Offline Manchester 145
Online Bournemouth 220
Online Kettering 212
Online Manchester 272
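My goal is to add two more columns: one with the total number of units sold per channel, and one with each store's share of the units within its channel. In short, the output I want looks like this: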
channel store units units_per_channel store_share
Offline Bournemouth 62 297 0.21
Offline Kettering 90 297 0.30
Offline Manchester 145 297 0.49
Online Bournemouth 220 704 0.31
Online Kettering 212 704 0.30
Online Manchester 272 704 0.39
Is there a simple and elegant way to get this?
Answer from a uj5u.com user:
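Do a .groupby() on channel and take the sum of units; using transform('sum') keeps the per-channel total aligned with the original rows. Then simply divide units by units_per_channel.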
import pandas as pd

df = pd.DataFrame(
    [['Offline', 'Bournemouth', 62],
     ['Offline', 'Kettering', 90],
     ['Offline', 'Manchester', 145],
     ['Online', 'Bournemouth', 220],
     ['Online', 'Kettering', 212],
     ['Online', 'Manchester', 272]],
    columns=['channel', 'store', 'units'],
)

# Total units per channel, repeated on every row of that channel
df['units_per_channel'] = df.groupby('channel')['units'].transform('sum')
# Each store's share of its channel's total
df['store_share'] = df['units'] / df['units_per_channel']
Output:
print(df)
channel store units units_per_channel store_share
0 Offline Bournemouth 62 297 0.208754
1 Offline Kettering 90 297 0.303030
2 Offline Manchester 145 297 0.488215
3 Online Bournemouth 220 704 0.312500
4 Online Kettering 212 704 0.301136
5 Online Manchester 272 704 0.386364
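The question shows store_share with two decimals, while the output above keeps full precision. A minimal sketch of one way to match the requested format, assuming that rounding the stored values (rather than only changing how they are displayed) is acceptable, is to call Series.round on the ratio. The dictionary-based construction of df is just a compact way of rebuilding the same sample data and is not part of the original answer.

```python
import pandas as pd

# Same sample data as in the question, rebuilt here so the snippet runs on its own
df = pd.DataFrame(
    {
        "channel": ["Offline"] * 3 + ["Online"] * 3,
        "store": ["Bournemouth", "Kettering", "Manchester"] * 2,
        "units": [62, 90, 145, 220, 212, 272],
    }
)

# Per-channel total, repeated on every row of that channel
df["units_per_channel"] = df.groupby("channel")["units"].transform("sum")

# Share of each store within its channel, rounded to two decimals
# (rounding the stored values is an assumption about what the asker wants)
df["store_share"] = (df["units"] / df["units_per_channel"]).round(2)

print(df)
```

If the full-precision values are needed later, it may be better to keep the unrounded column and round only when printing.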
