Analyzing a DataFrame with Python

I would like to compute the average "goal", "shot", and "miss" for each shooterName, for use in further analysis and visualization.

The code below gives me counts of the 3 values in the "event" column (SHOT, GOAL, MISS), grouped by "shooterName".

DataFrame columns:

season  period  time    teamCode    event   goal    xCord   yCord   xCordAdjusted   yCordAdjusted   ... playerPositionThatDidEvent  timeSinceFaceoff    playerNumThatDidEvent   shooterPlayerId shooterName shooterLeftRight    shooterTimeOnIce    shooterTimeOnIceSinceFaceoff    shotDistance

Corresponding data:

2020    1   16  PHI SHOT    0   -74 29  74  -29 ... C   16  11   8478439.0  Travis Konecny  R   16  16  32.649655
2020    1   34  PIT SHOT    0   49  -25 49  -25 ... C   34  9   8478542.0   Evan Rodrigues  R   34  34  47.169906
2020    1   65  PHI SHOT    0   -52 -31 52  31  ... L   65  86  8480797.0   Joel Farabee    L   31  31  48.270074
2020    1   171 PIT SHOT    0   43  39  43  39  ... C   42  9   8478542.0   Evan Rodrigues  R   42  42  60.307545   
2020    1   209 PHI MISS    0   -46 33  46  -33 ... D   38  5   8479026.0   Philippe Myers  R   38  38  54.203321

Current code:

dft['count'] = df.groupby(['shooterName', 'event'])['event'].agg(['count'])
dft

Current output:

shooterName event count
A.J. Greer  GOAL    1
            MISS    6
            SHOT    29
Aaron Downey    GOAL    1
                MISS    4
                SHOT    35

Zenon Konopka   GOAL    8
                MISS    57
                SHOT    176

Expected output:

shooterName event count %totalshooterNameevents
A.J. Greer  GOAL    1   .0277
            MISS    6   .1666
            SHOT    29  .805

Aaron Downey    GOAL    1 .025
                MISS    4 .1
                SHOT    35 .875

Zenon Konopka   GOAL    8 .0331
                MISS    57 .236
                SHOT    176 .7302

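Something like that. My end goal is to compute, for each "shooterName", the share of that shooter's total events taken by each "event" value. In the expected output I added a column "%totalshooterNameevents", which is simply each of the goal, shot, and miss counts divided by the sum of the goal, shot, and miss counts for that "shooterName".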

Reply from a uj5u.com user:

Update

Try:

dft = df.groupby(['shooterName', 'event'])['event'].agg(['count']).reset_index()
# transform keeps the original row index, so the result lines up with dft
dft['%total'] = dft.groupby('shooterName')['count'].transform(lambda x: x / x.sum())
print(dft)

# Output
     shooterName event  count    %total
0     A.J. Greer  GOAL      1  0.027778
1     A.J. Greer  MISS      6  0.166667
2     A.J. Greer  SHOT     29  0.805556
3   Aaron Downey  GOAL      1  0.025000
4   Aaron Downey  MISS      4  0.100000
5   Aaron Downey  SHOT     35  0.875000
6  Zenon Konopka  GOAL      8  0.033195
7  Zenon Konopka  MISS     57  0.236515
8  Zenon Konopka  SHOT    176  0.730290
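
A possible shortcut, sketched on a small made-up frame rather than the asker's data: calling `value_counts(normalize=True)` on the grouped "event" column returns each event's share within its shooter in one step. The sample rows and the name `share` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample rows, not taken from the question's data
df = pd.DataFrame({
    "shooterName": ["A.J. Greer"] * 4 + ["Aaron Downey"] * 2,
    "event": ["SHOT", "SHOT", "MISS", "GOAL", "SHOT", "GOAL"],
})

# Fraction of each shooter's events that falls into each event value
shares = (
    df.groupby("shooterName")["event"]
      .value_counts(normalize=True)
      .rename("share")  # fixed name, independent of the pandas version
)
print(shares)
```

The result is a Series indexed by (shooterName, event), so a single value can be read with `shares.loc[("A.J. Greer", "SHOT")]`.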

Without a sample, it is hard to guess what you want. Try:

import pandas as pd
import numpy as np

# Setup a Minimal Reproducible Example
np.random.seed(2021)
df = pd.DataFrame({'shooterName': np.random.choice(list('AB'), 20),
                   'event': np.random.choice(['shot', 'goal', 'miss'], 20)})

# Start from an empty frame indexed by shooter name
dft = pd.DataFrame(index=df['shooterName'].unique())

# Count events per shooter, then divide each event count by that total
grp = df.groupby('shooterName')
dft['count'] = grp.size()
dft = dft.join(grp['event'].value_counts().unstack('event')
                           .div(dft['count'], axis=0))

Output:

>>> dft
   count      goal   miss      shot
A     12  0.416667  0.250  0.333333
B      8  0.500000  0.375  0.125000
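
The same wide table of per-shooter shares can probably be built more directly with `pd.crosstab` and `normalize="index"`, which divides each row by its own total. This is a sketch on hypothetical sample rows; the name `share_table` is an assumption, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample rows, not taken from the question's data
df = pd.DataFrame({
    "shooterName": ["A.J. Greer"] * 4 + ["Aaron Downey"] * 2,
    "event": ["SHOT", "SHOT", "MISS", "GOAL", "SHOT", "GOAL"],
})

# One row per shooter, one column per event value; each row sums to 1
share_table = pd.crosstab(df["shooterName"], df["event"], normalize="index")
print(share_table.round(3))
```

Unlike the `unstack` approach, an event value a shooter never recorded shows up as 0 here rather than NaN, which may be more convenient for plotting.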

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qianduan/397426.html

Tags: python-3.x, pandas, dataframe, numpy, group-by
