I have a dataset like the one below (the real data has more rows than shown here).
Date Calls
0 2022-01-02, Sunday 482920
1 2022-01-01, Saturday 482920
2 2021-12-31, Friday 482920
3 2021-12-30, Thursday 482920
4 2021-12-29, Wednesday 519995
5 2021-12-28, Tuesday 482920
6 2021-12-27, Monday 519995
7 2021-12-26, Sunday 522273
8 2021-12-25, Saturday 508439
9 2021-12-24, Friday 456587
10 2021-12-23, Thursday 482920
11 2021-12-22, Wednesday 519995
12 2021-12-21, Tuesday 522273
13 2021-12-20, Monday 508439
14 2021-12-19, Sunday 456587
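I need the average for each day of the week that appears in the dataset. For example, I need the average of Calls across all "Sunday" rows. So I need another column, Avgerage_Calls, as shown below.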
Date Calls Avgerage_Calls
0 2022-01-02, Sunday 482920 487260.0
1 2022-01-01, Saturday 482920 495679.5
2 2021-12-31, Friday 482920 469753.5
3 2021-12-30, Thursday 482920 482920.0
4 2021-12-29, Wednesday 519995 519995.0
5 2021-12-28, Tuesday 482920 502596.5
6 2021-12-27, Monday 519995 514217.0
7 2021-12-26, Sunday 522273 487260.0
8 2021-12-25, Saturday 508439 495679.5
9 2021-12-24, Friday 456587 469753.5
10 2021-12-23, Thursday 482920 482920.0
11 2021-12-22, Wednesday 519995 519995.0
12 2021-12-21, Tuesday 522273 502596.5
13 2021-12-20, Monday 508439 514217.0
14 2021-12-19, Sunday 456587 487260.0
So far, I have used these steps to do this.
df_new = df[df['Date'].str.contains('Sunday', regex=False, case=False, na=False)]
x=df_new["Calls"].mean()
x
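This gives the average for a single day of the week. But there is probably a more direct way to get all of these averages without creating a separate DataFrame for each day. Can someone help me with this?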
Answer from a uj5u.com user:
Here is a solution:
import pandas as pd

df = pd.DataFrame({'Date': ['2022-01-02, Sunday', '2022-01-01, Saturday', '2021-12-31, Friday', '2021-12-30, Thursday', '2021-12-29, Wednesday',
                            '2021-12-28, Tuesday', '2021-12-27, Monday', '2021-12-26, Sunday', '2021-12-25, Saturday', '2021-12-24, Friday', '2021-12-23, Thursday',
                            '2021-12-22, Wednesday', '2021-12-21, Tuesday', '2021-12-20, Monday', '2021-12-19, Sunday'],
                   'Calls': [482920, 482920, 482920, 482920, 519995, 482920, 519995, 522273, 508439, 456587, 482920, 519995, 522273, 508439, 456587]})
# Extract the weekday name that follows the comma, e.g. 'Sunday'
df['day'] = df['Date'].apply(lambda x: x.split(',')[1].strip())
# transform('mean') returns each group's mean aligned to the original rows
df['Avgerage_Calls'] = df.groupby(df['day'])['Calls'].transform('mean')
# Remove the helper column once the averages are in place
df.drop(columns=['day'], inplace=True)
Output:
Date Calls Avgerage_Calls
0 2022-01-02, Sunday 482920 487260.0
1 2022-01-01, Saturday 482920 495679.5
2 2021-12-31, Friday 482920 469753.5
3 2021-12-30, Thursday 482920 482920.0
4 2021-12-29, Wednesday 519995 519995.0
5 2021-12-28, Tuesday 482920 502596.5
6 2021-12-27, Monday 519995 514217.0
7 2021-12-26, Sunday 522273 487260.0
8 2021-12-25, Saturday 508439 495679.5
9 2021-12-24, Friday 456587 469753.5
10 2021-12-23, Thursday 482920 482920.0
11 2021-12-22, Wednesday 519995 519995.0
12 2021-12-21, Tuesday 522273 502596.5
13 2021-12-20, Monday 508439 514217.0
14 2021-12-19, Sunday 456587 487260.0
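I created a new column, day, containing the day of the week, and then computed the mean per day (using groupby). str.split(',') splits the date string. For example, if s = '2021-12-20, Monday', then s.split(',') gives ['2021-12-20', ' Monday']. str.strip() removes the leading and trailing whitespace, so ' Monday' becomes 'Monday'. transform('mean') returns one value per original row, which is why the result can be assigned directly as a new column.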
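As a possible variation on the steps above, the weekday can be extracted with the vectorized .str accessor and passed to groupby as a standalone Series, which avoids creating and then dropping the helper column. This is only a sketch under the assumption that every Date value has the form 'YYYY-MM-DD, Weekday'; the variable name day is introduced here for illustration and is not a DataFrame column.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": [
        "2022-01-02, Sunday", "2022-01-01, Saturday", "2021-12-31, Friday",
        "2021-12-30, Thursday", "2021-12-29, Wednesday", "2021-12-28, Tuesday",
        "2021-12-27, Monday", "2021-12-26, Sunday", "2021-12-25, Saturday",
        "2021-12-24, Friday", "2021-12-23, Thursday", "2021-12-22, Wednesday",
        "2021-12-21, Tuesday", "2021-12-20, Monday", "2021-12-19, Sunday",
    ],
    "Calls": [
        482920, 482920, 482920, 482920, 519995, 482920, 519995, 522273,
        508439, 456587, 482920, 519995, 522273, 508439, 456587,
    ],
})

# Vectorized equivalent of x.split(',')[1].strip(); kept as a separate
# Series (assumed format: 'YYYY-MM-DD, Weekday') instead of a new column.
day = df["Date"].str.split(",").str[1].str.strip()

# groupby accepts a Series aligned to df's index as the grouping key,
# so no temporary column has to be added and dropped afterwards.
df["Avgerage_Calls"] = df.groupby(day)["Calls"].transform("mean")

print(df)
```

If the per-day averages are wanted as a small summary table rather than as a column repeated on every row, df.groupby(day)["Calls"].mean() gives one value per weekday.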
