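I have a dataset that contains the traffic index of a location for a given date. For a given date, I want to compute the average of all traffic indices over the 30 days before that date, counting only the days in that 30-day window that are not holidays.
I want to do this calculation in Python. The screenshot below represents my requirement visually.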
Explanation of the screenshot
On April 1, 2019:
I want to calculate the 30-day non-holiday traffic index average
for a given location and map it to a new column with a similar name.
The column weekend_holiday is a boolean column that is true (1) for days that are public holidays or weekends.
Such entries must be ignored when computing a location's average traffic index.
Sample dataset link:
Please suggest a Python pandas technique to achieve this result.
Answer from a uj5u.com user:
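You can compute the rolling mean with pandas' rolling(), which accepts a time-based window such as '30D'.
The following code computes the required average for every row of the DataFrame: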
import pandas as pd

# Time-based rolling needs a sorted DatetimeIndex, so parse Date and make it the index
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date').sort_index()
# Turn weekend/holiday rows into NaN, then average each 30-day window (mean() skips NaN)
workdays = df[['New Delhi', 'Mumbai']].where(df['weekend_or_holiday'] == 0)
rolling_mean = workdays.rolling('30D').mean()
df['DELHI'] = rolling_mean['New Delhi']
df['MUMBAI'] = rolling_mean['Mumbai']
# Restore Date as a regular column
df = df.reset_index()
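To check the approach without the original dataset, here is a minimal self-contained sketch on synthetic data. The values, the 10-day range, and the DELHI_prev column name are made up for illustration, and the flag column is called weekend_or_holiday as in the code above (the question calls it weekend_holiday, so use whichever name your data has). Note that rolling('30D') by default covers the current day plus the 29 days before it. If "the 30 days before the given date" should exclude the current day, passing closed='left' is one way to do that; whether that matches the screenshot is an assumption.

```python
import pandas as pd

# Synthetic data (hypothetical values): 10 consecutive days starting Monday 2019-04-01
df = pd.DataFrame({
    "Date": pd.date_range("2019-04-01", periods=10, freq="D"),
    "New Delhi": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0],
})
# Flag Saturdays and Sundays (dayofweek 5 and 6) as 1; no public holidays in this toy data
df["weekend_or_holiday"] = (df["Date"].dt.dayofweek >= 5).astype(int)

df = df.set_index("Date").sort_index()

# Keep working-day values, turn weekend/holiday values into NaN
workdays = df["New Delhi"].where(df["weekend_or_holiday"] == 0)

# Default window (t - 30D, t]: includes the current day, as in the code above
df["DELHI"] = workdays.rolling("30D").mean()

# closed="left" gives [t - 30D, t): only days strictly before the current one
df["DELHI_prev"] = workdays.rolling("30D", closed="left").mean()

df = df.reset_index()
print(df)
```

In this sketch the weekend rows still receive a value in DELHI: their own traffic index is ignored, but the average of the working days inside their window is reported. The first row of DELHI_prev is NaN because there are no earlier days to average.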
