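I am currently working with a DataFrame called df. First it needs to be filtered into a new DataFrame, df1, based on the date and num_posts columns. The next step is to set the filtered num_posts values to 10 and increment the date column of df1 by one month.
The logic for filtering df into df1 is:
if date == today & num_posts == 4.
The last step, after updating the selected columns, is to update df from df1 with this line of code: df.update(df1)
What is working: I can filter df into df1 using the logic above and update the date and num_posts columns in df1.
What is not working: when I try to update df from df1, only the date column is updated; the num_posts values are not.
My code: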
import pandas as pd
import datetime
df = pd.DataFrame({'num_posts': [5, 4, 4, 4, 1, 14],
'date': ['2022-06-10', '2022-06-14',
'2022-06-14', '2020-09-12',
'2020-09-29', '2020-10-15'],
'user': ['user4', 'user1', 'user1', 'user3', 'user4', 'user4']})
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')
# Logic one start
# Get current date
new_date = datetime.datetime.now()
current_date = new_date.strftime("%Y-%m-%d")
# filter posts that equal 4 and date equal today
df1 = df.loc[(df['num_posts'] == 4) & (df['date'] == current_date)].copy()
# # overwrite the num_posts column with 10
df1.loc[df1['num_posts'] == 4, 'num_posts'] = 10
# df1.replace({'num_posts': {4: 10}}, inplace=True)
# Increment date by one month
plus_month_period = 1
df1 = df1['date'] + pd.DateOffset(months=plus_month_period)
df1
# updating the old dataframe
df.update(df1)
df
When I run my code I get the following output, which is not what I expect.
   num_posts       date   user
0          5 2022-06-10  user4
1          4 2022-07-14  user1
2          4 2022-07-14  user1
3          4 2020-09-12  user3
4          1 2020-09-29  user4
5         14 2020-10-15  user4
The output I expect after running the code above (edited by hand here):
   num_posts       date   user
0          5 2022-06-10  user4
1         10 2022-07-14  user1
2         10 2022-07-14  user1
3          4 2020-09-12  user3
4          1 2020-09-29  user4
5         14 2020-10-15  user4
What am I doing wrong?
Reply from a uj5u.com user:
Assign the result back to the date column instead of to df1:
df1['date'] = df1['date'] + pd.DateOffset(months=plus_month_period)
Or:
df1['date'] += pd.DateOffset(months=plus_month_period)
df.update(df1)
print (df)
   num_posts       date   user
0        5.0 2022-06-10  user4
1       10.0 2022-07-14  user1
2       10.0 2022-07-14  user1
3        4.0 2020-09-12  user3
4        1.0 2020-09-29  user4
5       14.0 2020-10-15  user4
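The reason only date was updated in the question appears to be that `df1 = df1['date'] + ...` rebinds the name df1 to a Series called date, so df.update(df1) has just that one column to align on. The following is a minimal sketch on a three-row subset of the data; the name `as_series` is illustrative and not from the original post. The final `astype(int)` is an optional extra step, assuming you want integer num_posts back in case update leaves floats, as in the output above.

```python
import pandas as pd

# Three-row subset of the question's data, for illustration only
df = pd.DataFrame({
    'num_posts': [5, 4, 4],
    'date': pd.to_datetime(['2022-06-10', '2022-06-14', '2022-06-14']),
})

df1 = df.loc[df['num_posts'] == 4].copy()
df1['num_posts'] = 10

# Form used in the question: the right-hand side is a Series, so binding
# it to the name df1 would throw away the num_posts column
as_series = df1['date'] + pd.DateOffset(months=1)
print(type(as_series).__name__, as_series.name)  # Series date

# Assigning to the column keeps df1 a DataFrame with both columns
df1['date'] = as_series
df.update(df1)

# Optional: update may leave num_posts as float, so cast it back
df['num_posts'] = df['num_posts'].astype(int)
print(df)
```

Because df.update accepts a named Series and aligns it by name, the mistake in the question raises no error, which is why it is easy to miss.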
An alternative solution is to assign to the original DataFrame directly, using a boolean mask and DataFrame.loc:
new_date = datetime.datetime.now()
current_date = new_date.strftime("%Y-%m-%d")
plus_month_period = 1
m = (df['num_posts'] == 4) & (df['date'] == current_date)
df.loc[m, 'num_posts'] = 10
df.loc[m, 'date'] += pd.DateOffset(months=plus_month_period)
print (df)
   num_posts       date   user
0          5 2022-06-10  user4
1         10 2022-07-14  user1
2         10 2022-07-14  user1
3          4 2020-09-12  user3
4          1 2020-09-29  user4
5         14 2020-10-15  user4
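Note that the mask compares date with the current day, so the outputs shown here only reproduce when the code runs on 2022-06-14. The sketch below wraps the mask approach in a function that takes the reference day as an argument; `bump_matching_posts` and its `today` and `months` parameters are hypothetical names introduced for illustration, not part of pandas or of the original answer.

```python
import pandas as pd

def bump_matching_posts(df, today, months=1):
    """Hypothetical helper: for rows dated `today` with num_posts == 4,
    set num_posts to 10 and move date forward by `months`."""
    today = pd.Timestamp(today).normalize()
    m = (df['num_posts'] == 4) & (df['date'].dt.normalize() == today)
    out = df.copy()  # leave the caller's DataFrame untouched
    out.loc[m, 'num_posts'] = 10
    out.loc[m, 'date'] = out.loc[m, 'date'] + pd.DateOffset(months=months)
    return out

df = pd.DataFrame({'num_posts': [5, 4, 4, 4, 1, 14],
                   'date': ['2022-06-10', '2022-06-14',
                            '2022-06-14', '2020-09-12',
                            '2020-09-29', '2020-10-15'],
                   'user': ['user4', 'user1', 'user1', 'user3', 'user4', 'user4']})
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')

# Pin "today" to the date used in the question so the result is repeatable;
# in real use this could be pd.Timestamp.now()
result = bump_matching_posts(df, today='2022-06-14')
print(result)
```

Comparing normalized timestamps rather than a formatted string is a design choice here, assumed to be safer if the date column ever carries a time of day.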
