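I need to assign a given datetime value to the rows of the timestamp column that hold NaT, whenever a condition on another column is met. All values in timestamp are either datetime64[ns] or NaT.
Edit:
Sample data: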
import numpy as np
import pandas as pd

dates = [pd.to_datetime('2022-10-14 10:13:52', format = "%Y-%m-%d %H:%M:%S"),
         pd.to_datetime('2022-10-14 17:43:52', format = "%Y-%m-%d %H:%M:%S"),
         pd.to_datetime('2022-10-14 09:00:10', format = "%Y-%m-%d %H:%M:%S")]
data = {'A': [-0.5, -0.5, 0.7, 1, 0.65, 0.5], 'timestamp': pd.Series(dates, index=[1, 3, 5])}
df = pd.DataFrame(data = data, index=[0, 1, 2, 3, 4, 5])
Output:
      A           timestamp
0 -0.50                 NaT
1 -0.50 2022-10-14 10:13:52
2  0.70                 NaT
3  1.00 2022-10-14 17:43:52
4  0.65                 NaT
5  0.50 2022-10-14 09:00:10
Then I do the following:
threshold = 0.65
null_date = pd.to_datetime('2022-09-01 09:00:00', format = "%Y-%m-%d %H:%M:%S")
df.timestamp = np.where(df.A >= threshold, null_date, df.timestamp)
However, this converts every value in timestamp to object dtype:
      A            timestamp
0 -0.50                 None
1 -0.50  1665742432000000000
2  0.70  2022-09-01 09:00:00
3  1.00  2022-09-01 09:00:00
4  0.65  2022-09-01 09:00:00
5  0.50  1665738010000000000
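That is, in the rows that do not meet the condition, NaT is replaced with None, and the existing datetimes in those rows are replaced as well (they show up as integer nanosecond values). Only the rows that meet the condition end up with a datetime.
Does anyone have a suggestion for how to replace NaT with a given datetime based on a condition?
Edit 2:
I solved it with a lambda function: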
df.timestamp = df[['A', 'timestamp']].apply(lambda x: null_date if x['A'] >= threshold else x['timestamp'], axis=1)
Output:
      A           timestamp
0 -0.50                 NaT
1 -0.50 2022-10-14 10:13:52
2  0.70 2022-09-01 09:00:00
3  1.00 2022-09-01 09:00:00
4  0.65 2022-09-01 09:00:00
5  0.50 2022-10-14 09:00:10
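The row-wise apply above can probably be expressed without a lambda. The following is a minimal sketch, assuming the same sample data as in the question (rebuilt here in a shorter form so the snippet runs on its own); it uses Series.mask, which replaces values where the condition is True and leaves the rest untouched, so the column should stay a datetime column.

```python
import pandas as pd

# Same values as the sample data in the question, built in one step.
df = pd.DataFrame({
    "A": [-0.5, -0.5, 0.7, 1, 0.65, 0.5],
    "timestamp": pd.to_datetime([
        None, "2022-10-14 10:13:52",
        None, "2022-10-14 17:43:52",
        None, "2022-10-14 09:00:10",
    ]),
})

threshold = 0.65
null_date = pd.Timestamp("2022-09-01 09:00:00")

# Replace the timestamp wherever A >= threshold; keep it otherwise.
df["timestamp"] = df["timestamp"].mask(df["A"] >= threshold, null_date)

print(df)
print(df.dtypes)
```

Like the lambda version, this overwrites every row where A >= threshold, including row 3, which already had a timestamp.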
Answer from a uj5u.com user:
The type of null_date does not match the type of the values in df['timestamp']. Both must be datetime64. Use this:
threshold = 0.65
null_date = pd.to_datetime('2022-09-01 09:00:00', format = "%Y-%m-%d %H:%M:%S")
null_date = np.datetime64(null_date)
df['timestamp'] = np.where(df['A'] >= threshold, null_date, df['timestamp'])
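To make the type mismatch visible, here is a small sketch on a two-element series (the names ts, cond, as_object and as_datetime are made up for this illustration). It compares the dtype np.where returns when the replacement is a pd.Timestamp with the dtype it returns when the replacement is a NumPy datetime64 scalar. Timestamp.to_datetime64() is used here as an alternative to np.datetime64(null_date); the two should be interchangeable for this purpose.

```python
import numpy as np
import pandas as pd

ts = pd.Series(pd.to_datetime(["2022-10-14 10:13:52", None]))
cond = np.array([False, True])
null_date = pd.Timestamp("2022-09-01 09:00:00")

# NumPy treats a pd.Timestamp scalar as a generic Python object,
# so the combined array falls back to object dtype.
as_object = np.where(cond, null_date, ts)

# A NumPy datetime64 scalar combines with the datetime64 column
# into a datetime64 result.
as_datetime = np.where(cond, null_date.to_datetime64(), ts)

print(as_object.dtype)
print(as_datetime.dtype)
```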
Answer from a uj5u.com user:
I think this will work:
import pandas as pd
dates = [pd.to_datetime('2022-10-14 10:13:52', format = "%Y-%m-%d %H:%M:%S"),
pd.to_datetime('2022-10-14 17:43:52', format = "%Y-%m-%d %H:%M:%S"),
pd.to_datetime('2022-10-14 09:00:10', format = "%Y-%m-%d %H:%M:%S")]
data = {'A': [-0.5, -0.5, 0.7, 1, 0.65, 0.5], 'timestamp': pd.Series(dates, index=[1, 3, 5])}
df = pd.DataFrame(data = data, index=[0, 1, 2, 3, 4, 5])
threshold = 0.65
null_date = pd.to_datetime('2022-09-01 09:00:00', format = "%Y-%m-%d %H:%M:%S")
#df.timestamp = np.where(df.A >= threshold, null_date, df.timestamp)
df.loc[df.A >= threshold, 'timestamp'] = null_date
>>> df
      A           timestamp
0 -0.50                 NaT
1 -0.50 2022-10-14 10:13:52
2  0.70 2022-09-01 09:00:00
3  1.00 2022-09-01 09:00:00
4  0.65 2022-09-01 09:00:00
5  0.50 2022-10-14 09:00:10
>>>
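Note that the assignment above also overwrites row 3, which already had a timestamp. If the intent is to fill only the rows that are NaT (as the question's wording suggests, though this is an assumption about what the asker wants), the boolean mask can be combined with isna(). A minimal sketch, with only_nat as a made-up name:

```python
import pandas as pd

dates = pd.Series(
    pd.to_datetime(["2022-10-14 10:13:52",
                    "2022-10-14 17:43:52",
                    "2022-10-14 09:00:10"]),
    index=[1, 3, 5],
)
df = pd.DataFrame(
    {"A": [-0.5, -0.5, 0.7, 1, 0.65, 0.5], "timestamp": dates},
    index=[0, 1, 2, 3, 4, 5],
)

threshold = 0.65
null_date = pd.Timestamp("2022-09-01 09:00:00")

# Select rows that meet the threshold AND currently hold NaT.
only_nat = (df["A"] >= threshold) & df["timestamp"].isna()
df.loc[only_nat, "timestamp"] = null_date

print(df)
```

With this mask, rows 2 and 4 receive null_date, row 3 keeps its original timestamp, and row 0 stays NaT because A is below the threshold.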
Tags: Python, pandas, datetime
