I have the following subset of data from a DataFrame.
{'NID': {131598: '215026851',
131599: '215026851',
131600: '215026851',
131601: '215026851',
131602: '215026851',
131603: '215026851',
131604: '215026851',
131605: '215026851',
131606: '215026851'},
'AbCode': {131598: 0,
131599: 0,
131600: 0,
131601: 0,
131602: 0,
131603: 1,
131604: 0,
131605: 0,
131606: 0},
'ABdat': {131598: Timestamp('2018-01-24 00:00:00'),
131599: Timestamp('2019-01-25 00:00:00'),
131600: NaT,
131601: Timestamp('2019-11-08 00:00:00'),
131602: Timestamp('2020-01-24 00:00:00'),
131603: Timestamp('2020-02-15 00:00:00'),
131604: Timestamp('2020-10-16 00:00:00'),
131605: Timestamp('2020-10-26 00:00:00'),
131606: NaT}}
Formatted, the data looks like this:
NID AbCode ABdat
131598 215026851 0 2018-01-24
131599 215026851 0 2019-01-25
131600 215026851 0 NaT
131601 215026851 0 2019-11-08
131602 215026851 0 2020-01-24
131603 215026851 1 2020-02-15
131604 215026851 0 2020-10-16
131605 215026851 0 2020-10-26
131606 215026851 0 NaT
For AbCode = 0, I want to replace ABdat with a missing value (NaT); for AbCode = 1, I want to replace ABdat with ABdat minus 7 days.
I wrote the following np.where code to do this.
breed_info['ABdat'] = np.where(breed_info.AbCode == 1, breed_info['ABdat'] - pd.DateOffset(days=7), breed_info['ABdat'].isnull)
The output looks like this:
NID AbCode ABdat
131598 215026851 0 <bound method Series.isnull of 49017 ...
131599 215026851 0 <bound method Series.isnull of 49017 ...
131600 215026851 0 <bound method Series.isnull of 49017 ...
131601 215026851 0 <bound method Series.isnull of 49017 ...
131602 215026851 0 <bound method Series.isnull of 49017 ...
131603 215026851 1 1581120000000000000
131604 215026851 0 <bound method Series.isnull of 49017 ...
131605 215026851 0 <bound method Series.isnull of 49017 ...
131606 215026851 0 <bound method Series.isnull of 49017 ...
Could you tell me why the date format changes and how to avoid it?
Thanks
Answer from a uj5u.com user:
The simplest option is a pandas-native solution using a pandas method such as Series.where:
# Outer parentheses let the chained call continue onto the next line.
breed_info['ABdat'] = ((breed_info['ABdat'] - pd.DateOffset(days=7))
                       .where(breed_info.AbCode == 1))
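As a self-contained sketch of the Series.where approach, the snippet below rebuilds a three-row stand-in for breed_info (the rows are a shortened, illustrative version of the question's data, not the full frame). Rows where the condition is False are left as missing, which for a datetime column means NaT.

```python
import pandas as pd

# Illustrative three-row stand-in for the question's breed_info frame.
breed_info = pd.DataFrame(
    {
        "NID": ["215026851"] * 3,
        "AbCode": [0, 1, 0],
        "ABdat": pd.to_datetime(["2020-01-24", "2020-02-15", None]),
    }
)

# Shift every date back 7 days, then keep only the rows where AbCode == 1.
# Series.where fills the remaining rows with NaT and keeps the datetime dtype.
shifted = breed_info["ABdat"] - pd.DateOffset(days=7)
breed_info["ABdat"] = shifted.where(breed_info["AbCode"] == 1)

print(breed_info)
print(breed_info["ABdat"].dtype)
```

Because the whole operation stays inside pandas, the column never passes through a plain NumPy array, so the dates are not turned into integers.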
A hackier solution uses np.where with a helper Series:
breed_info['ABdat'] = np.where(breed_info.AbCode == 1,
breed_info['ABdat'] - pd.DateOffset(days=7),
pd.Series(pd.NaT, index=breed_info.index))
print (breed_info)
NID AbCode ABdat
131598 215026851 0 NaT
131599 215026851 0 NaT
131600 215026851 0 NaT
131601 215026851 0 NaT
131602 215026851 0 NaT
131603 215026851 1 2020-02-08
131604 215026851 0 NaT
131605 215026851 0 NaT
131606 215026851 0 NaT
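The helper Series is needed because if you pass pd.NaT directly, np.where returns the values of the underlying NumPy array (integers, in nanoseconds):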
breed_info['ABdat'] = np.where(breed_info.AbCode == 1,
breed_info['ABdat'] - pd.DateOffset(days=7),
pd.NaT)
print (breed_info)
NID AbCode ABdat
131598 215026851 0 NaT
131599 215026851 0 NaT
131600 215026851 0 NaT
131601 215026851 0 NaT
131602 215026851 0 NaT
131603 215026851 1 1581120000000000000
131604 215026851 0 NaT
131605 215026851 0 NaT
131606 215026851 0 NaT
I think the cause is a bug.
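One way to investigate is to compare the dtype of the array np.where returns in each case. The sketch below uses two made-up rows (dates and is_ab are hypothetical names, not from the question). As far as I can tell, mixing a datetime column with the scalar pd.NaT makes NumPy fall back to an object array, whereas two datetime inputs keep a datetime64 result. Similarly, the "bound method" text in the question's output appears to come from passing breed_info['ABdat'].isnull without calling it.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row example, not the question's data.
dates = pd.Series(pd.to_datetime(["2020-01-24", "2020-02-15"]))
is_ab = pd.Series([False, True])

# Datetime values mixed with the scalar pd.NaT.
mixed = np.where(is_ab, dates, pd.NaT)

# Datetime values paired with a helper Series of NaT, as in the answer.
matched = np.where(is_ab, dates, pd.Series(pd.NaT, index=dates.index))

print(mixed.dtype)    # dtype of the mixed-input result
print(matched.dtype)  # dtype when both inputs are datetime-like
```

If the first dtype prints as object, the integers in the output are likely the result of NumPy converting nanosecond datetimes into plain Python objects, rather than of anything pandas does when the column is assigned.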
