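I have a column that mixes timestamp formats, as shown below, and I want to convert it to readable dates. Because the formats are mixed (some values are date strings, others are epoch integers), some values convert correctly while others default to 1970. Is there a way to convert them all together, or to convert everything to Unix timestamps first and then to readable dates, so that the result is uniform?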
import pandas as pd

data = ["2022-04-14 17:31:03.023", "2022-04-20 12:49:50.295", 1647597943249, 1647519101441, "2022-03-19 18:10:59.024"]
df = pd.DataFrame(data, columns=['date'])
df['newdate'] = pd.to_datetime(df['date'], unit='ns')
df
                      date                        newdate
0  2022-04-14 17:31:03.023     2022-04-14 17:31:03.023000
1  2022-04-20 12:49:50.295     2022-04-20 12:49:50.295000
2            1647597943249  1970-01-01 00:27:27.597943249
3            1647519101441  1970-01-01 00:27:27.519101441
4  2022-03-19 18:10:59.024     2022-03-19 18:10:59.024000
If I change the unit to 'ms', I get:
ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()
ValueError: could not convert string to float: '2022-04-14 17:31:03.023'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
4 frames
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()
ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'
Answer from a uj5u.com user:
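The idea is to parse the column with unit='ms' and errors='coerce', so that any value that cannot be converted to a datetime becomes a missing value (NaT), and then fill those gaps with Series.fillna, using a second pass that parses the date strings: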
df['newdate'] = (pd.to_datetime(df['date'], unit='ms', errors='coerce')
                   .fillna(pd.to_datetime(df['date'], errors='coerce')))
print(df)
                      date                 newdate
0  2022-04-14 17:31:03.023 2022-04-14 17:31:03.023
1  2022-04-20 12:49:50.295 2022-04-20 12:49:50.295
2            1647597943249 2022-03-18 10:05:43.249
3            1647519101441 2022-03-17 12:11:41.441
4  2022-03-19 18:10:59.024 2022-03-19 18:10:59.024
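How a mixed column is handled when a unit is passed may differ between pandas versions, so a variant that separates the two kinds of values explicitly could be more predictable. The following is only a sketch, not part of the original answer. It assumes every numeric entry is an epoch timestamp in milliseconds, and the names as_number, is_epoch, epoch_part and text_part are made up for illustration:

```python
import pandas as pd

data = ["2022-04-14 17:31:03.023", "2022-04-20 12:49:50.295",
        1647597943249, 1647519101441, "2022-03-19 18:10:59.024"]
df = pd.DataFrame(data, columns=["date"])

# Numeric entries survive as numbers; date strings become NaN.
as_number = pd.to_numeric(df["date"], errors="coerce")
is_epoch = as_number.notna()

# Assumption: all numeric entries are milliseconds since the Unix epoch.
# Casting to int64 avoids floating-point rounding during the conversion.
epoch_part = pd.to_datetime(as_number[is_epoch].astype("int64"), unit="ms")

# The remaining entries are parsed as ordinary date strings.
text_part = pd.to_datetime(df.loc[~is_epoch, "date"], errors="coerce")

# Both parts keep their original index labels, so assignment realigns the rows.
df["newdate"] = pd.concat([epoch_part, text_part])
print(df)
```

Note that a string made up only of digits, such as "1647597943249", would also be treated as an epoch value here. If the column could contain timestamps in seconds as well as milliseconds, the numeric part would need a further split, for example by magnitude, which this sketch does not attempt.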
