I have a Pandas DataFrameGroupBy (df_groups), which I created by grouping another DataFrame containing a list of publications (df_pub) by the day/month/year of its index.
df_groups = df_pub.groupby(by=[df_pub.index.day, df_pub.index.month, df_pub.index.year], sort=False)
Then I want to check how many unique publications exist in each group, so I use:
n_unique_pub = df_groups.Title.nunique()
This is a Pandas Series with a MultiIndex that looks like this:
MultiIndex([( 1, 7, 2020),
( 2, 7, 2020),
( 3, 7, 2020),
( 4, 7, 2020),
( 5, 7, 2020),
( 6, 7, 2020),
( 7, 7, 2020),
( 8, 7, 2020),
( 9, 7, 2020),
(10, 7, 2020),
...
( 8, 11, 2021),
( 9, 11, 2021),
(10, 11, 2021),
(11, 11, 2021),
(12, 11, 2021),
(13, 11, 2021),
(14, 11, 2021),
(15, 11, 2021),
(16, 11, 2021),
(17, 11, 2021)],
names=['Date', 'Date', 'Date'], length=497)
I would like to convert this MultiIndex into a DatetimeIndex so that it looks like:
DatetimeIndex(['2020-07-01', '2020-07-02', '2020-07-03', '2020-07-04',
'2020-07-05', '2020-07-06', '2020-07-07', '2020-07-08',
'2020-07-09', '2020-07-10',
...
'2021-11-08', '2021-11-09', '2021-11-10', '2021-11-11',
'2021-11-12', '2021-11-13', '2021-11-14', '2021-11-15',
'2021-11-16', '2021-11-17'],
dtype='datetime64[ns]', name='Date', length=505, freq='D')
Is there a simple way to do this? I have tried several approaches so far, but none of them worked. For example, pd.to_datetime(n_unique_pub.index) raises an error: TypeError: <class 'tuple'> is not convertible to datetime.
uj5u.com user reply:
Use pd.to_datetime:
# mi is your MultiIndex instance, like mi = df.index
>>> pd.DatetimeIndex(pd.to_datetime(mi.rename(['day', 'month', 'year']).to_frame()))
DatetimeIndex(['2020-07-01', '2020-07-02', '2020-07-03', '2020-07-04',
'2020-07-05', '2020-07-06', '2020-07-07', '2020-07-08',
'2020-07-09', '2020-07-10', '2021-11-08', '2021-11-09',
'2021-11-10', '2021-11-11', '2021-11-12', '2021-11-13',
'2021-11-14', '2021-11-15', '2021-11-16', '2021-11-17'],
dtype='datetime64[ns]', freq=None)
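The one-liner above works because pd.to_datetime can assemble dates from a DataFrame whose columns are named year, month and day. Here is a minimal self-contained sketch of that step, using a small made-up MultiIndex (the three tuples are illustrative, not the asker's data):

```python
import pandas as pd

# Illustrative stand-in for the (day, month, year) MultiIndex in the question.
mi = pd.MultiIndex.from_tuples(
    [(1, 7, 2020), (2, 7, 2020), (17, 11, 2021)],
    names=["Date", "Date", "Date"],
)

# Give the levels the names pd.to_datetime looks for, then turn them into columns.
parts = mi.rename(["day", "month", "year"]).to_frame(index=False)

# pd.to_datetime assembles one timestamp per row from the year/month/day columns.
dt_index = pd.DatetimeIndex(pd.to_datetime(parts), name="Date")
print(dt_index)
```

Renaming the levels first matters for two reasons: the original names are all 'Date', and pd.to_datetime identifies the components by column name.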
To replace the MultiIndex with a DatetimeIndex:
idx = pd.to_datetime(df.index.rename(['day', 'month', 'year']).to_frame())
df = df.set_index(idx)
print(df)
# Output:
A
2020-07-01 0.961038
2020-07-02 0.098132
2020-07-03 0.406996
2020-07-04 0.008376
2020-07-05 0.568059
2020-07-06 0.576610
2020-07-07 0.137144
2020-07-08 0.672219
2020-07-09 0.142874
2020-07-10 0.509231
2021-11-08 0.368762
2021-11-09 0.249107
2021-11-10 0.136282
2021-11-11 0.119291
2021-11-12 0.052388
2021-11-13 0.434899
2021-11-14 0.770705
2021-11-15 0.850914
2021-11-16 0.621283
2021-11-17 0.379888
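The desired output in the question has freq='D' and more entries (505) than the MultiIndex (497), which suggests, though the question does not say so, that days without publications should also appear. A rough sketch under that assumption, using a made-up three-row Series in place of n_unique_pub:

```python
import pandas as pd

# Hypothetical stand-in for n_unique_pub: counts keyed by (day, month, year).
mi = pd.MultiIndex.from_tuples(
    [(1, 7, 2020), (2, 7, 2020), (4, 7, 2020)],
    names=["Date", "Date", "Date"],
)
n_unique_pub = pd.Series([3, 1, 2], index=mi, name="Title")

# Swap the MultiIndex for a DatetimeIndex, as in the answer above.
parts = n_unique_pub.index.rename(["day", "month", "year"]).to_frame(index=False)
n_unique_pub.index = pd.DatetimeIndex(pd.to_datetime(parts), name="Date")

# Assumption: missing days should be present with a count of 0.
daily = n_unique_pub.sort_index().asfreq("D", fill_value=0)
print(daily)
```

After asfreq the index carries a daily frequency, and 2020-07-03 (absent from the input) appears with a count of 0.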
uj5u.com user reply:
You can first convert to the 'YYYY-MM-DD' format:
idx = pd.MultiIndex.from_tuples(
[( 1, 7, 2020),
( 2, 7, 2020),]
)
pd.to_datetime(idx.map(lambda x: '-'.join(map(str, reversed(x)))))
Output:
DatetimeIndex(['2020-07-01', '2020-07-02'], dtype='datetime64[ns]', freq=None)
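A variation on the same idea that skips the string round trip is to build each pd.Timestamp directly from the tuple. This is only a sketch with the same two illustrative tuples, and it may be slower than the vectorized approaches on large indexes:

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([(1, 7, 2020), (2, 7, 2020)])

# Each element of the MultiIndex is a (day, month, year) tuple.
dt_index = pd.DatetimeIndex(
    [pd.Timestamp(year=y, month=m, day=d) for d, m, y in idx]
)
print(dt_index)
```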
uj5u.com user reply:
Something like this should work:
dates = pd.to_datetime(n_unique_pub.rename_axis(['day', 'month', 'year']).reset_index()[['year', 'month', 'day']])
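If the grouping step itself can be changed, another option (not one proposed in the answers above) is to group by the calendar date of the index, so the result already has a DatetimeIndex. A minimal sketch, assuming df_pub has a DatetimeIndex and a Title column as in the question; the sample rows are made up:

```python
import pandas as pd

# Made-up publications; only the index type and the Title column mirror the question.
df_pub = pd.DataFrame(
    {"Title": ["a", "b", "a", "c"]},
    index=pd.DatetimeIndex(
        ["2020-07-01 09:00", "2020-07-01 15:30",
         "2020-07-02 08:00", "2020-07-02 11:00"],
        name="Date",
    ),
)

# normalize() sets every timestamp to midnight, so rows group by calendar day.
n_unique_pub = df_pub.groupby(df_pub.index.normalize(), sort=False)["Title"].nunique()
print(n_unique_pub.index)
```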
