我有一個像這樣的資料框:
timestamp open high low close volume trade_count vwap symbol
0 2021-10-28 13:30:00 00:00 8.32 8.37 8.310 8.355 16981 26 8.322444 BVN
1 2021-10-28 13:31:00 00:00 8.35 8.35 8.325 8.335 676 11 8.340991 BVN
2 2021-10-28 13:32:00 00:00 8.32 8.33 8.290 8.310 4458 22 8.306497 BVN
3 2021-10-28 13:33:00 00:00 8.34 8.40 8.330 8.400 13025 59 8.348660 BVN
4 2021-10-28 13:34:00 00:00 8.38 8.40 8.320 8.400 5405 29 8.353166 BVN
... ... ... ... ... ... ... ... ... ...
395837 2022-10-10 20:22:00 00:00 129.32 129.32 129.320 129.320 9630 3 129.319618 WMT
395838 2022-10-10 20:33:00 00:00 129.75 129.75 129.750 129.750 215 4 129.736047 WMT
395839 2022-10-10 20:46:00 00:00 129.13 129.13 129.130 129.130 100 1 129.130000 WMT
395840 2022-10-10 22:16:00 00:00 129.13 129.13 129.130 129.130 100 1 129.130000 WMT
395841 2022-10-10 23:41:00 00:00 129.93 129.93 129.930 129.930 200 1 129.930000 WMT
[395842 rows x 9 columns]
我將時間戳列設為日期時間 obj:
df['timestamp'] = pd.to_datetime(df['timestamp'])
然后按個別日期對分鐘柱進行分組:
seperate_days = df.groupby(['symbol', df['timestamp'].dt.date])
但是當我遍歷資料并列印索引時:
for name, ohlc in seperate_days:
ohlc.set_index('timestamp')
ohlc.index = pd.to_datetime(ohlc.index)
ohlc.index = ohlc.index.tz_localize(tz='America/New_York')
print(ohlc.index)
我得到的只是 1970 年代的日期:
DatetimeIndex(['1970-01-01 00:00:00.000171684-05:00',
'1970-01-01 00:00:00.000171685-05:00',
'1970-01-01 00:00:00.000171686-05:00',
'1970-01-01 00:00:00.000171687-05:00',
'1970-01-01 00:00:00.000171688-05:00',
'1970-01-01 00:00:00.000171689-05:00',
'1970-01-01 00:00:00.000171690-05:00',
'1970-01-01 00:00:00.000171691-05:00',
'1970-01-01 00:00:00.000171692-05:00',
'1970-01-01 00:00:00.000171693-05:00',
...
'1970-01-01 00:00:00.000172203-05:00',
'1970-01-01 00:00:00.000172204-05:00',
'1970-01-01 00:00:00.000172205-05:00',
'1970-01-01 00:00:00.000172206-05:00',
'1970-01-01 00:00:00.000172207-05:00',
'1970-01-01 00:00:00.000172208-05:00',
'1970-01-01 00:00:00.000172209-05:00',
'1970-01-01 00:00:00.000172210-05:00',
'1970-01-01 00:00:00.000172211-05:00',
'1970-01-01 00:00:00.000172212-05:00'],
dtype='datetime64[ns, America/New_York]', length=529, freq=None)
我怎樣才能讓日期正常作業?
uj5u.com熱心網友回復:
在您的示例代碼中, ohlc.index 是資料幀索引(即:0,1,2,...),因此您將獲得這些值的 pd.to_datetime() 。
uj5u.com熱心網友回復:
我是這樣做的:
df = pd.read_sql_query("SELECT * from ohlc_minutes", conn)
df['timestamp'] = pd.to_datetime(df['timestamp']) # re
startTime = datetime.datetime.now()
# group the minute bars by individual days
seperate_days = df.groupby(['symbol', df['timestamp'].dt.date])
for name, ohlc in seperate_days:
ohlc.index = ohlc['timestamp']
open_hours_indices = ohlc.index.indexer_between_time('04:00', '20:00')
open_hours = ohlc.index[open_hours_indices]
display(open_hours)
給予:
DatetimeIndex(['2022-10-07 10:18:00 00:00', '2022-10-07 11:34:00 00:00',
'2022-10-07 12:21:00 00:00', '2022-10-07 12:22:00 00:00',
'2022-10-07 12:30:00 00:00', '2022-10-07 12:35:00 00:00',
'2022-10-07 12:47:00 00:00', '2022-10-07 12:48:00 00:00',
'2022-10-07 12:49:00 00:00', '2022-10-07 12:50:00 00:00',
...
'2022-10-07 19:51:00 00:00', '2022-10-07 19:52:00 00:00',
'2022-10-07 19:53:00 00:00', '2022-10-07 19:54:00 00:00',
'2022-10-07 19:55:00 00:00', '2022-10-07 19:56:00 00:00',
'2022-10-07 19:57:00 00:00', '2022-10-07 19:58:00 00:00',
'2022-10-07 19:59:00 00:00', '2022-10-07 20:00:00 00:00'],
dtype='datetime64[ns, UTC]', name='timestamp', length=410, freq=None)
DatetimeIndex(['2022-10-10 11:00:00 00:00', '2022-10-10 11:43:00 00:00',
'2022-10-10 11:52:00 00:00', '2022-10-10 12:14:00 00:00',
'2022-10-10 13:20:00 00:00', '2022-10-10 13:23:00 00:00',
'2022-10-10 13:25:00 00:00', '2022-10-10 13:28:00 00:00',
'2022-10-10 13:30:00 00:00', '2022-10-10 13:31:00 00:00',
...
'2022-10-10 19:51:00 00:00', '2022-10-10 19:52:00 00:00',
'2022-10-10 19:53:00 00:00', '2022-10-10 19:54:00 00:00',
'2022-10-10 19:55:00 00:00', '2022-10-10 19:56:00 00:00',
'2022-10-10 19:57:00 00:00', '2022-10-10 19:58:00 00:00',
'2022-10-10 19:59:00 00:00', '2022-10-10 20:00:00 00:00'],
dtype='datetime64[ns, UTC]', name='timestamp', length=399, freq=None)
轉載請註明出處,本文鏈接:https://www.uj5u.com/net/521122.html
標籤:Python熊猫麻木的
上一篇:陣列乘法“IndexError:索引7超出軸0的范圍,大小為7”Python
下一篇:__rand__跨np陣列維度
