I am trying to analyze futures data from a data file downloaded from Ninja Trader. I imported the file in Python using the PyCharm IDE.
The text file is formatted like this:
20211031 220000 0000000;4608;4608;4608;1
20211031 220000 0000000;4608;4608;4608;4
20211031 220000 0000000;4608;4608;4608;1
20211031 220000 0000000;4608;4608;4608;1
This is tick data: timestamp, price, price, price, volume.
I import it as follows:
data = pd.read_csv(r"C:\__dir__test.txt",sep=';')
(I have removed the actual directory path.)
Then I assign the column names:
data.columns = ["date","P1","P2","P3","V"]
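One thing worth checking in the import step above: by default `pd.read_csv` treats the first line of the file as a header, so with a headerless tick file the first tick would be consumed as column labels and then overwritten by the `data.columns` assignment. A minimal sketch of an alternative, assuming the file really has no header row (the `io.StringIO` object stands in for the real file path and holds only the sample rows shown earlier):

```python
import io
import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for the real file path.
sample = io.StringIO(
    "20211031 220000 0000000;4608;4608;4608;1\n"
    "20211031 220000 0000000;4608;4608;4608;4\n"
    "20211031 220000 0000000;4608;4608;4608;1\n"
    "20211031 220000 0000000;4608;4608;4608;1\n"
)

# header=None keeps the first tick as data; names= assigns the column labels
# in the same call, so no separate data.columns assignment is needed.
data = pd.read_csv(
    sample,
    sep=";",
    header=None,
    names=["date", "P1", "P2", "P3", "V"],
)

print(len(data))            # number of rows read
print(data["V"].tolist())   # volumes, including the first line of the file
```

This only changes how the rows are read; the "date" column is still plain text at this point and still needs to be converted.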
Then I tried pd.to_datetime, which failed:
data["date"] = pd.to_datetime(data ["date"])
and
data['Dates'] = pd.to_datetime(data['date']).dt.date
data['Time'] = pd.to_datetime(data['date']).dt.time
Then I get this error:
File "C:\..dir..venv\lib\site-packages\pandas\core\arrays\datetimes.py", line 2187, in objects_to_datetime64ns
values, tz_parsed = conversion.datetime_to_datetime64(data.ravel("K"))
File "pandas\_libs\tslibs\conversion.pyx", line 359, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\Program Files\JetBrains\PyCharm Community Edition 2021.2.1\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2021.2.1\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/__dir__/Data from NT.py", line 10, in <module>
data["date"] = pd.to_datetime(data ["date"])
File "C:\__dir__\venv\lib\site-packages\pandas\core\tools\datetimes.py", line 883, in to_datetime
cache_array = _maybe_cache(arg, format, cache, convert_listlike)
File "C:\__dir__\venv\lib\site-packages\pandas\core\tools\datetimes.py", line 195, in _maybe_cache
cache_dates = convert_listlike(unique_dates, format)
File "C:\__dir__\venv\lib\site-packages\pandas\core\tools\datetimes.py", line 401, in _convert_listlike_datetimes
result, tz_parsed = objects_to_datetime64ns(
File "C:\__dir__\venv\lib\site-packages\pandas\core\arrays\datetimes.py", line 2193, in objects_to_datetime64ns
raise err
File "C:\__dir__\venv\lib\site-packages\pandas\core\arrays\datetimes.py", line 2175, in objects_to_datetime64ns
result, tz_parsed = tslib.array_to_datetime(
File "pandas\_libs\tslib.pyx", line 379, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 611, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 749, in pandas._libs.tslib._array_to_datetime_object
File "pandas\_libs\tslib.pyx", line 740, in pandas._libs.tslib._array_to_datetime_object
File "pandas\_libs\tslibs\parsing.pyx", line 257, in pandas._libs.tslibs.parsing.parse_datetime_string
File "C:\__dir__\venv\lib\site-packages\dateutil\parser\_parser.py", line 1368, in parse
return DEFAULTPARSER.parse(timestr, **kwargs)
File "C:\__dir__\venv\lib\site-packages\dateutil\parser\_parser.py", line 643, in parse
raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: 20211031 220000 0000000
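The goal is to split the date and time so that I can analyze specific time intervals (by slicing the data). I used the same setup on a dataset downloaded from Interactive Brokers with the ib_insync framework, and it worked fine.
A uj5u.com user replied:
The timestamp string has no separators that the default parser recognizes, so pass an explicit format to pd.to_datetime: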
import pandas as pd
df = pd.DataFrame({"date_raw":["20211031 220000 0000000","20211031 220000 0000000", "20211031 220000 0000000", "20211031 220000 0000000"]})
# parse date_raw to datetime field
df["date"] = pd.to_datetime(df["date_raw"],format="%Y%m%d %H%M%S %f")
# get Date and Time fields
df['Date'] = df['date'].dt.date
df['Time'] = df['date'].dt.time
print(df)
Result:
                  date_raw                date        Date      Time
0  20211031 220000 0000000 2021-10-31 22:00:00  2021-10-31  22:00:00
1  20211031 220000 0000000 2021-10-31 22:00:00  2021-10-31  22:00:00
2  20211031 220000 0000000 2021-10-31 22:00:00  2021-10-31  22:00:00
3  20211031 220000 0000000 2021-10-31 22:00:00  2021-10-31  22:00:00
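Since the stated goal is to slice the data by time interval, separate Date and Time columns may not be needed once the column is parsed. As a rough sketch, the parsed column can be used as the index and sliced directly. The timestamps and volumes below are made up for illustration (the sample in the question has only one timestamp), and the names `ticks`, `window` and `span` are hypothetical:

```python
import pandas as pd

# Hypothetical ticks in the question's format; times and volumes are
# invented for illustration only.
df = pd.DataFrame({
    "date_raw": [
        "20211031 220000 0000000",
        "20211031 221500 0000000",
        "20211031 223000 0000000",
        "20211031 230000 0000000",
    ],
    "V": [1, 4, 1, 1],
})

# Same explicit format as in the answer above.
df["date"] = pd.to_datetime(df["date_raw"], format="%Y%m%d %H%M%S %f")

# A sorted DatetimeIndex allows time-based selection.
ticks = df.set_index("date").sort_index()

# Time-of-day window, on any date (both endpoints included by default).
window = ticks.between_time("22:10", "22:45")

# Absolute datetime range via label slicing on the index.
span = ticks.loc["2021-10-31 22:00":"2021-10-31 22:30"]

print(window["V"].tolist())
print(span["V"].tolist())
```

`between_time` selects by time of day regardless of date, which suits recurring session windows; label slicing with `.loc` selects one specific date-time range.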
