我有一個非常簡單的 CSV,我試圖將其匯入 clickhouse,但沒有成功。創建表陳述句為:
CREATE TABLE staging.EloLBK
(
`Month` DateTime64(3),
`1958` Int32,
`1959` Int32,
`1960` Int32
)
ENGINE = MergeTree
PRIMARY KEY Month
ORDER BY Month
SETTINGS index_granularity = 8192
CSV 資料如下所示:
"Month", "1958", "1959", "1960"
"JAN", 340, 360, 417
"FEB", 318, 342, 391
"MAR", 362, 406, 419
"APR", 348, 396, 461
"MAY", 363, 420, 472
"JUN", 435, 472, 535
"JUL", 491, 548, 622
"AUG", 505, 559, 606
"SEP", 404, 463, 508
"OCT", 359, 407, 461
"NOV", 310, 362, 390
"DEC", 337, 405, 432
我的進口宣告是:
INSERT INTO EloLBK SELECT * FROM file('EloLBK/*.csv', 'CSVWithNames', '"Month" datetime64, "1958" integer, "1959" integer, "1960" integer')
從 clickhouse 回傳的錯誤是:
Code: 27. DB::Exception: Cannot parse input: expected '"' before: '417\n"FEB", 318, 342, 391\n"MAR", 362, 406, 419\n"APR", 348, 396, 461\n"MAY", 363, 420, 472\n"JUN", 435, 472, 535\n"JUL", 491, 548, 622\n"AUG", 505,':
Row 1:
Column 0, name: Month, type: DateTime64(3), parsed text: "<DOUBLE QUOTE>JAN<DOUBLE QUOTE>, 340, 360, "ERROR
Code: 27. DB::ParsingException: Cannot parse input: expected '"' before: '417\n"FEB", 318, 342, 391\n"MAR", 362, 406, 419\n"APR", 348, 396, 461\n"MAY", 363, 420, 472\n"JUN", 435, 472, 535\n"JUL", 491, 548, 622\n"AUG", 505,'. (CANNOT_PARSE_INPUT_ASSERTION_FAILED) (version 21.12.1.8928 (official build))
: While executing CSVRowInputFormat: While executing File. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)
我不確定如何解決這個問題,所以任何建議都將不勝感激!
uj5u.com熱心網友回復:
你可以通過傳遞 option 來告訴 ClickHouse 做一個盡力猜測date_time_input_format='best_effort',例如:
INSERT INTO EloLBK SELECT * FROM file('EloLBK/*.csv', 'CSVWithNames', '"Month" datetime64, "1958" integer, "1959" integer, "1960" integer') settings date_time_input_format='best_effort';
會導致:
┌───────────────────Month─┬─1958─┬─1959─┬─1960─┐
│ 2000-01-01 00:00:00.000 │ 340 │ 360 │ 417 │
│ 2000-02-01 00:00:00.000 │ 318 │ 342 │ 391 │
│ 2000-03-01 00:00:00.000 │ 362 │ 406 │ 419 │
│ 2000-04-01 00:00:00.000 │ 348 │ 396 │ 461 │
│ 2000-05-01 00:00:00.000 │ 363 │ 420 │ 472 │
│ 2000-06-01 00:00:00.000 │ 435 │ 472 │ 535 │
│ 2000-07-01 00:00:00.000 │ 491 │ 548 │ 622 │
│ 2000-08-01 00:00:00.000 │ 505 │ 559 │ 606 │
│ 2000-09-01 00:00:00.000 │ 404 │ 463 │ 508 │
│ 2000-10-01 00:00:00.000 │ 359 │ 407 │ 461 │
│ 2000-11-01 00:00:00.000 │ 310 │ 362 │ 390 │
│ 2000-12-01 00:00:00.000 │ 337 │ 405 │ 432 │
└─────────────────────────┴──────┴──────┴──────┘
uj5u.com熱心網友回復:
好的,經過一番混亂之后,錯誤訊息似乎有點誤導。實際問題是 clickhouse (可以理解)無法將月份決議為日期時間。
以下 CSV 輸入作業正常:
"Month","1958","1959","1960"
"1970-01-01T00:00:00","340","360","417"
"1970-01-02T00:00:00","318","342","391"
轉載請註明出處,本文鏈接:https://www.uj5u.com/net/461484.html
上一篇:用python清理csv檔案
