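I have a file, assets.csv, that is 172 MB with one million rows and 16 columns. I would like to read it using an offset (bytes, line, or record). In the code below, I am using the byte value.
I have stored the required positions (the byte() value of each record's position(), saved in assets_index.csv), and I would like to read a particular line from assets.csv using the saved offset.
I am able to get output, but I feel there must be a better way to read from a CSV file based on byte position.
Please advise. I am new to programming and also new to Rust, and I have learned a lot from tutorials.
assets.csv is in this format: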
asset_id,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation,year,depreciation
1000001,2015,10000,2016,10000,2017,10000,2018,10000,2019,10000,2020,10000,2021,10000,2022,10000,2023,10000,2024,10000,2025,10000,2026,10000,2027,10000,2028,10000,2029,10000
I use another function to obtain Position { byte: 172999933, line: 1000000, record: 999999 }.
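For readers who want to see where such byte offsets come from, here is a minimal sketch that uses only the standard library rather than the csv crate. The function name record_offsets is hypothetical, and the sketch assumes one record per line (no quoted fields containing newlines), which the csv crate's own position tracking does not require.

```rust
use std::io::{self, BufRead};

/// Returns the starting byte offset of every data record, skipping the header.
/// Assumption: each record occupies exactly one line.
fn record_offsets<R: BufRead>(mut reader: R) -> io::Result<Vec<u64>> {
    let mut offsets = Vec::new();
    let mut line = String::new();

    // Consume the header row; the first data record starts right after it.
    let mut pos = reader.read_line(&mut line)? as u64;

    loop {
        line.clear();
        // read_line returns the number of bytes read, including the newline.
        let n = reader.read_line(&mut line)?;
        if n == 0 {
            break; // end of input
        }
        offsets.push(pos);
        pos += n as u64;
    }
    Ok(offsets)
}
```

Passing a BufReader wrapped around the opened assets.csv should yield one offset per data row, which can then be written next to each asset_id.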
assets_index.csv is in this format:
asset_id,offset_inbytes
1999999,172999933
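An index in this two-column layout could be loaded into a map so that an asset_id resolves directly to its byte offset. The following is only a sketch using the standard library; load_index and invalid are hypothetical names, and it assumes both columns are plain unsigned integers with no quoting.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};

/// Converts an integer parse failure into an io::Error.
fn invalid(e: std::num::ParseIntError) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, e)
}

/// Parses `asset_id,offset_inbytes` rows into a map from asset id to byte offset.
fn load_index<R: BufRead>(reader: R) -> io::Result<HashMap<u64, u64>> {
    let mut index = HashMap::new();
    // skip(1) drops the header row.
    for line in reader.lines().skip(1) {
        let line = line?;
        // Lines without a comma (for example a trailing blank line) are ignored.
        if let Some((id, offset)) = line.trim().split_once(',') {
            let id = id.parse::<u64>().map_err(invalid)?;
            let offset = offset.parse::<u64>().map_err(invalid)?;
            index.insert(id, offset);
        }
    }
    Ok(index)
}
```

With the sample row above, looking up 1999999 in the resulting map would give 172999933.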
use std::{error::Error, io};

fn read_from_position() -> Result<(), Box<dyn Error>> {
    let asset_pos: u64 = 172_999_933;
    let file_path = "assets.csv";
    let mut rdr = csv::ReaderBuilder::new()
        .flexible(true)
        .from_path(file_path)?;
    let mut wtr = csv::Writer::from_writer(io::stdout());
    let mut record = csv::ByteRecord::new();
    // Linear scan: read every record until one starts at the saved byte offset.
    while rdr.read_byte_record(&mut record)? {
        let pos = record.position().expect("position of record");
        if pos.byte() == asset_pos {
            wtr.write_record(&record)?;
            break;
        }
    }
    wtr.flush()?;
    Ok(())
}
$ time ./target/release/testcsv
1999999,2015,10000,2016,10000,2017,10000,2018,10000,2019,10000,2020,10000,2021,10000,2022,10000,2023,10000,2024,10000,2025,10000,2026,10000,2027,10000,2028,10000,2029,10000
Time elapsed in readcsv() is: 239.290125ms
./target/release/testcsv 0.22s user 0.02s system 99% cpu 0.245 total
A uj5u.com user replied:
Instead of using from_path, you can use from_reader with a File and seek in that file before creating the csv::Reader:
use std::{error::Error, fs, io::{self, Seek}};

fn read_from_position() -> Result<(), Box<dyn Error>> {
    let asset_pos: u64 = 0x116; // offset to only record in example
    let file_path = "assets.csv";
    let mut f = fs::File::open(file_path)?;
    // Jump straight to the saved offset before handing the file to the reader.
    f.seek(io::SeekFrom::Start(asset_pos))?;
    let mut rdr = csv::ReaderBuilder::new()
        .flexible(true)
        // edit: as noted by @BurntSushi5 we have to disable headers here,
        // otherwise the first record after the seek is treated as a header row and skipped.
        .has_headers(false)
        .from_reader(f);
    let mut wtr = csv::Writer::from_writer(io::stdout());
    let mut record = csv::ByteRecord::new();
    // read_byte_record returns false at end of input; only write a record that was read.
    if rdr.read_byte_record(&mut record)? {
        wtr.write_record(&record)?;
    }
    wtr.flush()?;
    Ok(())
}
The first record read will then be the one you are looking for.
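The same seek-then-read idea can be tried without the csv crate. The sketch below is an illustration under stated assumptions, not a replacement for the code above: read_record_at is a hypothetical helper, it splits on plain commas, and it assumes the offset points at the start of a record that fits on one line with no quoted fields.

```rust
use std::io::{self, BufRead, BufReader, Read, Seek, SeekFrom};

/// Seeks to `offset` and returns the fields of the single line starting there.
/// Assumption: unquoted, comma-separated fields on one line.
fn read_record_at<R: Read + Seek>(mut source: R, offset: u64) -> io::Result<Vec<String>> {
    // Jump directly to the saved byte position; nothing before it is read.
    source.seek(SeekFrom::Start(offset))?;

    let mut line = String::new();
    BufReader::new(source).read_line(&mut line)?;

    // Strip the line terminator, then split into fields.
    let fields = line
        .trim_end_matches(|c| c == '\r' || c == '\n')
        .split(',')
        .map(str::to_owned)
        .collect();
    Ok(fields)
}
```

Any Read + Seek source works, so a std::fs::File opened on assets.csv can be passed in together with an offset taken from assets_index.csv. Because only one line is read after the seek, the cost should no longer grow with the position of the record in the file.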
