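As part of an assignment for the Google DA certificate, I am trying to find an elegant way to download, unzip, and merge multiple .csv files using R, but I keep running into the same problem:
error 1 in extracting from zip file
Data source: Divvy
The code I ran is: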
## declare variable file names corresponding to calendar months
months <- c(202011:202012,202101:202110)
## declare directory for storing source files
storage <- "C:\\Users\\...\\start"
## vectors of all urls to download from and destination files
urls <-
paste0("https://divvy-tripdata.s3.amazonaws.com/",months, "-divvy-tripdata.zip")
## idea was to download archives into temporary files, unzip contents to 'storage' directory and remove tempdir.
temp <- tempdir()
tempfile <- paste0(temp,"\\",months,".zip")
##Downloading 12 months archives
for(i in seq(urls)){
download.file(urls[i],tempfile[i], mode="wb")
}
file_names <- list.files(temp, pattern = ".zip")
for (i in seq(file_names)){
unzip(file_names,exdir=storage,overwrite = FALSE)}
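Warning in unzip("file_names", exdir = storage, overwrite = FALSE): error 1 in extracting from zip file
Everything works until the unzip step. All the files are downloaded, they can be opened, they are not corrupted, and their properties show a .zip extension.
I have tried my code in different directories on several machines, tried downloading the files manually, and tried unzipping the files one at a time with a loop and with ldply, but the result is always the same.
I have spent 3 days trying to solve this and would appreciate any help :)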
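Answer from a uj5u.com user:
With the same months and urls variables, the following looks simpler. Note the different way of building the temporary file names, with file.path.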
tmpdir <- tempdir()
tmpfile <- file.path(tmpdir, months)
tmpfile <- paste0(tmpfile, ".zip")
##Downloading 12 months archives
for(i in seq(urls)){
download.file(urls[i], tmpfile[i], mode="wb")
unzip(tmpfile[i], exdir = storage, overwrite = FALSE)
}
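## remove the downloaded archives; the session's tempdir() itself is left in place
## (unlink() does not delete a directory unless recursive = TRUE)
unlink(tmpfile)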
list.files(storage, pattern = "\\.csv")
# [1] "202011-divvy-tripdata.csv" "202012-divvy-tripdata.csv"
# [3] "202101-divvy-tripdata.csv" "202102-divvy-tripdata.csv"
# [5] "202103-divvy-tripdata.csv" "202104-divvy-tripdata.csv"
# [7] "202105-divvy-tripdata.csv" "202106-divvy-tripdata.csv"
# [9] "202107-divvy-tripdata.csv" "202108-divvy-tripdata.csv"
#[11] "202109-divvy-tripdata.csv" "202110-divvy-tripdata.csv"
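A likely reason the code in the question fails: list.files() returns bare file names by default, so unzip() looks for them in the current working directory instead of the temporary directory, and the loop also passes the whole file_names vector rather than file_names[i]. The sketch below illustrates the path issue only. The directory name zip_demo and the empty placeholder files are made up for the demonstration and are not real archives.

```r
# Hypothetical scratch directory holding two empty placeholder files
# (stand-ins for the downloaded archives; they are NOT valid zip files)
demo_dir <- file.path(tempdir(), "zip_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("202011.zip", "202012.zip")))

# Default: bare names, which are resolved against getwd(), not demo_dir
bare_names <- list.files(demo_dir, pattern = "\\.zip$")

# full.names = TRUE: paths that include the directory
full_paths <- list.files(demo_dir, pattern = "\\.zip$", full.names = TRUE)

print(bare_names)
print(file.exists(full_paths))

# With real archives, the unzip loop would then take one full path at a time:
# for (f in full_paths) unzip(f, exdir = storage, overwrite = FALSE)
```

Building the paths with file.path(), as in the answer, sidesteps the same problem because the full paths are kept from the start.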
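The question also asks about merging the extracted files, which the answer does not cover. A minimal base R sketch, assuming every monthly CSV has the same columns, could look like the following. The directory merge_demo and the two tiny tables are placeholders written here only so the example runs on its own; with the real data, the list.files() call would point at storage instead.

```r
# Hypothetical stand-in for the 'storage' directory, filled with two tiny
# CSV files whose contents are invented for illustration
merge_dir <- file.path(tempdir(), "merge_demo")
dir.create(merge_dir, showWarnings = FALSE)
write.csv(data.frame(ride_id = c("a1", "a2"), rideable_type = "docked_bike"),
          file.path(merge_dir, "202011-divvy-tripdata.csv"), row.names = FALSE)
write.csv(data.frame(ride_id = "b1", rideable_type = "classic_bike"),
          file.path(merge_dir, "202012-divvy-tripdata.csv"), row.names = FALSE)

# Full paths of all CSV files in the directory
csv_files <- list.files(merge_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every column as character so column types cannot clash between months
tables <- lapply(csv_files, read.csv, colClasses = "character")

# Stack the monthly tables; rbind() requires matching column names
all_trips <- do.call(rbind, tables)
print(nrow(all_trips))  # 3
```

Reading everything as character is a cautious choice for a first merge; columns can be converted to numbers or dates afterwards once the combined table has been inspected.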
