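I have a folder of csv files in R and need to loop over them, clean each one, and create columns based on information in the file name. I am trying to use purrr, and this is what I have done so far.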
# packages used below (lubridate supplies dmy())
library(tidyverse)
library(lubridate)
# get file names
files_names <- list.files("data/", recursive = TRUE, full.names = TRUE)
# inspect
files_names
[1] "data/BOC_All_ATMImage_(Aug 2020).txt" "data/BOC_All_ATMImage_(Aug 2021).txt" "data/BOC_All_ATMImage_(Feb 2021).txt"
[4] "data/BOC_All_ATMImage_(May 2021).txt" "data/BOC_All_ATMImage_(Nov 2020).txt" "data/BOC_All_ATMImage_(Nov 2021).txt"
# extract month/year inside brackets and convert to snakecase
# this will be used later to create column names
names_data <- files_names %>%
  str_extract("(?<=\\().*?(?=\\))") %>%
  str_to_lower() %>%
  str_replace(" ", "_")
names_data
[1] "aug_2020" "aug_2021" "feb_2021" "may_2021" "nov_2020" "nov_2021"
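As a quick check of the pattern above, here is a minimal sketch that runs the same three steps on two of the paths listed earlier, without touching the disk. The names `sample_paths` and `sample_names` are hypothetical and exist only for this illustration. The lookbehind `(?<=\\()` and lookahead `(?=\\))` keep only the text between the parentheses, and `str_replace()` replaces just the first space, which is enough for values like "Aug 2020".

```r
library(stringr)

# Hypothetical sample: two of the paths printed above
sample_paths <- c(
  "data/BOC_All_ATMImage_(Aug 2020).txt",
  "data/BOC_All_ATMImage_(Nov 2021).txt"
)

# Text between "(" and ")", lower-cased, first space replaced by "_"
sample_names <- str_replace(
  str_to_lower(str_extract(sample_paths, "(?<=\\().*?(?=\\))")),
  " ", "_"
)

sample_names
```

If a file name ever contained more than one space inside the parentheses, `str_replace_all()` would probably be the safer choice.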
Now loop over the csv files, read each one, do some data cleaning, and create the columns:
mc_data <-
  map(files_names,
      ~ read_csv(.x, guess_max = 50000) %>%
        janitor::clean_names() %>%
        mutate(month_year = str_extract(.x, "(?<=\\().*?(?=\\))"),
               date_dmy = paste0(day, "-", month_year),
               date = dmy(date_dmy),
               fsa = str_sub(postal_code, start = 1, end = 3),
               ?? = 1) %>%
        select(-date_dmy),
      .id = "group"
  )
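I need to mutate one more column, and that column must be named after the corresponding value extracted into names_data. It is currently marked as ?? in the pseudocode above. names_data follows the same order as the file paths, so the idea is to do everything in one loop and save each data set after cleaning.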
Answer from a uj5u.com user:
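We can use glue syntax for the column name together with map2, which walks over the file paths and the extracted names in parallel. Perhaps: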
mc_data <-
  map2(files_names, names_data,
       # .x is a file path, .y is the matching entry of names_data
       ~ read_csv(.x, guess_max = 50000) %>%
         janitor::clean_names() %>%
         mutate(month_year = str_extract(.x, "(?<=\\().*?(?=\\))"),
                date_dmy = paste0(day, "-", month_year),
                date = dmy(date_dmy),
                fsa = str_sub(postal_code, start = 1, end = 3),
                # glue syntax: the new column is named after the value of .y
                "{.y}" := 1) %>%
         select(-date_dmy),
       .id = "group"
  )
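To see the naming step in isolation, the following is a minimal sketch that applies the same glue-style name with `:=` inside `map2()` to two small in-memory tibbles instead of files. The objects `toy_data`, `toy_names`, and `toy_out` are made up for illustration and are not part of the question's data. The sketch assumes dplyr 1.0 or later, where glue strings are accepted on the left-hand side of `:=`.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the files read from disk
toy_data <- list(
  tibble(day = 1:2, postal_code = c("M5V 2T6", "K1A 0B1")),
  tibble(day = 3L, postal_code = "H2X 1Y4")
)
# Hypothetical stand-ins for the names extracted from the file names
toy_names <- c("aug_2020", "nov_2021")

# .x is one data frame, .y is the matching name;
# "{.y}" := 1 creates a column whose name is the value of .y
toy_out <- map2(
  toy_data, toy_names,
  ~ mutate(.x, "{.y}" := 1)
)

names(toy_out[[1]])
names(toy_out[[2]])
```

Each element of `toy_out` ends up with its own differently named column, which is why the result is kept as a list here. As far as I can tell, `map2()` itself has no `.id` argument, so `.id = "group"` is most likely passed on to the lambda and ignored; `map2_dfr()` does accept `.id` if a single row-bound data frame is wanted instead.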
