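I'm trying to use strsplit to split the ordered strings in two variables into separate rows of a dataset. Each ordered string is separated by a comma (,), but I'm a bit stuck and haven't found a similar question on SO.
I'm not sure my explanation is clear, so please see the following example data: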
df <- data.frame(suburb = c("yellow, blue", "orange, yellow", "blue", "green, yellow"), postcode = c("a9, b9", "c9, a9", "b9", "d9, a9"))
Ideally, I would end up with:
suburb postcode
yellow a9
blue b9
orange c9
yellow a9
blue b9
green d9
yellow a9
Reply from a uj5u.com user:
df <-
data.frame(
suburb = c("yellow, blue", "orange, yellow", "blue", "green, yellow"),
postcode = c("a9, b9", "c9, a9", "b9", "d9, a9")
)
library(data.table)
# split on a comma plus any following whitespace so no value keeps a leading space
setDT(df)[, lapply(.SD, function(x) unlist(strsplit(x, split = ",\\s*")))]
#> suburb postcode
#> 1: yellow a9
#> 2: blue b9
#> 3: orange c9
#> 4: yellow a9
#> 5: blue b9
#> 6: green d9
#> 7: yellow a9
Created on 2022-06-15 by the reprex package (v2.0.1)
Reply from a uj5u.com user:
tidyr::separate_rows(df, suburb, postcode)
# # A tibble: 7 × 2
# suburb postcode
# <chr> <chr>
# 1 yellow a9
# 2 blue b9
# 3 orange c9
# 4 yellow a9
# 5 blue b9
# 6 green d9
# 7 yellow a9
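As a hedged alternative to the call above: tidyr 1.3.0 and later mark separate_rows() as superseded in favour of separate_longer_delim(), which takes a literal delimiter instead of a regular expression. The sketch below assumes tidyr >= 1.3.0 is installed and that every value is separated by exactly a comma followed by one space, as in the example data.

```r
library(tidyr)  # assumption: tidyr >= 1.3.0, where separate_longer_delim() exists

# Example data from the question.
df <- data.frame(
  suburb   = c("yellow, blue", "orange, yellow", "blue", "green, yellow"),
  postcode = c("a9, b9", "c9, a9", "b9", "d9, a9")
)

# Split both columns in parallel on the literal delimiter ", ".
out <- separate_longer_delim(df, c(suburb, postcode), delim = ", ")
out
```

Because the delimiter includes the space, no separate whitespace clean-up step should be needed for data shaped like this.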
Reply from a uj5u.com user:
In base R, you can use strsplit, then unlist the result and combine the pieces into a data frame:
cbind.data.frame(
suburb = unlist(strsplit(df$suburb, ", ")),
postcode = unlist(strsplit(df$postcode, ", "))
)
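The base R approach above silently assumes that both columns split into the same number of pieces in every row. A minimal sketch of a more defensive version follows; split_rows() is a hypothetical helper name chosen for illustration, not a base R function, and the separator pattern is an assumption that tolerates any amount of whitespace after the comma.

```r
# Example data from the question.
df <- data.frame(
  suburb   = c("yellow, blue", "orange, yellow", "blue", "green, yellow"),
  postcode = c("a9, b9", "c9, a9", "b9", "d9, a9"),
  stringsAsFactors = FALSE
)

# split_rows() is a hypothetical helper, not part of base R.
split_rows <- function(data, sep = ",\\s*") {
  # Split every column; each result is a list with one character vector
  # per original row.
  pieces <- lapply(data, function(col) strsplit(as.character(col), sep))

  # Count the pieces per row and require all columns to agree, otherwise
  # the unlisted columns would be misaligned or of different lengths.
  counts <- lapply(pieces, lengths)
  same <- vapply(counts, function(n) identical(n, counts[[1]]), logical(1))
  if (!all(same)) {
    stop("columns split into different numbers of pieces in at least one row")
  }

  # Flatten each column and trim any stray whitespace.
  as.data.frame(
    lapply(pieces, function(p) trimws(unlist(p))),
    stringsAsFactors = FALSE
  )
}

out <- split_rows(df)
out
```

The explicit check turns a mismatch such as "yellow, blue" paired with a single postcode into an error instead of a misaligned result.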
Reply from a uj5u.com user:
You can use separate_rows:
library(tidyr)
library(dplyr)  # provides mutate() and across()
df %>%
# split values into separate rows:
separate_rows(c(suburb, postcode), sep = ",") %>%
# remove leading and trailing whitespace:
mutate(across(everything(), trimws))
# A tibble: 7 × 2
suburb postcode
<chr> <chr>
1 yellow a9
2 blue b9
3 orange c9
4 yellow a9
5 blue b9
6 green d9
7 yellow a9
Reply from a uj5u.com user:
Another possible solution:
library(tidyverse)
map_dfr(df, ~ unlist(str_split(.x, ",\\s*")))
#> # A tibble: 7 × 2
#> suburb postcode
#> <chr> <chr>
#> 1 yellow a9
#> 2 blue b9
#> 3 orange c9
#> 4 yellow a9
#> 5 blue b9
#> 6 green d9
#> 7 yellow a9
