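I have a nested tibble that I want to unnest. Two list columns (street_address and status) contain both character vectors and lists, and one list column (country) contains only character vectors. Unnesting the tibble raises an error, apparently because those two columns mix character-vector entries with list entries.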
df <- tibble::tribble(
~id, ~country, ~street_address, ~status,
"2008-002231-32-GB", c("United Kingdom", "Netherlands"), list(c(NA, NA)), list(c(NA, NA)),
"2020-001060-28-SE", c("Denmark", "Denmark", "Denmark", "Denmark"), c("Palle Juul Jensens Blvrd 67", "Palle Juul Jensens Boulevard 99", "Palle Juul Jensens Blvrd 67", "Palle Juul Jensens Boulevard 99"), c("Non-Commercial", "Non-Commercial", "Non-Commercial", "Non-Commercial")
)
df
# A tibble: 2 × 4
id country street_address status
<chr> <list> <list> <list>
1 2008-002231-32-GB <chr [2]> <list [1]> <list [1]>
2 2020-001060-28-SE <chr [4]> <chr [4]> <chr [4]>
df %>%
  unnest(cols = c(country, street_address, status))
#> Error: Can't combine `..1$street_address` <list> and `..2$street_address` <character>.
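Created on 2021-12-14 by the reprex package (v2.0.1)
The list entries in the columns seem to be the problem (they all have the form list(c(NA, NA))). One option might be to convert these observations to character vectors (or set them to NA, since they all appear to be NA), but I cannot work out how to do that or whether it would solve the problem. Any help would be greatly appreciated.
Note that this is an updated question: the first data I submitted with it was generated with dpasta() and did not represent my actual data well.
The desired result should look like this: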
# A tibble: 6 × 4
id country street_address status
<chr> <chr> <chr> <chr>
1 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commercial
2 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commercial
3 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commercial
4 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commercial
5 2008-002231-32-GB United Kingdom NA NA
6 2008-002231-32-GB Netherlands NA NA
Reply from a uj5u.com user:
You can unnest the data with the unnest function from tidyr. The code looks like this:
library(dplyr)  # mutate(), select(), %>%
library(purrr)  # map()
library(tidyr)  # unnest()
df %>%
  mutate(r = map(street_address, ~data.frame(t(.))), s = map(status, ~data.frame(t(.)))) %>%
  unnest(r, s) %>%
  select(-street_address, -status)
The output looks like this:
# A tibble: 6 x 12
id country t... X1 X2 X3 X4 t...1 X11 X21 X31 X41
<chr> <chr> <list> <chr> <chr> <chr> <chr> <lis> <chr> <chr> <chr> <chr>
1 2008-~ United ~ <lgl ~ NA NA NA NA <lgl~ NA NA NA NA
2 2008-~ Netherl~ <lgl ~ NA NA NA NA <lgl~ NA NA NA NA
3 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
4 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
5 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
6 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
Reply from a uj5u.com user:
library(tidyverse)
df <- tibble::tribble(
~ id,
~ country,
~ street_address,
~ status,
"2008-002231-32-GB",
c("United Kingdom", "Netherlands"),
list(c(NA, NA)),
list(c(NA, NA)),
"2020-001060-28-SE",
c("Denmark", "Denmark", "Denmark", "Denmark"),
c(
"Palle Juul Jensens Blvrd 67",
"Palle Juul Jensens Boulevard 99",
"Palle Juul Jensens Blvrd 67",
"Palle Juul Jensens Boulevard 99"
),
c(
"Non-Commercial",
"Non-Commercial",
"Non-Commercial",
"Non-Commercial"
)
)
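df %>%
  # Tag each row with the class of its street_address entry ("character" or "list")
  mutate(res = map_chr(street_address, class)) %>%
  # Split into one tibble per class so each piece can be unnested without a type clash
  group_split(res) %>%
  # First pass: unnest each piece; the "list" piece still holds nested vectors afterwards
  map(~ unnest(data = ., c(country, street_address, status))) %>%
  # Second pass: unnest what remains and row-bind the pieces
  map_df(~ unnest(data = ., c(country, street_address, status))) %>%
  select(-res)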
#> # A tibble: 8 x 4
#> id country street_address status
#> <chr> <chr> <chr> <chr>
#> 1 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commerci~
#> 2 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commerci~
#> 3 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commerci~
#> 4 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commerci~
#> 5 2008-002231-32-GB United Kingdom <NA> <NA>
#> 6 2008-002231-32-GB United Kingdom <NA> <NA>
#> 7 2008-002231-32-GB Netherlands <NA> <NA>
#> 8 2008-002231-32-GB Netherlands <NA> <NA>
Created on 2021-12-14 by the reprex package (v2.0.1)
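The output above has 8 rows because each NA row for 2008-002231-32-GB appears twice, while the question asks for 6. As a minimal sketch of another route (not taken from either reply), one could follow the idea raised in the question and flatten every entry of the two mixed columns to a plain character vector before unnesting. It assumes each entry is either a character vector or a list wrapping a single vector, as in the sample data; the name result is introduced here only for illustration.

```r
library(dplyr)  # mutate(), across(), %>%
library(purrr)  # map()
library(tidyr)  # unnest()

# Sample data from the question
df <- tibble::tribble(
  ~id, ~country, ~street_address, ~status,
  "2008-002231-32-GB", c("United Kingdom", "Netherlands"), list(c(NA, NA)), list(c(NA, NA)),
  "2020-001060-28-SE", c("Denmark", "Denmark", "Denmark", "Denmark"),
  c("Palle Juul Jensens Blvrd 67", "Palle Juul Jensens Boulevard 99",
    "Palle Juul Jensens Blvrd 67", "Palle Juul Jensens Boulevard 99"),
  c("Non-Commercial", "Non-Commercial", "Non-Commercial", "Non-Commercial")
)

result <- df %>%
  # Flatten every entry to a character vector: unlist() removes the list
  # wrapper (and leaves character vectors unchanged), as.character() turns
  # the logical NAs into character NAs so all entries share one type.
  mutate(across(
    c(street_address, status),
    function(col) map(col, function(entry) as.character(unlist(entry)))
  )) %>%
  # All three list columns now hold equal-length character vectors per row
  unnest(cols = c(country, street_address, status))

result
```

On the sample data this should return one row per country entry, six in total, in the original row order (the 2008-002231-32-GB rows first). If some entries held several vectors inside one list, unlist() would concatenate them, so the per-row lengths would need checking before unnesting.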
