I want to generate a group ID based on the combination of two variables (country and party). Here is my data:
df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party = c(NA, NA, NA, "A", "B", "B"))
This gives:
country year party
1 BE 2010 <NA>
2 BE 2010 <NA>
3 BE 2010 <NA>
4 NL 2010 A
5 NL 2010 B
6 NL 2010 B
What I want is:
country year party group
<chr> <dbl> <chr> <int>
1 BE 2010 NA NA
2 BE 2010 NA NA
3 BE 2010 NA NA
4 NL 2010 A 1
5 NL 2010 B 2
6 NL 2010 B 2
I tried:
library(dplyr)
df <- df %>%
  group_by(country, party) %>%
  mutate(group = cur_group_id())
But this gives me:
country year party group
<chr> <dbl> <chr> <int>
1 BE 2010 NA 1
2 BE 2010 NA 1
3 BE 2010 NA 1
4 NL 2010 A 2
5 NL 2010 B 3
6 NL 2010 B 3
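However, I don't want rows with a missing value to form a group of their own. At the same time, I want to keep those rows in the data.
If I try:
df <- df %>%
  group_by(country, party) %>%
  filter(!is.na(party)) %>%
  mutate(group = cur_group_id())
I get: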
country year party group
<chr> <dbl> <chr> <int>
1 NL 2010 A 1
2 NL 2010 B 2
3 NL 2010 B 2
How can I create this new variable only for the complete rows while keeping the incomplete rows in the dataset?
Thanks
A uj5u.com user replied:
Use interaction():
library(dplyr)
df %>% mutate(group = as.integer(interaction(country, party, drop = TRUE)))
This gives:
country year party group
1 BE 2010 <NA> NA
2 BE 2010 <NA> NA
3 BE 2010 <NA> NA
4 NL 2010 A 1
5 NL 2010 B 2
6 NL 2010 B 2
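To see why drop = TRUE matters here, the following base R sketch compares the factor levels with and without it. It uses the question's country and party values as plain vectors; the variable names all_levels, used_levels and group are only illustrative.

```r
country <- c("BE", "BE", "BE", "NL", "NL", "NL")
party   <- c(NA, NA, NA, "A", "B", "B")

# Without drop = TRUE, every country/party pairing becomes a level,
# including pairings that never occur in the data (such as BE with A).
all_levels <- interaction(country, party)
print(levels(all_levels))

# With drop = TRUE, only the pairings that actually occur are kept,
# so the integer codes are consecutive. A missing party yields NA.
used_levels <- interaction(country, party, drop = TRUE)
print(levels(used_levels))

group <- as.integer(used_levels)
print(group)
```

Rows whose party is NA get no level at all, which is why they end up with NA instead of an ID of their own.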
A uj5u.com user replied:
df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party = c(NA, NA, NA, "A", "B", "B"))
library(data.table)
# number each country/party combination; rows with a missing party keep NA
setDT(df)[!is.na(party), grp := .GRP, by = .(country, party)][]
#> country year party grp
#> 1: BE 2010 <NA> NA
#> 2: BE 2010 <NA> NA
#> 3: BE 2010 <NA> NA
#> 4: NL 2010 A 1
#> 5: NL 2010 B 2
#> 6: NL 2010 B 2
Created on 2021-12-21 by the reprex package (v2.0.1)
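One thing to watch with this pattern: .GRP numbers the groups named in by, so by needs to list both columns if the same party label can occur in more than one country. A minimal sketch under that assumption follows; the data are hypothetical and not from the question, and the column grp_party_only exists only for comparison.

```r
library(data.table)

# Hypothetical data: party "A" exists in both BE and NL.
dt <- data.table(
  country = c("BE", "BE", "NL", "NL", "NL"),
  party   = c(NA, "A", "A", "B", "B")
)

# Number each observed (country, party) pair. Rows with a missing party are
# excluded by the filter in i and therefore keep NA in grp.
dt[!is.na(party), grp := .GRP, by = .(country, party)]

# For comparison: grouping by party alone gives BE/A and NL/A the same ID.
dt[!is.na(party), grp_party_only := .GRP, by = party]

print(dt)
```

In the question's data each party appears in only one country, so both variants would give the same numbering there.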
A uj5u.com user replied:
Something like the following?
library(tidyverse)
df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party = c(NA, NA, NA, "A", "B", "B"))
df %>%
  group_by(country, party) %>%
  mutate(group = if_else(is.na(party), NA_integer_, cur_group_id()))
#> # A tibble: 6 × 4
#> # Groups: country, party [3]
#> country year party group
#> <chr> <dbl> <chr> <int>
#> 1 BE 2010 <NA> NA
#> 2 BE 2010 <NA> NA
#> 3 BE 2010 <NA> NA
#> 4 NL 2010 A 2
#> 5 NL 2010 B 3
#> 6 NL 2010 B 3
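The IDs start at 2 in this output because the BE/NA combination is still a group of its own and takes ID 1; if_else() only masks that ID afterwards. One way to check this, sketched on the question's data (the name keys is just illustrative):

```r
library(dplyr)

df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party = c(NA, NA, NA, "A", "B", "B"))

# group_keys() lists one row per group, in the order used by cur_group_id().
keys <- df %>%
  group_by(country, party) %>%
  group_keys()

print(keys)
```

The first row of keys is the BE group with a missing party, so the NL groups are numbered 2 and 3.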
If you want the groups to start at 1 (rather than 2):
library(tidyverse)
df %>%
  filter(!is.na(party)) %>%
  group_by(country, party) %>%
  mutate(group = cur_group_id()) %>%
  ungroup() %>%
  add_row(filter(df, is.na(party))) %>%
  mutate(group = if_else(is.na(party), NA_integer_, group))
#> # A tibble: 6 × 4
#> country year party group
#> <chr> <dbl> <chr> <int>
#> 1 NL 2010 A 1
#> 2 NL 2010 B 2
#> 3 NL 2010 B 2
#> 4 BE 2010 <NA> NA
#> 5 BE 2010 <NA> NA
#> 6 BE 2010 <NA> NA
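Note that this output moves the BE rows to the bottom. If the original row order matters, one possible alternative is to build a combined key that is NA whenever party is missing and number the keys by first appearance. This is only a sketch and not part of the answer above; the helper column key, the "_" separator and the name result are assumptions made for illustration.

```r
library(dplyr)

df <- data.frame(country = c("BE", "BE", "BE", "NL", "NL", "NL"),
                 year = c(2010, 2010, 2010, 2010, 2010, 2010),
                 party = c(NA, NA, NA, "A", "B", "B"))

result <- df %>%
  mutate(
    # Combined key; NA when party is missing so those rows get no ID.
    key = if_else(is.na(party), NA_character_, paste(country, party, sep = "_")),
    # Position of each key among the distinct non-missing keys.
    group = match(key, unique(key[!is.na(key)]))
  ) %>%
  select(-key)

print(result)
```

Here the IDs follow the order in which each combination first appears, and no rows are filtered out or re-added, so the row order stays as in df.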
