I want to filter by ID, where GRP is A, B, or NA. If an ID contains both A and B, the B rows should be removed.
Data:
DF <- tibble::tribble(
  ~ID, ~GRP,
  1L, "A",
  2L, "A",
  2L, "B",
  3L, "B",
  3L, NA,
  4L, "A",
  4L, "A",
  4L, NA
)
# A tibble: 8 × 2
     ID GRP
  <int> <chr>
1     1 A
2     2 A
3     2 B
4     3 B
5     3 NA
6     4 A
7     4 A
8     4 NA
Expected output:
# A tibble: 7 × 2
     ID GRP
  <int> <chr>
1     1 A
2     2 A
3     3 B
4     3 NA
5     4 A
6     4 A
7     4 NA
Best regards, H
Answer from a uj5u.com user:
After grouping by "ID", we can build a logical condition with n_distinct:
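library(dplyr)
DF %>%
  group_by(ID) %>%
  # keep a row if its ID has both "A" and "B" and the row is not "B",
  # or the row is NA, or the ID has only one distinct non-NA value
  filter((all(c("A", "B") %in% GRP) & GRP != "B") |
           is.na(GRP) |
           n_distinct(GRP, na.rm = TRUE) == 1) %>%
  ungroup()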
Output:
# A tibble: 7 × 2
     ID GRP
  <int> <chr>
1     1 A
2     2 A
3     3 B
4     3 <NA>
5     4 A
6     4 A
7     4 <NA>
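To see why this condition keeps the right rows, the pieces can be materialised as columns before filtering. This is only an inspection sketch: the column names `has_both`, `single`, and `keep` are hypothetical and not part of the original answer.

```r
library(dplyr)

DF <- tibble::tribble(
  ~ID, ~GRP,
  1L, "A",
  2L, "A",
  2L, "B",
  3L, "B",
  3L, NA,
  4L, "A",
  4L, "A",
  4L, NA
)

# Hypothetical helper columns, added only to inspect the logic per row
checks <- DF %>%
  group_by(ID) %>%
  mutate(
    has_both = all(c("A", "B") %in% GRP),           # ID contains both A and B
    single   = n_distinct(GRP, na.rm = TRUE) == 1,  # ID has one non-NA value
    keep     = (has_both & GRP != "B") | is.na(GRP) | single
  ) %>%
  ungroup()

checks

# Filtering on the helper column should match the answer's filter()
result <- checks %>%
  filter(keep) %>%
  select(ID, GRP)
```

In this sample data, only the "B" row of ID 2 ends up with `keep` equal to FALSE.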
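Or build the condition from the first non-NA element of each group (this relies on "A" appearing before "B" within an ID, as it does in the sample data):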
DF %>%
  group_by(ID) %>%
  filter(GRP %in% first(na.omit(GRP)) | is.na(GRP)) %>%
  ungroup()
# A tibble: 7 × 2
     ID GRP
  <int> <chr>
1     1 A
2     2 A
3     3 B
4     3 <NA>
5     4 A
6     4 A
7     4 <NA>
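Because the first()-based filter keeps whichever non-NA value comes first in each ID, its result depends on row order. The sketch below uses a hypothetical reordered data set, `DF2`, to show this, together with one possible order-independent condition. That condition is an assumption about the intent of the question, not part of the original answer.

```r
library(dplyr)

# Hypothetical reordered data: for ID 2, "B" now comes before "A"
DF2 <- tibble::tribble(
  ~ID, ~GRP,
  2L, "B",
  2L, "A",
  3L, "B",
  3L, NA
)

# first()-based filter: keeps the value that appears first, here "B" for ID 2
by_first <- DF2 %>%
  group_by(ID) %>%
  filter(GRP %in% first(na.omit(GRP)) | is.na(GRP)) %>%
  ungroup()

# Order-independent sketch: drop a "B" row only when its ID also has an "A".
# GRP %in% "B" is FALSE for NA, so NA rows are always kept.
by_rule <- DF2 %>%
  group_by(ID) %>%
  filter(!(GRP %in% "B" & "A" %in% GRP)) %>%
  ungroup()
```

If row order cannot be guaranteed, sorting first (for example with `arrange(ID, GRP)`) or using a rule like the second filter may be safer.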
Alternatively, n_distinct can be called twice:
DF %>%
  group_by(ID) %>%
  filter((n_distinct(GRP, na.rm = TRUE) == 2 & GRP != "B") |
           n_distinct(GRP, na.rm = TRUE) == 1) %>%
  ungroup()
Or use base R:
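# i1 is TRUE for rows whose ID has both non-NA groups (table() drops NA)
i1 <- with(DF, ID %in% names(which(rowSums(table(ID, GRP) > 0) == 2)))
# for those IDs keep only "A"; for all other IDs keep every row
subset(DF, (GRP == 'A' & i1) | !i1)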
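The base R one-liner is dense, so here is a step-by-step sketch of the same idea. It uses a plain data.frame, so no packages are needed. The intermediate names `tab`, `both_ids`, and `res` are introduced here only for illustration.

```r
DF <- data.frame(
  ID  = c(1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
  GRP = c("A", "A", "B", "B", NA, "A", "A", NA)
)

# Cross-tabulate ID by GRP; table() excludes NA by default,
# so the columns are just "A" and "B"
tab <- table(DF$ID, DF$GRP)

# IDs with a positive count in both columns, returned as character names
both_ids <- names(which(rowSums(tab > 0) == 2))

# Rows that belong to those IDs
i1 <- DF$ID %in% both_ids

# For IDs with both groups keep only "A"; keep every row of the other IDs
res <- subset(DF, (GRP == "A" & i1) | !i1)
res
```

The NA rows survive because `NA & FALSE` evaluates to FALSE rather than NA, so for IDs outside `both_ids` the condition reduces to `!i1`, which is TRUE.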
Tags: r
