I have:
df = structure(list(`Q4-21` = c(0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
1L, 0L, 1L, 0L, 1L, 0L, 1L), `Q1-22` = c(0L, 0L, 1L, 1L, 0L,
0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L), `Q2-22` = c(0L,
0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L),
`Q3-22` = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), Name = c("A", "B", "C", "D", "E", "F", "G",
"H", "I", "J", "K", "L", "M", "N", "O", "P")), row.names = c(NA,
-16L), class = "data.frame")
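I want to:
- filter for the rows where columns 3, 4 and 5 are not all 0 at the same time
- refer to each column by position, because the column names change regularly
I tried: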
library(dplyr)

cols_of_interest = colnames(df)[2:4]
df %>%
  filter_at(which(colnames(df) %in% cols_of_interest), all_vars(. != 0))
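But this filter requires each column to be non-zero on its own, so it keeps only the rows where all of them are non-zero.
I only need to filter out rows A and B.
I know that df[df[,2] != 0 | df[,3] != 0 | df[,4] != 0, ] works, but I would prefer a tidyverse approach.
Any suggestions?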
Answer from a uj5u.com user:
Use if_all:
library(dplyr)
df %>%
filter(!if_all(2:4, ~ .x == 0))
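For comparison, here is a sketch of the same filter written the other way round, with if_any(), plus the scoped-verb form closest to the attempt in the question. It assumes a dplyr version that provides if_any() and if_all() (1.0.4 or later, as far as I know). The data frame is the one from the question, rebuilt compactly so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly.
df <- data.frame(
  `Q4-21` = rep(0:1, times = 8),
  `Q1-22` = rep(rep(0:1, each = 2), times = 4),
  `Q2-22` = rep(rep(0:1, each = 4), times = 2),
  `Q3-22` = rep(0:1, each = 8),
  Name = LETTERS[1:16],
  check.names = FALSE
)

# The form from the answer: drop a row only when columns 2 to 4 are all zero.
kept_all <- df %>% filter(!if_all(2:4, ~ .x == 0))

# Equivalent when there are no NAs: keep a row when any of columns 2 to 4 is non-zero.
kept_any <- df %>% filter(if_any(2:4, ~ .x != 0))

# Superseded scoped verb, shown only because the question used filter_at():
# any_vars() instead of all_vars() gives the "at least one" behaviour.
kept_scoped <- df %>% filter_at(2:4, any_vars(. != 0))

kept_all$Name
kept_any$Name
kept_scoped$Name
```

All three should drop only rows A and B. With NAs in the columns the two negation styles can differ, so that case would need checking separately.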
Answer from a uj5u.com user:
Use rowSums.
df[rowSums(df[2:4]) != 0, ]
or
subset(df, rowSums(df[2:4]) != 0)
# Q4-21 Q1-22 Q2-22 Q3-22 Name
# 3 0 1 0 0 C
# 4 1 1 0 0 D
# 5 0 0 1 0 E
# 6 1 0 1 0 F
# 7 0 1 1 0 G
# 8 1 1 1 0 H
# 9 0 0 0 1 I
# 10 1 0 0 1 J
# 11 0 1 0 1 K
# 12 1 1 0 1 L
# 13 0 0 1 1 M
# 14 1 0 1 1 N
# 15 0 1 1 1 O
# 16 1 1 1 1 P
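The rowSums test works here because the columns only hold 0 and 1. As a sketch under a different assumption, namely that the columns could also hold negative values (not the case in the question's data), counting the non-zero cells per row avoids values of opposite sign cancelling each other out. The toy data frame below is hypothetical and only there to show the difference.

```r
# Hypothetical toy data, not from the question: the values in row "x" cancel out.
toy <- data.frame(id = c("x", "y"),
                  a = c(1L, 0L),
                  b = c(-1L, 0L),
                  c = c(0L, 0L))

# Summing the raw values misses row "x", because 1 + (-1) + 0 is 0.
by_sum <- toy[rowSums(toy[2:4]) != 0, ]

# Counting the non-zero cells per row keeps it.
by_count <- toy[rowSums(toy[2:4] != 0) > 0, ]

nrow(by_sum)
by_count$id
```

On the 0/1 data from the question both versions select the same rows.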
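Answer from a uj5u.com user:
One possible approach is to compute the sum of these 3 columns, then keep the rows whose sum is greater than 0, using the following code: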
# in a single line of code
filter(df, rowSums(df[,cols_of_interest]) > 0)
The same, but over several lines and using apply (this keeps the helper column that was created for the filter visible along the way):
df$sum_of_3_cols = apply(df[, cols_of_interest],
                         MARGIN = 1, FUN = sum, na.rm = TRUE)
# ↑ compute the sum of the 3 columns of interest
df %>% filter(sum_of_3_cols > 0) %>% select(-sum_of_3_cols)
#      ↑ keep rows with a positive sum   ↑ drop the helper column used to filter
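Note that the apply version adds sum_of_3_cols to df itself. Below is a sketch of the same idea kept inside one pipeline, selecting the columns by position as the question asked. It assumes dplyr 1.1.0 or later for pick(). The name sum_of_3_cols is just the helper name from above.

```r
library(dplyr)  # pick() assumes dplyr >= 1.1.0

# Same data as in the question, rebuilt compactly.
df <- data.frame(
  `Q4-21` = rep(0:1, times = 8),
  `Q1-22` = rep(rep(0:1, each = 2), times = 4),
  `Q2-22` = rep(rep(0:1, each = 4), times = 2),
  `Q3-22` = rep(0:1, each = 8),
  Name = LETTERS[1:16],
  check.names = FALSE
)

result <- df %>%
  mutate(sum_of_3_cols = rowSums(pick(2:4), na.rm = TRUE)) %>%  # helper column
  filter(sum_of_3_cols > 0) %>%                                 # keep positive sums
  select(-sum_of_3_cols)                                        # drop the helper

result$Name
names(df)  # df itself is left without the helper column
```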
Answer from a uj5u.com user:
You can use pmax:
library(dplyr)
df %>%
filter(do.call(pmax, .[2:4])>0)
Q4-21 Q1-22 Q2-22 Q3-22 Name
1 0 1 0 0 C
2 1 1 0 0 D
3 0 0 1 0 E
4 1 0 1 0 F
5 0 1 1 0 G
6 1 1 1 0 H
7 0 0 0 1 I
8 1 0 0 1 J
9 0 1 0 1 K
10 1 1 0 1 L
11 0 0 1 1 M
12 1 0 1 1 N
13 0 1 1 1 O
14 1 1 1 1 P
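In case the do.call part looks opaque: pmax returns the element-wise maximum of the vectors it is given, and do.call passes each selected column to it as a separate argument, so the result is one maximum per row. A minimal base R sketch of that step on its own, assuming the same data as in the question (row_max is just an illustrative name):

```r
# Same data as in the question, rebuilt compactly.
df <- data.frame(
  `Q4-21` = rep(0:1, times = 8),
  `Q1-22` = rep(rep(0:1, each = 2), times = 4),
  `Q2-22` = rep(rep(0:1, each = 4), times = 2),
  `Q3-22` = rep(0:1, each = 8),
  Name = LETTERS[1:16],
  check.names = FALSE
)

# One value per row: the largest of the entries in columns 2 to 4.
row_max <- do.call(pmax, df[2:4])

# A row maximum above 0 means at least one of the three columns is non-zero
# (this relies on the values being non-negative, as they are here).
kept <- df[row_max > 0, ]

kept$Name
```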