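In my toy data below, I keep repeating the same group_by() and filter() steps for the variables sample, group and outcome (but not time).
Is there a practical way to pass the names of any number of variables to a function like foo() below, so that the group_by() and filter() steps run in a loop inside it?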
library(tidyverse)
data <- expand_grid(study=1:3,sample=1:2,group=1:3,outcome=c("A","B"),time=0:2)
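get_rows <- function(x) { # Helper function used in `filter()`
  u <- unique(x)
  # Sentinel meaning "keep every row": "0" for character input, min(u) - 1 otherwise
  keep_all <- if (is.character(x)) "0" else min(u) - 1
  n <- sample(c(keep_all, u), 1)
  if (n == keep_all) TRUE else x == n
}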
DF <- data %>%
  group_by(study) %>%
  filter(get_rows(sample)) %>% # for sample
  ungroup()
DF2 <- DF %>%
  group_by(study) %>%
  filter(get_rows(group)) %>% # for group
  ungroup()
DF3 <- DF2 %>%
  group_by(study) %>%
  filter(get_rows(outcome)) %>% # for outcome
  ungroup()
#============================================ HOW TO LOOP ABOVE IN `foo()` BELOW?
foo <- function(data, ..., exclude_vars = c("time")){
  ## SOLUTION
}
Answer from a uj5u.com user:
If you use dplyr's .data pronoun, you can loop over the variable names as strings. For example:
foo <- function(data, exclude_vars = c("time", "study")){
  vars <- setdiff(names(data), exclude_vars)
  for (var in vars) {
    data <- data %>%
      group_by(study) %>%
      filter(get_rows(.data[[var]])) %>%
      ungroup()
  }
  data
}
foo(data)
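The question asked for a signature that takes the variables through `...`. A minimal sketch of that variant follows, assuming the column names are passed bare or as strings. The names `foo_dots` and `by` are hypothetical and not part of the answer above, and `get_rows` is restated with an explicit "keep everything" sentinel, which is one reading of what the helper is meant to do.

```r
library(dplyr)
library(tidyr)

data <- expand_grid(study = 1:3, sample = 1:2, group = 1:3,
                    outcome = c("A", "B"), time = 0:2)

# Assumed reading of the helper: draw either one observed value or a
# "keep everything" sentinel, then keep the matching rows.
get_rows <- function(x) {
  u <- unique(x)
  keep_all <- if (is.character(x)) "0" else min(u) - 1
  n <- sample(c(keep_all, u), 1)
  if (n == keep_all) TRUE else x == n
}

# Hypothetical variant: columns to filter on come from `...`,
# the grouping column(s) from `by`.
foo_dots <- function(data, ..., by = "study") {
  # Turn bare names or strings captured by `...` into a character vector
  vars <- vapply(rlang::ensyms(...), rlang::as_string, character(1))
  for (var in vars) {
    data <- data %>%
      group_by(across(all_of(by))) %>%
      filter(get_rows(.data[[var]])) %>%
      ungroup()
  }
  data
}

set.seed(1) # only so that repeated runs give the same rows
res <- foo_dots(data, sample, group, outcome)
```

Listing the columns to filter on, rather than the columns to exclude, means time is left alone simply by not being named in the call.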
If you prefer, you can use purrr::reduce instead of a loop:
foo <- function(data, exclude_vars = c("time", "study")){
  vars <- setdiff(names(data), exclude_vars)
  cleanFn <- function(data, var) data %>%
    group_by(study) %>%
    filter(get_rows(.data[[var]])) %>%
    ungroup()
  reduce(vars, cleanFn, .init=data)
}
foo(data)
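Because get_rows() draws at random, one way to check that the loop and the reduce() version do the same thing is to run both from the same seed and compare the results. The sketch below does that; `filter_one`, `foo_loop` and `foo_reduce` are hypothetical names introduced for the comparison, and `get_rows` is again written with an explicit "keep everything" sentinel as an assumed reading of the helper.

```r
library(dplyr)
library(tidyr)
library(purrr)

data <- expand_grid(study = 1:3, sample = 1:2, group = 1:3,
                    outcome = c("A", "B"), time = 0:2)

# Assumed reading of the helper (see the sentinel `keep_all`)
get_rows <- function(x) {
  u <- unique(x)
  keep_all <- if (is.character(x)) "0" else min(u) - 1
  n <- sample(c(keep_all, u), 1)
  if (n == keep_all) TRUE else x == n
}

# One filtering step for a single column, shared by both versions
filter_one <- function(data, var) {
  data %>%
    group_by(study) %>%
    filter(get_rows(.data[[var]])) %>%
    ungroup()
}

foo_loop <- function(data, exclude_vars = c("time", "study")) {
  for (var in setdiff(names(data), exclude_vars)) {
    data <- filter_one(data, var)
  }
  data
}

foo_reduce <- function(data, exclude_vars = c("time", "study")) {
  # reduce() feeds the result of each step into the next, starting from `data`
  reduce(setdiff(names(data), exclude_vars), filter_one, .init = data)
}

set.seed(42)
res_loop <- foo_loop(data)
set.seed(42)
res_reduce <- foo_reduce(data)

# Both versions make the same random draws in the same order
same <- isTRUE(all.equal(res_loop, res_reduce))
print(same)
```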
