How to count response values by time threshold in R

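I have a student dataset that contains responses to items, each scored as correct (1) or incorrect (0). There is also a time variable measured in seconds. I want to create time flags that record the number of correct and incorrect responses at the 1-minute, 2-minute, and 3-minute thresholds. Here is a sample dataset.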

df <- data.frame(id = c(1,2,3,4,5),
                 gender = c("m","f","m","f","m"),
                 age = c(11,12,12,13,14),
                 i1 = c(1,0,NA,1,0),
                 i2 = c(0,1,0,"1]",1),
                 i3 = c("1]",1,"1]",0,"0]"),
                 i4 = c(0,"0]",1,1,0),
                 i5 = c(1,1,NA,"0]","1]"),
                 i6 = c(0,0,"0]",1,1),
                 i7 = c(1,"1]",1,0,0),
                 i8 = c(0,0,0,"1]","1]"),
                 i9 = c(1,1,1,0,NA),
                 time = c(115,138,148,195, 225))


> df
  id gender age i1 i2 i3 i4   i5 i6 i7 i8 i9 time
1  1      m  11  1  0 1]  0    1  0  1  0  1  115
2  2      f  12  0  1  1 0]    1  0 1]  0  1  138
3  3      m  12 NA  0 1]  1 <NA> 0]  1  0  1  148
4  4      f  13  1 1]  0  1   0]  1  0 1]  0  195
5  5      m  14  0  1 0]  0   1]  1  0 1] NA  225

A minute threshold is marked by the ] symbol to the right of a score.

For example, for id = 3 the 1-minute threshold falls at item i3 and the 2-minute threshold at item i6. Each student may have different time thresholds.
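To make the marker convention concrete, here is a minimal base R sketch for id = 3. The item values are re-entered by hand as character strings (an assumption made so the snippet runs on its own), and the names x, marks, and segment are illustrative only.

```r
# Items i1..i9 for id = 3, copied from the sample data (NA = missing response)
x <- c(NA, "0", "1]", "1", NA, "0]", "1", "0", "1")

# Positions of the "]" markers: the k-th marker closes the k-th minute
marks <- which(grepl("]", x, fixed = TRUE))

# Label each item with the minute segment it belongs to; items after the
# last marker stay NA because no threshold closes them
segment <- rep(NA_integer_, length(x))
segment[seq_len(max(marks))] <- rep(seq_along(marks), diff(c(0, marks)))

marks    # 3 6
segment  # 1 1 1 2 2 2 NA NA NA
```

Note that grepl() returns FALSE for NA input, so missing responses are never mistaken for a marker.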

I need to create flag variables that count the correct and incorrect responses up to each of the 1-min, 2-min, and 3-min thresholds.

How can I get the desired dataset, shown below?

> df1
  id gender age i1 i2 i3 i4   i5 i6 i7 i8 i9 time one_true one_false two_true two_false three_true three_false
1  1      m  11  1  0 1]  0    1  0  1  0  1  115        2         1       NA        NA         NA          NA
2  2      f  12  0  1  1 0]    1  0 1]  0  1  138        2         2        4         3         NA          NA
3  3      m  12 NA  0 1]  1 <NA> 0]  1  0  1  148        1         1        2         2         NA          NA
4  4      f  13  1 1]  0  1   0]  1  0 1]  0  195        2         0        3         2          5           3
5  5      m  14  0  1 0]  0   1]  1  0 1] NA  225        1         2        2         3          4           4
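The target columns are cumulative: each count covers every item up to and including the marked one. As a rough check, the row for id = 4 can be reproduced by hand with running totals sampled at the markers. This is only an illustrative sketch with hand-entered values and made-up variable names, not the method used in the answers.

```r
# Items i1..i9 for id = 4, copied from the sample data as character strings
x <- c("1", "1]", "0", "1", "0]", "1", "0", "1]", "0")

# TRUE where an item carries a minute marker
is_mark <- grepl("]", x, fixed = TRUE)

# Running totals of correct ("1") and incorrect ("0") responses
correct <- cumsum(grepl("1", x, fixed = TRUE))
wrong   <- cumsum(grepl("0", x, fixed = TRUE))

# Read the running totals at each marker: one column per threshold
counts <- rbind(true = correct[is_mark], false = wrong[is_mark])
counts
#       [,1] [,2] [,3]
# true     2    3    5
# false    0    2    3
```

These values match one_true = 2, one_false = 0, two_true = 3, two_false = 2, three_true = 5, three_false = 3 in the desired output.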

Answer:

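library(tidyverse)

df %>%
  # one row per student and item; coerce every item to character
  pivot_longer(i1:i9, values_transform = as.character) %>%
  group_by(id) %>%
  # vs = number of "]" markers at or after each item (counted from the right)
  mutate(vs = rev(cumsum(replace_na(str_detect(rev(value), ']'), FALSE)))) %>%
  # drop items answered after the last threshold
  filter(vs > 0) %>%
  # renumber so the first threshold is 1, the second is 2, and so on
  mutate(vs = max(vs) - vs + 1) %>%
  group_by(vs, .add = TRUE) %>%
  summarise(true = sum(str_detect(value, '1'), na.rm = TRUE),
            false = sum(str_detect(value, '0'), na.rm = TRUE),
            .groups = "drop_last") %>%
  # running totals across thresholds within each student
  mutate(across(c(true, false), cumsum)) %>%
  pivot_wider(id_cols = id, names_from = vs, values_from = c(true, false))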

# A tibble: 5 x 7
# Groups:   id [5]
     id true_1 true_2 true_3 false_1 false_2 false_3
  <dbl>  <int>  <int>  <int>   <int>   <int>   <int>
1     1      2     NA     NA       1      NA      NA
2     2      2      4     NA       2       3      NA
3     3      1      2     NA       1       2      NA
4     4      2      3      5       0       2       3
5     5      1      2      4       2       3       4

Answer:

You can also do the same in base R:

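# For one student's item vector x, return cumulative true/false counts per threshold
fun <- function(x){
  # number of items in each segment (a segment ends at a "]" marker)
  a <- diff(c(0,which(grepl("]", x))))
  # count the entries of y that contain the pattern x
  f_sum <- function(x,y) sum(na.omit(grepl(x,y)))
  fn <- function(x) c(true = f_sum('1',x), false = f_sum('0',x))
  # per-segment counts, ignoring items after the last marker
  y <- tapply(x[seq(sum(a))], rep(seq_along(a),a), fn)
  # running totals across segments, one row per threshold
  s <- do.call(rbind, Reduce("+", y, accumulate = TRUE))
  nms <- do.call(paste, c(sep='_',expand.grid(colnames(s), seq(nrow(s)))))
  setNames(c(t(s)), nms)
}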

# Pad each student's result with NA to the longest length, then stack the rows
fun2 <- function(x){
  ln <- lengths(x)
  nms <- names(x[[which.max(ln)]])
  do.call(rbind, lapply(x, function(x)setNames(`length<-`(x,max(ln)),nms)))
}


fun2(apply(df[4:12],1,fun))
     true_1 false_1 true_2 false_2 true_3 false_3
[1,]      2       1     NA      NA     NA      NA
[2,]      2       2      4       3     NA      NA
[3,]      1       1      2       2     NA      NA
[4,]      2       0      3       2      5       3
[5,]      1       2      2       3      4       4
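The matrix above uses names such as true_1, while the question asks for one_true, two_true, and so on, attached to the original data. One possible way to bridge that gap is sketched below. To keep it self-contained, the result matrix is re-entered by hand and a one-column data frame stands in for df; the names res, words, ids, and df1 are illustrative assumptions.

```r
# Result matrix as printed above, re-entered by hand so this sketch runs alone
res <- matrix(c(2, 1, NA, NA, NA, NA,
                2, 2,  4,  3, NA, NA,
                1, 1,  2,  2, NA, NA,
                2, 0,  3,  2,  5,  3,
                1, 2,  2,  3,  4,  4),
              nrow = 5, byrow = TRUE,
              dimnames = list(NULL, c("true_1", "false_1", "true_2",
                                      "false_2", "true_3", "false_3")))

# Turn "true_1" into "one_true", "false_2" into "two_false", and so on
words <- c("one", "two", "three")
parts <- strsplit(colnames(res), "_", fixed = TRUE)
colnames(res) <- vapply(parts,
                        function(p) paste(words[as.integer(p[2])], p[1], sep = "_"),
                        character(1))

# Stand-in for the original data; with the real df, use cbind(df, res)
ids <- data.frame(id = 1:5)
df1 <- cbind(ids, res)
names(df1)
```

With the real data, the same renaming followed by cbind(df, res) should give the layout requested in the question, since the rows of the matrix are in the same order as the rows of df.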

