Creating new variables based on a condition

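For each id, I want to know whether the value of the type column changes after the second row, and how many times it changes after the second row. For example, for id == 1 the type values are 1e, 1e, 2d, 2h. Moving past the second row, type changes from 1e to 2d and then from 2d to 2h. So there is a change, and the number of changes is 2.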

data<- data.frame(id= c(1, 1, 1, 1,  2, 2, 2, 2,  2, 2, 3, 3, 3,3 ,3,3,4,4,4, 5,5), 
                   type=c("1e","1e","2d","2h","1c","1c","1e","2d","2h","2j","1e",
                          "1e","2e","1e","1e","2h","1c","1c","1c", "1j","1j"))

Expected output:

id  type  change_of_type_after_2nd_row  count
1   1e    NA                            2
1   1e    NA                            2
1   2d    yes                           2
1   2h    yes                           2
2   1c    NA                            4
2   1c    NA                            4
2   1e    yes                           4
2   2d    yes                           4
2   2h    yes                           4
2   2j    yes                           4
3   1e    NA                            3
3   1e    NA                            3
3   2e    yes                           3
3   1e    yes                           3
3   1e    No                            3
3   2h    yes                           3
4   1c    NA                            0
4   1c    NA                            0
4   1c    No                            0
5   1j    NA                            0
5   1j    NA                            0

Could you please help?

Reply from a uj5u.com user:

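Group by 'id', then create a 'new' column holding the run-length id (rleid) of 'type' and a row-number column ('rn'). The 'count' is the number of distinct values of 'new' minus 1. The 'change_of_type..' column comes from case_when, with conditions based on the row number and on which values of 'new' are duplicated.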

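library(dplyr)
library(data.table)  # for rleid()

out <- data %>%
  group_by(id) %>%
  mutate(
    new = rleid(type),            # run-length id: increases whenever type changes
    rn = row_number(),            # row number within each id
    count = n_distinct(new) - 1,  # number of runs minus one = number of changes
    change_of_type_after_2nd_row = case_when(
      rn > 2 & duplicated(new) ~ 'No',  # row continues the current run
      rn > 2 ~ 'Yes'                    # row starts a new run
    )                                   # rows 1 and 2 match neither case, so they stay NA
  ) %>%
  ungroup() %>%
  select(-new)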

Output:

as.data.frame(out)
   id type rn count change_of_type_after_2nd_row
1   1   1e  1     2                         <NA>
2   1   1e  2     2                         <NA>
3   1   2d  3     2                          Yes
4   1   2h  4     2                          Yes
5   2   1c  1     4                         <NA>
6   2   1c  2     4                         <NA>
7   2   1e  3     4                          Yes
8   2   2d  4     4                          Yes
9   2   2h  5     4                          Yes
10  2   2j  6     4                          Yes
11  3   1e  1     3                         <NA>
12  3   1e  2     3                         <NA>
13  3   2e  3     3                          Yes
14  3   1e  4     3                          Yes
15  3   1e  5     3                           No
16  3   2h  6     3                          Yes
17  4   1c  1     0                         <NA>
18  4   1c  2     0                         <NA>
19  4   1c  3     0                           No
20  5   1j  1     0                         <NA>
21  5   1j  2     0                         <NA>
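
To see why this works, it may help to trace the intermediate values for a single group. The sketch below takes the type values for id 3 from the question and rebuilds the run id in base R. That stand-in is an assumption made here so the snippet needs no packages; the answer itself uses rleid from data.table. The names run_id and change are illustrative only.

```r
# Types for id == 3, copied from the question's data
type <- c("1e", "1e", "2e", "1e", "1e", "2h")

# Base R stand-in for rleid(): the id goes up by one
# every time a value differs from the one before it
run_id <- cumsum(c(TRUE, type[-1] != type[-length(type)]))

rn <- seq_along(type)                 # row number within the group
count <- length(unique(run_id)) - 1   # number of runs minus one

# Same rule as the case_when() in the answer: NA for the first two rows,
# "No" when the row continues the current run, "Yes" when it starts a new one
change <- ifelse(rn <= 2, NA_character_,
                 ifelse(duplicated(run_id), "No", "Yes"))

print(data.frame(type, run_id, rn, count, change))
```

The fifth row repeats run id 3, so duplicated() marks it and it gets "No", while the rows that open a new run get "Yes".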


Tags: r dplyr
