Write each row's value to every row up to x days ahead

I have the following dataset:

testdata <- data.frame(id = c(1,1,1,1,1,2,2,2,2),
                       date = as.Date(c("2020-01-01","2020-01-04","2020-01-06","2020-01-07","2020-01-10","2020-01-01","2020-01-02","2020-01-04","2020-01-05")),
                       duration = c(2,3,4,2,4,3,4,2,2),
                       product = c("A","B","C","A","C","B","C","A","A"))

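Each row holds a person's id, the product they started using on that date, and how many days the product lasts (duration). Update: in this sample every product happens to have a fixed duration, but in practice that need not be the case.

For each row I need the list of products the person currently has in use, so the resulting dataset should look like this (the separator here is "|", but that does not matter):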

data.frame(id=c(1,1,1,1,1,2,2,2,2), 
           date = as.Date(c("2020-01-01","2020-01-04","2020-01-06","2020-01-07","2020-01-10","2020-01-01","2020-01-02","2020-01-04","2020-01-05")),
           duration = c(2,3,4,2,4,3,4,2,2),
           product = c("A","B","C","A","C","B","C","A","A"),
           products_in_use = c("A","B","B | C", "A | B | C", "C", "B", "B | C", "A | B | C", "A | C"))

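Basically, I think that for each row I need to find all rows of the same person that fall within its duration (that is, at most that many days later) and append the current product to their lists. Then I would take the unique, sorted version of each list and write it out as a string. But I have no idea how to do the first step.

If all of this could work inside a dplyr pipe, that would be preferred.

A uj5u.com user replied:

I can't see an easy way to do this entirely within dplyr, because it relies on checking each row's date against the date plus duration of every other row. But if you first define this function: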

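get_products_in_use <- function(dates, durations, products)
{
  # Column i holds products[i] for every row whose date lies inside
  # row i's window [dates[i], dates[i] + durations[i]], and "" elsewhere.
  apply(sapply(seq_along(dates),
         function(i) {
           ifelse(test = dates >= dates[i] & dates <= dates[i] + durations[i],
                  yes  = products[i],
                  no   = "")
           }),
      # Per row: drop the blanks, then sort, de-duplicate and collapse.
      1, function(x) paste(unique(sort(x[nzchar(x)])), collapse = " | "))
}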
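The window check inside this function can be looked at on its own. The following is a minimal sketch in base R, using only the id 1 rows from the question; the name `covers` is introduced here for illustration and does not appear in the answer.

```r
# Rows for id 1, copied from the question's sample data
dates     <- as.Date(c("2020-01-01", "2020-01-04", "2020-01-06",
                       "2020-01-07", "2020-01-10"))
durations <- c(2, 3, 4, 2, 4)
products  <- c("A", "B", "C", "A", "C")

# covers[j, i] is TRUE when the date of row j lies inside the window
# [dates[i], dates[i] + durations[i]] opened by row i
covers <- sapply(seq_along(dates), function(i) {
  dates >= dates[i] & dates <= dates[i] + durations[i]
})

# Products in use on 2020-01-07 (row 4), before sorting and de-duplicating
in_use_row4 <- products[covers[4, ]]
# in_use_row4 is c("B", "C", "A")
```

Reading the matrix row by row gives, for each date, the products whose windows are still open, which is what the `apply(..., 1, ...)` step collapses into a string.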

Then it is easy to use inside a dplyr pipe:

library(dplyr)

testdata %>% 
  group_by(id) %>% 
  mutate(products_in_use = get_products_in_use(date, duration, product))
#> # A tibble: 9 x 5
#> # Groups:   id [2]
#>      id date       duration product products_in_use
#>   <dbl> <date>        <dbl> <chr>   <chr>          
#> 1     1 2020-01-01        2 A       A              
#> 2     1 2020-01-04        3 B       B              
#> 3     1 2020-01-06        4 C       B | C          
#> 4     1 2020-01-07        2 A       A | B | C      
#> 5     1 2020-01-10        4 C       C              
#> 6     2 2020-01-01        3 B       B              
#> 7     2 2020-01-02        4 C       B | C          
#> 8     2 2020-01-04        2 A       A | B | C      
#> 9     2 2020-01-05        2 A       A | C

Created on 2021-11-09 by the reprex package (v2.0.0)
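One possible variation, not part of the answer above: with `sapply` followed by `apply`, a group containing a single row yields a plain vector rather than a matrix, and `apply` then stops with an error. The sketch below loops over target rows with `vapply` instead, which always returns one string per row. The function name `get_products_in_use_rowwise` is hypothetical, and the grouping is done in base R here only to keep the example self-contained; it should also work inside the same `group_by(id) %>% mutate(...)` call.

```r
# Hypothetical alternative; the name is not from the original answer.
get_products_in_use_rowwise <- function(dates, durations, products) {
  vapply(seq_along(dates), function(j) {
    # Rows whose window [date, date + duration] contains the date of row j
    active <- dates <= dates[j] & dates[j] <= dates + durations
    paste(sort(unique(products[active])), collapse = " | ")
  }, character(1))
}

# Sample data from the question
testdata <- data.frame(
  id = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  date = as.Date(c("2020-01-01", "2020-01-04", "2020-01-06", "2020-01-07",
                   "2020-01-10", "2020-01-01", "2020-01-02", "2020-01-04",
                   "2020-01-05")),
  duration = c(2, 3, 4, 2, 4, 3, 4, 2, 2),
  product = c("A", "B", "C", "A", "C", "B", "C", "A", "A")
)

# Apply the function per id and write the results back in row order
result <- character(nrow(testdata))
for (g in unique(testdata$id)) {
  rows <- testdata$id == g
  result[rows] <- get_products_in_use_rowwise(testdata$date[rows],
                                              testdata$duration[rows],
                                              testdata$product[rows])
}
```

On the sample data `result` matches the `products_in_use` column requested in the question.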