I have a df here (this is the desired output; my starting df does not have the Flag variable):
df <- data.frame(
Person = c('1','2','3'),
Date = as.Date(c('2010-09-30', '2012-11-20', '2015-03-11')),
Treatment_1 = as.Date(c('2010-09-30', '2012-11-21', '2015-03-22')),
Treatment_2 = as.Date(c('2011-09-30', NA, '2011-03-22')),
Treatment_3 = as.Date(c('2012-09-30', '2015-11-21', '2015-06-22')),
Surgery_1 = as.Date(c(NA, '2016-11-21', '2015-03-12')),
Surgery_2 = as.Date(c(NA, '2017-11-21', '2019-03-12')),
Surgery_3 = as.Date(c(NA, '2018-11-21', '2013-03-12')),
Flag = c('', 'Y', '')
)
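I want to derive the Flag variable based on these conditions:
- For any column starting with Treatment, if Date = Treatment, set Flag to ""
- For any column starting with Surgery, if Date = Surgery OR Date = Surgery + 1 OR Date = Surgery - 1 (basically, if the surgery date falls on Date, the day before, or the day after), set Flag to ""
- Otherwise set Flag = "Y"
I looked into mutate_at, but it overwrites the variables and assigns TRUE/FALSE values.
This is wrong, but here is my attempt: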
df2 <- df %>%
mutate(Flag = case_when(
vars(starts_with("Treatment"), Date == . ) ~ '',
vars(starts_with("Surgery"), Date == . | Date == . - 1 | Date == . + 1) ~ '',
TRUE ~ 'Y')
)
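uj5u.com user reply:
We can use rowwise together with c_across and any for each condition in case_when. For the Surgery columns, we can build a vector of Date plus the day after and the day before (Date + 1 and Date - 1) to match against.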
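library(tidyverse)
df %>%
  rowwise() %>%
  mutate(Flag = case_when(
    # any Treatment date equal to Date
    any(c_across(starts_with("Treatment")) == Date) ~ "",
    # any Surgery date on Date, the day after, or the day before
    any(c_across(starts_with("Surgery")) %in% c(Date, (Date + 1), (Date - 1))) ~ "",
    TRUE ~ "Y"
  )) %>%
  ungroup()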
Output
Person Date Treatment_1 Treatment_2 Treatment_3 Surgery_1 Surgery_2 Surgery_3 Flag
<chr> <date> <date> <date> <date> <date> <date> <date> <chr>
1 1 2010-09-30 2010-09-30 2011-09-30 2012-09-30 NA NA NA ""
2 2 2012-11-20 2012-11-21 NA 2015-11-21 2016-11-21 2017-11-21 2018-11-21 "Y"
3 3 2015-03-11 2015-03-22 2011-03-22 2015-06-22 2015-03-12 2019-03-12 2013-03-12 ""
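Row 2 has an NA in Treatment_2, so it may help to see how the NA moves through any() and case_when(). The snippet below is a minimal sketch using the row 2 values from the df above; the variable names (treatments, date, res, flag, res_rm) are only illustrative.

```r
library(dplyr)

# Row 2 of the example: Treatment_2 is missing
treatments <- as.Date(c("2012-11-21", NA, "2015-11-21"))
date <- as.Date("2012-11-20")

# No element matches and one comparison is NA, so any() returns NA, not FALSE
res <- any(treatments == date)

# case_when() treats an NA condition as not matched, so the fallback applies
flag <- case_when(res ~ "", TRUE ~ "Y")

# na.rm = TRUE makes the result an explicit FALSE
res_rm <- any(treatments == date, na.rm = TRUE)
```

Here res is NA, flag is "Y", and res_rm is FALSE. If you would rather not rely on case_when() skipping NA conditions, adding na.rm = TRUE inside any() is one option.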
uj5u.com user reply:
I like Andrew's approach, but I was already working on this when his answer came in, so in case you are interested:
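df %>% inner_join(
  pivot_longer(df, cols = Treatment_1:Surgery_3) %>%
    # "" when a Treatment date equals Date or a Surgery date is within one day of Date
    mutate(flag = case_when(
      (str_starts(name, "T") & value == Date) | (str_starts(name, "S") & abs(value - Date) <= 1) ~ "",
      TRUE ~ "Y")) %>%
    group_by(Person) %>%
    # "" sorts before "Y", so min() returns "" if any column matched
    summarize(flag = min(flag)),
  by = "Person"
)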
Output:
Person Date Treatment_1 Treatment_2 Treatment_3 Surgery_1 Surgery_2 Surgery_3 flag
1 1 2010-09-30 2010-09-30 2011-09-30 2012-09-30 <NA> <NA> <NA>
2 2 2012-11-20 2012-11-21 <NA> 2015-11-21 2016-11-21 2017-11-21 2018-11-21 Y
3 3 2015-03-11 2015-03-22 2011-03-22 2015-06-22 2015-03-12 2019-03-12 2013-03-12
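The summarize(flag = min(flag)) step works because the empty string sorts before "Y", so one matching column is enough to clear the flag for that person. A small sketch with made-up per-column flags (the vector names are only illustrative):

```r
# Per-column flags for one person: one column matched (""), the others did not
one_match <- c("Y", "", "Y")
no_match <- c("Y", "Y", "Y")

# min() on character vectors returns the value that sorts first
flag_one <- min(one_match)
flag_none <- min(no_match)
```

flag_one is "" and flag_none is "Y".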
uj5u.com user reply:
Here is another approach using across:
library(tidyverse)
df %>%
  # Compare row by row; %in% would match against every Date in the column
  mutate(across(starts_with("Treatment"), ~as.numeric(!is.na(.) & . == Date), .names = "new_{.col}"),
         across(starts_with("Surgery"), ~as.numeric(!is.na(.) & abs(as.numeric(. - Date)) <= 1), .names = "new_{.col}")) %>%
  # At least one match clears the flag
  mutate(Flag = ifelse(rowSums(select(., starts_with("new_"))) >= 1, "", "Y"), .keep = "used") %>%
  bind_cols(df)
Flag Person Date Treatment_1 Treatment_2 Treatment_3 Surgery_1 Surgery_2 Surgery_3
1 1 2010-09-30 2010-09-30 2011-09-30 2012-09-30 <NA> <NA> <NA>
2 Y 2 2012-11-20 2012-11-21 <NA> 2015-11-21 2016-11-21 2017-11-21 2018-11-21
3 3 2015-03-11 2015-03-22 2011-03-22 2015-06-22 2015-03-12 2019-03-12 2013-03-12
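One thing to watch with across without rowwise: x %in% Date tests membership in the whole Date column, while x == Date compares row by row. The sketch below uses two made-up rows (the values and names are assumptions for illustration) where the two give different answers:

```r
# Two hypothetical rows: the Treatment date of row 1 equals the Date of row 2
date <- as.Date(c("2010-09-30", "2012-11-20"))
treatment <- as.Date(c("2012-11-20", "2012-11-21"))

# Membership in the whole column: row 1 is TRUE because of row 2's Date
in_any_row <- treatment %in% date

# Element-wise comparison: neither row has Treatment equal to its own Date
same_row <- treatment == date
```

in_any_row is TRUE, FALSE while same_row is FALSE, FALSE, so the element-wise form is the one that matches the "Date = Treatment" condition from the question.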
