Select columns that match a vector and use their contents to build an ifelse condition

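I have a dataset covering several diseases, where 0 means the person does not have the disease and 1 means they do.

To illustrate: I am interested in disease A, and in whether each person in the dataset has it on its own or as a consequence of another disease. I therefore want to create a new variable, "Type", with the values "NotDiseasedWithA", "Primary" and "Secondary". The diseases that can cause A are listed in the vector "SecondaryCauses":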

SecondaryCauses = c("DiseaseB", "DiseaseD")

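"NotDiseasedWithA" means the person does not have disease A. "Primary" means they have disease A but none of the diseases known to cause it. "Secondary" means they have disease A together with at least one disease that can cause it.

Sample data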

ID  DiseaseA    DiseaseB    DiseaseC    DiseaseD    DiseaseE
1   0           1           0           0           0
2   1           0           0           0           1
3   1           0           1           1           0
4   1           0           1           1           1
5   0           0           0           0           0

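My questions are:

How do I select the columns I am interested in? I have more than 20 columns, in no particular order, which is why I created the vector.
How do I build a condition based on the contents of those disease columns?

I tried something like the following, but it does not work: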

DF %>% mutate(Type = ifelse(DiseaseA == 0, "NotDiseasedWithA", ifelse(sum(names(DF) %in% SecondaryCauses) > 0, "Secondary", "Primary")))

So in the end I would like to get this result:

ID  DiseaseA    DiseaseB    DiseaseC    DiseaseD    DiseaseE    Type
1   0           1           0           0           0           NotDiseasedWithA
2   1           0           0           0           1           Primary
3   1           0           1           1           0           Secondary
4   1           0           1           1           1           Secondary
5   0           0           0           0           0           NotDiseasedWithA

Answer from a uj5u.com user:

Using data.table:

df <- structure(list(ID = 1:5, DiseaseA = c(0L, 1L, 1L, 1L, 0L), DiseaseB = c(1L, 
0L, 0L, 0L, 0L), DiseaseC = c(0L, 0L, 1L, 1L, 0L), DiseaseD = c(0L, 
0L, 1L, 1L, 0L), DiseaseE = c(0L, 1L, 0L, 1L, 0L)), row.names = c(NA, 
-5L), class = c("data.frame"))

library(data.table)

setDT(df) # make it a data.table

SecondaryCauses = c("DiseaseB", "DiseaseD")

df[DiseaseA == 0, Type := "NotDiseasedWithA"][DiseaseA == 1, Type := ifelse(rowSums(.SD) > 0, "Secondary", "Primary"), .SDcols = SecondaryCauses]

df

#    ID DiseaseA DiseaseB DiseaseC DiseaseD DiseaseE             Type
# 1:  1        0        1        0        0        0 NotDiseasedWithA
# 2:  2        1        0        0        0        1          Primary
# 3:  3        1        0        1        1        0        Secondary
# 4:  4        1        0        1        1        1        Secondary
# 5:  5        0        0        0        0        0 NotDiseasedWithA
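
As for why the attempt in the question does not work: `sum(names(DF) %in% SecondaryCauses)` counts how many column names match the vector, so it yields a single number for the whole data frame and never looks at the values in any row. With the sample data that number is 2, so every row with DiseaseA == 1 ends up as "Secondary", including ID 2. The base R sketch below contrasts the two quantities; the names `name_matches` and `row_matches` are only illustrative and are not part of the original post.

```r
# Sample data from the question
DF <- data.frame(
  ID       = 1:5,
  DiseaseA = c(0L, 1L, 1L, 1L, 0L),
  DiseaseB = c(1L, 0L, 0L, 0L, 0L),
  DiseaseC = c(0L, 0L, 1L, 1L, 0L),
  DiseaseD = c(0L, 0L, 1L, 1L, 0L),
  DiseaseE = c(0L, 1L, 0L, 1L, 0L)
)
SecondaryCauses <- c("DiseaseB", "DiseaseD")

# Counts matching column NAMES: one number for the whole data frame
name_matches <- sum(names(DF) %in% SecondaryCauses)

# Counts positive VALUES in those columns: one number per row
row_matches <- rowSums(DF[, SecondaryCauses, drop = FALSE])

print(name_matches)
print(row_matches)
```

The per-row count is what the condition needs, which is why the data.table answer uses `rowSums(.SD)` restricted by `.SDcols`.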

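Since the question is tagged dplyr, the same logic could also be written without data.table. This is a minimal sketch rather than part of the original answer; it assumes dplyr 1.0.4 or later (for `if_any()`), and the name `result` is only illustrative.

```r
library(dplyr)

# Sample data from the question
DF <- data.frame(
  ID       = 1:5,
  DiseaseA = c(0L, 1L, 1L, 1L, 0L),
  DiseaseB = c(1L, 0L, 0L, 0L, 0L),
  DiseaseC = c(0L, 0L, 1L, 1L, 0L),
  DiseaseD = c(0L, 0L, 1L, 1L, 0L),
  DiseaseE = c(0L, 1L, 0L, 1L, 0L)
)
SecondaryCauses <- c("DiseaseB", "DiseaseD")

result <- DF %>%
  mutate(Type = case_when(
    # No disease A at all
    DiseaseA == 0 ~ "NotDiseasedWithA",
    # Disease A plus at least one of the listed causes in the same row
    if_any(all_of(SecondaryCauses), ~ .x == 1) ~ "Secondary",
    # Disease A without any listed cause
    TRUE ~ "Primary"
  ))

print(result)
```

`all_of()` selects the columns by name from the character vector, so the columns do not need to be adjacent or sorted, and `case_when()` evaluates its conditions in order, so the DiseaseA == 0 rows are labelled before the cause columns are checked.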

Tags: r, if-statement, dplyr
