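I want to calculate whether an individual survived from one year to the next. 0 means it died, 1 means it survived. The dataset covers several years (2007 to 2020), and the calculation should start in 2008. I only want R to use part of the data I have.
My dataset looks like this:
The first 17 rows of my dataset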
> ID 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0
3 0 1 1 1 0 0 0 0 0 0 0 0 0 0
4 0 1 1 1 0 0 0 0 0 0 0 0 0 0
9 0 1 0 0 0 0 0 0 0 0 0 0 0 0
24 0 0 1 1 1 1 1 1 1 1 1 1 1 0
...
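I have 1,121 entries and 16 columns in total.
I want R to start at the first row in 2008 and check whether there is a 1. If there is a 1, I want R to look at the next column (2009) and check whether there is again a 1 (which should give me a 1 as output) or a 0 (which should give me a 0 as output). If there is no 1, I want R to move on to the next column until it finds a year with a 1, and then check the following column as described above. Once it has found a 1 and checked the following column, it should ignore the remaining columns, move to the next row, and repeat the procedure. The output should be saved in a new column.
I tried a for loop with if/else statements, as well as ifelse, if ...
The closest I got to my goal was with the following code: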
for (x in foal_fates_2) {
if (foal_fates_2$`2008`=="1" && foal_fates_2$`2009` =="1") {
print("1")
} else if (foal_fates_2$`2008`== "1" && foal_fates_2$`2009` =="0") {
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="1" && foal_fates_2$`2010` == "1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="1" && foal_fates_2$`2010`== "0") {
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="1" &&
foal_fates_2$`2011`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="1" &&
foal_fates_2$`2011`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="1" && foal_fates_2$`2012`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="1" && foal_fates_2$`2012`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="1" && foal_fates_2$`2013`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="1" && foal_fates_2$`2013`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="1" &&
foal_fates_2$`2014`== "1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="1" &&
foal_fates_2$`2014`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "1" && foal_fates_2$`2015`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "1" && foal_fates_2$`2015`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="1" && foal_fates_2$`2016` =="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="1" && foal_fates_2$`2016` =="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="1" &&
foal_fates_2$`2017`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="1" &&
foal_fates_2$`2017`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="1" && foal_fates_2$`2018`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="1" && foal_fates_2$`2018`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="1" && foal_fates_2$`2019`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="1" && foal_fates_2$`2019`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="0" && foal_fates_2$`2019`=="1" &&
foal_fates_2$`2020`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="0" && foal_fates_2$`2019`=="1" &&
foal_fates_2$`2020`=="0"){
print("0")
}
}
With this code R at least does something, and the result has the correct number of entries, but the output is not correct. R gives me 0s and 1s, but not in the right places. For example, for the first five rows R gave me "0" "0" "0" "1" "0", but it should be "0" "1" "1" "1" "0", at least if I understand it correctly. I am new to R, so maybe a for loop and if/else are not the right tools for what I want to do. So the question is: how can I reach my goal? I would really appreciate any help.
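Reply from a uj5u.com user:
I would write a function that is applied to each row. Something like the following (it could certainly be more elaborate, but it should do the job):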
numberAfterFirstOne <- function(myRow){
  x <- which(myRow == 1)[1] # index of the first 1; NA if the row contains no 1
  if (!is.na(x) && x + 1 <= length(myRow)) # is there a value after the first 1?
    return(myRow[x + 1])
  else
    return(NA)
}
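Explanation:
- Find which indices are equal to 1 and keep only the first one; if there is no 1, x will be NA.
- If there is a value after the first 1, return it.
- Otherwise return NA (this could also be 0 or any other "sentinel value" you like).
To test this, here is an example dataset: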
n <- 5
m <- 16
set.seed(1562) # for reproducibility
dataset <- as.data.frame(matrix(ncol = m, nrow = n, data = round(runif(m * n, 0, 0.7))))
dataset <- rbind(dataset, rep(0, 16))
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16
1 1 0 0 1 0 0 0 1 0 1 0 1 0 0 1 0
2 1 1 0 0 0 1 1 0 0 1 1 0 0 0 1 0
3 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1
4 1 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0
5 0 1 1 0 0 1 0 1 0 1 0 1 0 0 1 0
6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Then apply the function numberAfterFirstOne to each row (apply is similar to a for loop, but is tidier to write and read).
apply(dataset, 1, numberAfterFirstOne)
[1] 0 1 0 0 1 NA
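This is equivalent to the clunkier construct with a for loop:
result <- c()
for (i in seq_len(nrow(dataset))){
  # unlist() turns the one-row data frame into a plain named vector
  result[i] <- numberAfterFirstOne(unlist(dataset[i, ]))
}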
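To connect this back to the question, here is a minimal sketch that applies the same idea to the five rows shown there and stores the result in a new column. The data frame name foal_fates_2 comes from the question; the helper value_after_first_one and the column name survived are assumptions made for illustration. The ID and 2007 columns are left out of the search, since the calculation should start in 2008.

```r
# The five rows shown in the question, rebuilt by hand for illustration
values <- matrix(c(
  0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
), nrow = 5, byrow = TRUE, dimnames = list(NULL, as.character(2007:2020)))

# check.names = FALSE keeps "2007", "2008", ... as column names
foal_fates_2 <- data.frame(ID = c(1, 3, 4, 9, 24), values, check.names = FALSE)

# Hypothetical helper: value in the column right after the first 1.
# Returns NA if the row has no 1 or if the first 1 is in the last column.
value_after_first_one <- function(row) {
  first <- which(row == 1)[1]
  unname(row[first + 1])
}

# Search only the year columns from 2008 onwards
year_cols <- as.character(2008:2020)
foal_fates_2$survived <- apply(foal_fates_2[, year_cols], 1, value_after_first_one)

print(foal_fates_2[, c("ID", "survived")])
```

For these five rows the new column holds 0, 1, 1, 0, 1: IDs 1 and 9 have a 1 in 2008 followed by a 0 in 2009, while IDs 3, 4 and 24 have their first 1 followed by another 1.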
You could now tweak the function to return what you want. At the moment it can return 0, 1, or NA; maybe you want only 1 and 0, or only 1 and NA. The bounds check in the if statement is not strictly necessary, because myRow[x + 1] already returns NA when the index is out of bounds (or when x is NA), which would make the function even simpler.
You could also modify the code so that the year is returned as well:
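A minimal sketch of that simplification, together with one possible tweak that reports 0 instead of NA. Both function names are hypothetical and chosen only for this illustration; the sketch relies on the fact that indexing a vector with NA or past its end yields NA in R.

```r
# Simplified version: no explicit bounds check needed
after_first_one_simple <- function(myRow) {
  x <- which(myRow == 1)[1]  # NA if there is no 1 in the row
  unname(myRow[x + 1])       # NA if x is NA or x + 1 is past the end
}

# Hypothetical variant that returns 0 wherever the simple version returns NA
after_first_one_or_zero <- function(myRow) {
  value <- after_first_one_simple(myRow)
  if (is.na(value)) 0 else value
}

after_first_one_simple(c(0, 1, 1, 0))   # 1
after_first_one_simple(c(0, 0, 0, 1))   # NA: the first 1 is in the last position
after_first_one_simple(c(0, 0, 0, 0))   # NA: the row contains no 1
after_first_one_or_zero(c(0, 0, 0, 0))  # 0
```

Whether NA or 0 is the better choice depends on whether "never observed" should be distinguishable from "did not survive" in the later analysis.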
colnames(dataset) <- 2007:(2007 + ncol(dataset) - 1) # name the columns of the example dataset (16 columns, so 2007 to 2022)
numberAfterFirstOne <- function(myRow){
  x <- which(myRow == 1)[1]
  return(c(x, myRow[x + 1])) # return the column index and the value after it
}
result <- apply(dataset, 1, numberAfterFirstOne) # save the result
result[1, ] <- names(dataset)[result[1, ]] # replace the column index with the name of the dataset column
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "2007" "2007" "2012" "2007" "2008" NA
[2,] "0" "1" "0" "0" "1" NA
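Because the column names are written into the matrix, everything in it becomes a character string. If a numeric result with one row per individual is more convenient, a sketch along the following lines may help. The small example data frame, the helper first_one_summary, and the column names first_year and next_value are all assumptions made for illustration.

```r
# Small hypothetical example: three individuals, three years
example <- data.frame(
  `2007` = c(1, 0, 0),
  `2008` = c(0, 1, 0),
  `2009` = c(1, 1, 0),
  check.names = FALSE  # keep the years as column names
)

# Return the year of the first 1 and the value in the following year,
# both as numbers (NA where there is no 1 or no following year)
first_one_summary <- function(myRow) {
  x <- which(myRow == 1)[1]
  c(as.numeric(names(myRow)[x]), unname(myRow[x + 1]))
}

# apply() returns one column per row of the input, so transpose the result
summary_df <- as.data.frame(t(apply(example, 1, first_one_summary)))
colnames(summary_df) <- c("first_year", "next_value")

print(summary_df)
```

Each row of summary_df then corresponds to one row of the input, and both columns stay numeric, so they can be used directly in further calculations.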
