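I want to calculate the exposure time (in days) from INDX to DS, RE, or SE, whichever comes first. If all of (DS, RE, SE) are NA, the time should instead be calculated up to a fixed date (2015-01-01).
Data: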
DF<-tibble::tribble(
~ID, ~INDX, ~DS, ~RE, ~SE,
1L, "2001-01-01", "2002-02-02", "2003-03-03", NA,
2L, "2002-02-02", NA, "2001-01-01", "2002-02-02",
3L, "2003-03-03", "2009-09-09", NA, "2010-10-10",
4L, "2001-01-01", NA, NA, NA
)
DF%>%mutate_at(vars(2,3,4,5), as.Date)
# A tibble: 4 × 5
ID INDX DS RE SE
<int> <date> <date> <date> <date>
1 1 2001-01-01 2002-02-02 2003-03-03 NA
2 2 2002-02-02 NA 2001-01-01 2002-02-02
3 3 2003-03-03 2009-09-09 NA 2010-10-10
4 4 2001-01-01 NA NA NA
Expected output:
# A tibble: 4 × 6
ID INDX DS RE SE TIME
<int> <date> <date> <date> <date> <int>
1 1 2001-01-01 2002-02-02 2003-03-03 NA
2 2 2002-02-02 NA 2001-01-01 2002-02-02
3 3 2003-03-03 2009-09-09 NA 2010-10-10
4 4 2001-01-01 NA NA NA
What is the simplest way to do this?
Regards, H
Reply from a uj5u.com user:
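library(dplyr)
newDF <- DF %>% mutate_at(vars(2, 3, 4, 5), as.Date)
newDF %>%
  # earliest of the three event dates, ignoring NA (NA if all three are missing)
  mutate(time2use = pmin(DS, RE, SE, na.rm = TRUE)) %>%
  mutate(TIME = abs(INDX - time2use)) %>%
  # no event at all: fall back to the fixed date; ifelse() also drops the difftime class
  mutate(TIME = ifelse(is.na(TIME), abs(INDX - as.Date("2015-01-01")), TIME)) %>%
  select(-time2use)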
ID INDX DS RE SE TIME
<int> <date> <date> <date> <date> <dbl>
1 1 2001-01-01 2002-02-02 2003-03-03 NA 397
2 2 2002-02-02 NA 2001-01-01 2002-02-02 397
3 3 2003-03-03 2009-09-09 NA 2010-10-10 2382
4 4 2001-01-01 NA NA NA 5113
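Reply from a uj5u.com user:
One option is a nested ifelse condition like this. Note that it uses the first non-NA column in the order DS, RE, SE, not necessarily the earliest date: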
DF$TIME <- ifelse(!is.na(DF$DS), as.Date(DF$DS) - as.Date(DF$INDX),
ifelse(!is.na(DF$RE), as.Date(DF$RE) - as.Date(DF$INDX),
ifelse(!is.na(DF$SE), as.Date(DF$SE) - as.Date(DF$INDX), as.Date("2015-01-01") - as.Date(DF$INDX))))
Output:
DF
# A tibble: 4 x 6
ID INDX DS RE SE TIME
<int> <chr> <chr> <chr> <chr> <dbl>
1 1 2001-01-01 2002-02-02 2003-03-03 NA 397
2 2 2002-02-02 NA 2001-01-01 2002-02-02 -397
3 3 2003-03-03 2009-09-09 NA 2010-10-10 2382
4 4 2001-01-01 NA NA NA 5113
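The nested ifelse above picks the first non-missing date in column order and falls back to the fixed date. As a minimal sketch, assuming dplyr is available, the same logic can be written with coalesce(). The names censor_date and res_first are illustrative and not part of the original answer.

```r
library(dplyr)

DF <- tibble::tribble(
  ~ID, ~INDX,        ~DS,          ~RE,          ~SE,
  1L,  "2001-01-01", "2002-02-02", "2003-03-03", NA,
  2L,  "2002-02-02", NA,           "2001-01-01", "2002-02-02",
  3L,  "2003-03-03", "2009-09-09", NA,           "2010-10-10",
  4L,  "2001-01-01", NA,           NA,           NA
)

# Fixed end date from the question (the variable name is an assumption)
censor_date <- as.Date("2015-01-01")

res_first <- DF %>%
  mutate(across(!ID, as.Date)) %>%
  # coalesce() returns, per row, the first non-NA value among its arguments,
  # so censor_date is only used when DS, RE and SE are all NA
  mutate(TIME = as.integer(coalesce(DS, RE, SE, censor_date) - INDX))

res_first
```

Like the ifelse version, this keeps the negative value for ID 2, where RE falls before INDX. Unlike it, the date columns are converted once and stay as dates in the result.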
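Reply from a uj5u.com user:
If I understand correctly, you could try the following. For convenience, the conditions are put into case_when. If all values from DS through SE are NA, it uses your chosen date and subtracts INDX. Otherwise, it uses na.omit to drop the missing values collected with c_across and keeps the values in that row that fall on or after INDX. It takes the first of those (which is the minimum if the columns are in chronological order) and then subtracts INDX.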
library(tidyverse)
DF %>%
mutate(across(.cols = !ID, as.Date)) %>%
rowwise() %>%
mutate(TIME = case_when(
all(is.na(c_across(DS:SE))) ~ as.Date("2015-01-01") - INDX,
TRUE ~ na.omit(c_across(DS:SE))[na.omit(c_across(DS:SE)) >= INDX][1] - INDX))
Output
ID INDX DS RE SE TIME
<int> <date> <date> <date> <date> <drtn>
1 1 2001-01-01 2002-02-02 2003-03-03 NA 397 days
2 2 2002-02-02 NA 2001-01-01 2002-02-02 0 days
3 3 2003-03-03 2009-09-09 NA 2010-10-10 2382 days
4 4 2001-01-01 NA NA NA 5113 days
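A vectorized variant of the same idea, without rowwise(), might look like the sketch below. It is not the answer's method: it takes the earliest date on or after INDX rather than the first one in column order, and it falls back to the fixed date whenever no such date exists, including rows where every event precedes INDX (the rowwise version returns NA there). The helper on_or_after and the names censor_date, first_event and res_min are hypothetical.

```r
library(dplyr)

DF <- tibble::tribble(
  ~ID, ~INDX,        ~DS,          ~RE,          ~SE,
  1L,  "2001-01-01", "2002-02-02", "2003-03-03", NA,
  2L,  "2002-02-02", NA,           "2001-01-01", "2002-02-02",
  3L,  "2003-03-03", "2009-09-09", NA,           "2010-10-10",
  4L,  "2001-01-01", NA,           NA,           NA
)

# Fixed end date from the question (the variable name is an assumption)
censor_date <- as.Date("2015-01-01")

# Hypothetical helper: keep a date only if it is on or after the index date;
# earlier or missing dates become NA
on_or_after <- function(x, index) if_else(x >= index, x, as.Date(NA))

res_min <- DF %>%
  mutate(across(!ID, as.Date)) %>%
  mutate(
    # earliest qualifying event per row; NA when there is none
    first_event = pmin(on_or_after(DS, INDX),
                       on_or_after(RE, INDX),
                       on_or_after(SE, INDX),
                       na.rm = TRUE),
    # fall back to the fixed date, then count whole days from INDX
    TIME = as.integer(coalesce(first_event, censor_date) - INDX)
  ) %>%
  select(-first_event)

res_min
```

For the sample data this gives the same TIME values as the output above, as plain integers rather than a difftime column.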
Tags: r
