I have a dataset with a date variable and a numeric variable, and there are multiple rows per ID. It looks like this:
| ID | Date | X |
|---|---|---|
| 1 | 2019-01-01 | 3 |
| 1 | 2018-12-01 | 4 |
| 1 | 2017-11-01 | 1 |
| 1 | 2017-10-01 | 2 |
| 1 | 2017-09-01 | 2 |
| 1 | 2017-08-01 | 2 |
For each row, I need to sum x from that row's date back to six months earlier, so I tried this:
library(dplyr)
library(lubridate)
data %>%
  mutate(semester = semester(fecha_inicio, with_year = TRUE)) %>%
  group_by(ID, semester) %>%
  mutate(sum_semester = sum(x, na.rm = TRUE))
But that is not what I need, because 2019-01-01 gets 3 instead of 14.
Please help.
A uj5u.com user replied:
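I found an answer here that computes, for every row, the cumulative sum from one month earlier up to the current date. Adapting that code:
library(tidyverse)
library(lubridate)
# Assumes `date` is of class Date. The window is the row's month plus the five before it.
# %m-% is used instead of `-` so that month-end dates do not produce NA.
data <- data %>%
  group_by(ID) %>%
  mutate(sum_6m = map_dbl(seq_len(n()),
                          ~ sum(x[date >= (date[.x] %m-% months(5)) & date <= date[.x]],
                                na.rm = TRUE))) %>%
  ungroup()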
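To make the rolling window concrete, here is a minimal base-R sketch of the same idea run on the sample rows from the question. The helper names `months_back` and `rolling_sum` are hypothetical (they do not come from the answer), and the month arithmetic assumes first-of-month dates like the ones posted.

```r
# Sample data from the question (dates stored as Date, not character)
df <- data.frame(
  ID   = 1,
  date = as.Date(c("2019-01-01", "2018-12-01", "2017-11-01",
                   "2017-10-01", "2017-09-01", "2017-08-01")),
  x    = c(3, 4, 1, 2, 2, 2)
)

# Hypothetical helper: shift one Date back by `n` calendar months (base R only).
# Reliable for first-of-month dates; month-end dates need more care.
months_back <- function(d, n) {
  seq(d, by = paste0("-", n, " months"), length.out = 2)[2]
}

# Hypothetical helper: for each row, sum x over [date - n_months, date]
rolling_sum <- function(date, x, n_months = 5) {
  vapply(seq_along(date), function(i) {
    start <- months_back(date[i], n_months)
    sum(x[date >= start & date <= date[i]], na.rm = TRUE)
  }, numeric(1))
}

# Apply the helper separately within each ID
df$sum_6m <- NA_real_
for (id in unique(df$ID)) {
  rows <- df$ID == id
  df$sum_6m[rows] <- rolling_sum(df$date[rows], df$x[rows])
}
print(df)
```

With the dates exactly as posted, `sum_6m` comes out as 7, 4, 7, 6, 4, 2. The 2019-01-01 row gets 7 rather than 14 because only the 2018-12-01 row falls inside its window; the 2017 rows are more than a year older.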
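A uj5u.com user replied:
Using the classic approach of aggregating the dataset (x ~ id) with a function such as sum, applied to filtered rows, it could look like the code below.
Code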
library(lubridate)
# Define your data
data <-
"id date x
1 2019-01-01 3
1 2018-12-01 4
1 2017-11-01 1
1 2017-10-01 2
1 2018-02-12 2
1 2017-09-01 2
"
# Read the table
tab <- read.csv(text = data, header = TRUE, sep = ' ')
# Find the most recent date
top.date <- max(as.Date(tab$date))
# Calculate the threshold: 6 months before the most recent date
thresh <- top.date %m-% months(6)
# Calculate the sums over the ID's after the point date
after.thresh <- aggregate(x ~ id,
data = tab[as.Date(tab$date) >= thresh,],
FUN = sum)
# Calculate the sums over the ID's before the point date
before.thresh <- aggregate(x ~ id,
data = tab[as.Date(tab$date) < thresh,],
FUN=sum)
# Print the dates
cat("TOP.DATE.IS:", format_ISO8601(top.date),
" THRESH.DATE.IS:", format_ISO8601(thresh),"\n")
# Print the sums (before.thresh holds the rows dated earlier than the threshold)
cat("SUM.BEFORE.THRESH:", before.thresh$x,
"SUM.AFTER.THRESH:", after.thresh$x,"\n")
Result
TOP.DATE.IS: 2019-01-01 THRESH.DATE.IS: 2018-07-01
SUM.BEFORE.THRESH: 7 SUM.AFTER.THRESH: 7
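The code above applies one cutoff, taken from the latest date in the whole table. If each ID should instead get a cutoff relative to its own latest date, a base-R sketch along these lines might work. The id 2 rows are invented purely for illustration, and the `-6 months` step via `seq()` is assumed to be used with first-of-month dates.

```r
# Illustrative data: the id 1 rows follow the question, the id 2 rows are made up
tab <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  date = as.Date(c("2019-01-01", "2018-12-01", "2017-11-01",
                   "2018-06-01", "2018-01-15")),
  x    = c(3, 4, 1, 5, 6)
)

# For each id: find its latest date, step back six calendar months,
# then sum x on either side of that cutoff
result <- do.call(rbind, lapply(split(tab, tab$id), function(g) {
  top    <- max(g$date)
  thresh <- seq(top, by = "-6 months", length.out = 2)[2]
  data.frame(
    id         = g$id[1],
    top_date   = top,
    thresh     = thresh,
    sum_after  = sum(g$x[g$date >= thresh], na.rm = TRUE),
    sum_before = sum(g$x[g$date <  thresh], na.rm = TRUE)
  )
}))
print(result)
```

Here `split()` keeps the per-ID logic in one place, and an ID with no rows before its cutoff gets a `sum_before` of 0 instead of being dropped, which is what `aggregate()` would do with an empty subset.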
Please credit the source when reposting. Link to this article: https://www.uj5u.com/qiye/404924.html
