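I am trying to compute a rolling average in R, but I only have data for some of the dates in a given range. Rather than leaving the dates with no data out of the rolling average entirely, I want those dates to count as 0.
I created a sample data frame to demonstrate; my actual data set has many more dates and names. I am also trying to find a way to add every name to each day that gets added with 0.
Here is the code for the sample data frame: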
# Sample data: three people, six dates with gaps in between
Date <- as.Date(c('2021-01-01','2021-01-01','2021-01-01', '2021-01-03','2021-01-03','2021-01-03', '2021-01-04','2021-01-04','2021-01-04', '2021-02-02','2021-02-02','2021-02-02', '2021-02-03','2021-02-03','2021-02-03', '2021-03-01','2021-03-01','2021-03-01'))
Name <- c("John Smith", "Jane Peters", "Jim Clark","John Smith", "Jane Peters", "Jim Clark","John Smith", "Jane Peters", "Jim Clark","John Smith", "Jane Peters", "Jim Clark","John Smith", "Jane Peters", "Jim Clark","John Smith", "Jane Peters", "Jim Clark")
# Random whole hours from 0 to 13; values differ on each run unless set.seed() is called first
Hours <- floor(runif(18, min = 0, max = 14))
data.frame(Date, Name, Hours)
The data frame above looks like this:
Date Name Hours
1 2021-01-01 John Smith 11
2 2021-01-01 Jane Peters 9
3 2021-01-01 Jim Clark 6
4 2021-01-03 John Smith 7
5 2021-01-03 Jane Peters 1
6 2021-01-03 Jim Clark 9
7 2021-01-04 John Smith 2
8 2021-01-04 Jane Peters 4
9 2021-01-04 Jim Clark 10
10 2021-02-02 John Smith 2
11 2021-02-02 Jane Peters 1
12 2021-02-02 Jim Clark 3
13 2021-02-03 John Smith 8
14 2021-02-03 Jane Peters 7
15 2021-02-03 Jim Clark 0
16 2021-03-01 John Smith 11
17 2021-03-01 Jane Peters 6
18 2021-03-01 Jim Clark 8
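To illustrate, I want to add a row with Hours equal to 0 for each person on every day that has no data.
For the first few days of January the end result would look like this, continuing in the same way through to the end date in early March.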
1 2021-01-01 John Smith 11
2 2021-01-01 Jane Peters 9
3 2021-01-01 Jim Clark 6
4 2021-01-02 John Smith 0
5 2021-01-02 Jane Peters 0
6 2021-01-02 Jim Clark 0
7 2021-01-03 John Smith 7
8 2021-01-03 Jane Peters 1
9 2021-01-03 Jim Clark 9
10 2021-01-04 John Smith 2
11 2021-01-04 Jane Peters 4
12 2021-01-04 Jim Clark 10
13 2021-01-05 John Smith 0
14 2021-01-05 Jane Peters 0
15 2021-01-05 Jim Clark 0
That way, the days with no data will be included as 0 in my rolling average calculation.
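uj5u.com user reply:
It sounds like you are looking for the tidyr::complete function. With mydata standing for the data frame from the question, it would look like this: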
library(tidyr)
complete(mydata, Date=seq(min(Date), max(Date), by="1 day"), Name, fill=list(Hours=0))
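To make the effect of the call easier to see, here is a minimal, self-contained sketch on a toy data frame. The values and the name filled are made up for illustration and are not from the question; only the complete() call itself mirrors the answer.

```r
library(tidyr)

# Toy data (hypothetical values): 2021-01-02 is missing for both people
mydata <- data.frame(
  Date  = as.Date(c("2021-01-01", "2021-01-01", "2021-01-03", "2021-01-03")),
  Name  = c("John Smith", "Jane Peters", "John Smith", "Jane Peters"),
  Hours = c(11, 9, 7, 1)
)

# Expand to every Date x Name combination; rows that did not exist get Hours = 0
filled <- complete(
  mydata,
  Date = seq(min(Date), max(Date), by = "1 day"),
  Name,
  fill = list(Hours = 0)
)

# The rows that were added for the missing day
filled[filled$Date == as.Date("2021-01-02"), ]
```

Listing Name as a bare column is what makes each added day get one row per person, which is the second thing the question asks for.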
uj5u.com user reply:
Using only base R, you can merge the data with an expand.grid of every date and every name.
res <- merge(dat,
             expand.grid(
               Date=seq(as.Date('2021-01-01'), as.Date('2021-03-01'), 'days'),
               Name=unique(dat$Name)),
             all=TRUE)
res <- transform(res, Hours=replace(Hours, is.na(Hours), 0))
head(res)
# Date Name Hours
# 1 2021-01-01 Jane Peters 9
# 2 2021-01-01 Jim Clark 6
# 3 2021-01-01 John Smith 11
# 4 2021-01-02 Jane Peters 0
# 5 2021-01-02 Jim Clark 0
# 6 2021-01-02 John Smith 0
Data
dat <- structure(list(Date = structure(c(18628, 18628, 18628, 18630,
18630, 18630, 18631, 18631, 18631, 18660, 18660, 18660, 18661,
18661, 18661, 18687, 18687, 18687), class = "Date"), Name = c("John Smith",
"Jane Peters", "Jim Clark", "John Smith", "Jane Peters", "Jim Clark",
"John Smith", "Jane Peters", "Jim Clark", "John Smith", "Jane Peters",
"Jim Clark", "John Smith", "Jane Peters", "Jim Clark", "John Smith",
"Jane Peters", "Jim Clark"), Hours = c(11L, 9L, 6L, 7L, 1L, 9L,
2L, 4L, 10L, 2L, 1L, 3L, 8L, 7L, 0L, 11L, 6L, 8L)), row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18"), class = "data.frame")
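Once the missing days are filled with 0, the rolling average the question is after can be computed per person. The following is only a sketch in base R on toy values: the window length k, the helper roll_mean, and the column name RollMean are assumptions chosen for illustration, and stats::filter is one of several ways to get a trailing mean.

```r
# Toy data (hypothetical values): 2021-01-02 is missing for both people
dat <- data.frame(
  Date  = as.Date(c("2021-01-01", "2021-01-01", "2021-01-03", "2021-01-03")),
  Name  = c("John Smith", "Jane Peters", "John Smith", "Jane Peters"),
  Hours = c(11, 9, 7, 1)
)

# Fill in every Date x Name combination, following the merge/expand.grid approach
grid <- expand.grid(
  Date = seq(min(dat$Date), max(dat$Date), by = "day"),
  Name = unique(dat$Name),
  stringsAsFactors = FALSE
)
res <- merge(dat, grid, all = TRUE)
res$Hours[is.na(res$Hours)] <- 0

# Sort so that each person's rows are in date order
res <- res[order(res$Name, res$Date), ]

# Trailing mean over the current day and the k - 1 days before it;
# the first k - 1 days of each person have no full window and stay NA
k <- 2
roll_mean <- function(x, k) as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
res$RollMean <- ave(res$Hours, res$Name, FUN = function(x) roll_mean(x, k))

res
```

Calling stats::filter with its namespace avoids a clash with dplyr::filter if that package happens to be loaded. With these toy values, the filled-in day pulls the two-day mean for Jane Peters down to 4.5 on 2021-01-02, which is the behaviour the question wants instead of skipping the day.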
