我正在嘗試對變陣列、型別和年份進行分組。每個組、型別和年份都有一個特定的代碼,每年都會發生變化。我想創建一個名為“差異”的列,如果組和型別在一年中的代碼為 200,下一年為 210,則“差異”列會將其注冊為增加 10。
group <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
type <- c("small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large")
year <- c(1995, 1995, 1995, 1995, 1995, 1995, 1995, 1995, 1995,
1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996,
1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997)
code <- c(100, 100, 100, 200, 200, 200, 300, 300, 300,
150, 150, 100, 200, 200, 200, 350, 320, 300,
130, 170, 90, 210, 90, 80, 310, 300, 320)
df <- data.frame(group, type, year, code)
這是 df 的樣子:
group type year code
1 A small 1995 100
2 A medium 1995 100
3 A large 1995 100
4 B small 1995 200
5 B medium 1995 200
6 B large 1995 200
7 C small 1995 300
8 C medium 1995 300
9 C large 1995 300
10 A small 1996 150
11 A medium 1996 150
12 A large 1996 100
13 B small 1996 200
14 B medium 1996 200
15 B large 1996 200
16 C small 1996 350
17 C medium 1996 320
18 C large 1996 300
19 A small 1997 130
20 A medium 1997 170
21 A large 1997 90
22 B small 1997 210
23 B medium 1997 90
24 B large 1997 80
25 C small 1997 310
26 C medium 1997 300
27 C large 1997 320
我想要以下輸出:
group <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
type <- c("small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large", "small", "medium", "large")
year <- c(1995, 1995, 1995, 1995, 1995, 1995, 1995, 1995, 1995,
1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996,
1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997)
code <- c(100, 100, 100, 200, 200, 200, 300, 300, 300,
150, 150, 100, 200, 200, 200, 350, 320, 300,
130, 170, 90, 210, 90, 80, 310, 300, 320)
difference <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
50, 50, 0, 0, 0, 0, 50, 20, 0,
-20, 20, -10, 10, 110, 120, -40, -20, 0)
df2 <- data.frame(group, type, year, code, difference)
group type year code difference
1 A small 1995 100 NA
2 A medium 1995 100 NA
3 A large 1995 100 NA
4 B small 1995 200 NA
5 B medium 1995 200 NA
6 B large 1995 200 NA
7 C small 1995 300 NA
8 C medium 1995 300 NA
9 C large 1995 300 NA
10 A small 1996 150 50
11 A medium 1996 150 50
12 A large 1996 100 0
13 B small 1996 200 0
14 B medium 1996 200 0
15 B large 1996 200 0
16 C small 1996 350 50
17 C medium 1996 320 20
18 C large 1996 300 0
19 A small 1997 130 -20
20 A medium 1997 170 20
21 A large 1997 90 -10
22 B small 1997 210 10
23 B medium 1997 90 110
24 B large 1997 80 120
25 C small 1997 310 -40
26 C medium 1997 300 -20
27 C large 1997 320 0
這是我嘗試過的:
df3 <- df2 %>%
group_by(group, type, year) %>%
mutate(difference = code - lag(code))
問題是滯后似乎沒有考慮分組,而只是從它之前的行中減去。有什么建議?
uj5u.com熱心網友回復:
正如@IRTFM 已經指出的那樣。僅按group和分組type。它提供幾乎相同的輸出。注意最后一行是不同的。
library(dplyr)
df %>%
group_by(group, type) %>%
mutate(difference= code - lag(code)) %>%
data.frame()
group type year code difference
1 A small 1995 100 NA
2 A medium 1995 100 NA
3 A large 1995 100 NA
4 B small 1995 200 NA
5 B medium 1995 200 NA
6 B large 1995 200 NA
7 C small 1995 300 NA
8 C medium 1995 300 NA
9 C large 1995 300 NA
10 A small 1996 150 50
11 A medium 1996 150 50
12 A large 1996 100 0
13 B small 1996 200 0
14 B medium 1996 200 0
15 B large 1996 200 0
16 C small 1996 350 50
17 C medium 1996 320 20
18 C large 1996 300 0
19 A small 1997 130 -20
20 A medium 1997 170 20
21 A large 1997 90 -10
22 B small 1997 210 10
23 B medium 1997 90 -110
24 B large 1997 80 -120
25 C small 1997 310 -40
26 C medium 1997 300 -20
27 C large 1997 320 20
uj5u.com熱心網友回復:
你可以diff在ave.
dat[order(dat$group, dat$type), ] |>
transform(diff=ave(code, group, type, FUN=\(x) c(NA, diff(x)))) |>
(\(x) x[order(as.numeric(rownames(x))), ])() ## optional, to reorder rows
# group type year code diff
# 1 A small 1995 100 NA
# 2 A medium 1995 100 NA
# 3 A large 1995 100 NA
# 4 B small 1995 200 NA
# 5 B medium 1995 200 NA
# 6 B large 1995 200 NA
# 7 C small 1995 300 NA
# 8 C medium 1995 300 NA
# 9 C large 1995 300 NA
# 10 A small 1996 150 50
# 11 A medium 1996 150 50
# 12 A large 1996 100 0
# 13 B small 1996 200 0
# 14 B medium 1996 200 0
# 15 B large 1996 200 0
# 16 C small 1996 350 50
# 17 C medium 1996 320 20
# 18 C large 1996 300 0
# 19 A small 1997 130 -20
# 20 A medium 1997 170 20
# 21 A large 1997 90 -10
# 22 B small 1997 210 10
# 23 B medium 1997 90 -110
# 24 B large 1997 80 -120
# 25 C small 1997 310 -40
# 26 C medium 1997 300 -20
# 27 C large 1997 320 20
注: R >= 4.1 使用
轉載請註明出處,本文鏈接:https://www.uj5u.com/net/394664.html
上一篇:在R中用亂數替換字符
