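The MWE code below works as intended. In summary:
- The first data1 <- ... mutate(...) adds a new column "minusD", calculated as the sum of (i) the current row's "plusB" value and (ii) the prior row's "plusB" value if the id is unchanged from one row to the next (otherwise 0).
- The second data1 <- ... mutate(...) adds a "running_balance" column, which is a cumsum() of the calculation across all rows sharing the same id.
However, when I deploy this in fuller code, running the two data1 <- ... steps causes an error when I build another table that draws from this "data1" data frame. So how can I combine these two steps into one?
Output with the calculations explained: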
id  plusA  plusB  minusC  minusD  running_balance  [explanation of calculations]
1   3      5      10      5       -7               minusD = plusB; running balance = plusA + plusB - minusC - minusD
2   4      5      9       5       -5               same formulas as above, since id differs from the prior row's id
3   8      5      8       5       0                same formulas as above, since id differs from the prior row's id
3   1      4      7       9       -11              since id equals the prior row's id, minusD = plusB + prior row's plusB, and running balance = prior row's running balance + plusA + plusB - minusC - minusD
3   2      5      6       9       -19              same formulas as the row above, since id equals the prior row's id
5   3      6      5       6       -2               minusD = plusB; running balance = plusA + plusB - minusC - minusD
MWE code:
data <- data.frame(id = c(1,2,3,3,3,5),
                   plusA = c(3,4,8,1,2,3),
                   plusB = c(5,5,5,4,5,6),
                   minusC = c(10,9,8,7,6,5))
library(dplyr)
data1 <- subset(
  data %>% mutate(extra = case_when(id == lag(id) ~ lag(plusB), TRUE ~ 0)) %>%
    mutate(minusD = plusB + extra),
  select = -c(extra) # remove temporary calculation column
)
data1 <- data1 %>% group_by(id) %>% mutate(running_balance = cumsum(plusA + plusB - minusC - minusD))
Answer from a uj5u.com user:
You can keep chaining with %>% instead of creating temporary objects.
library(dplyr)
data %>%
  mutate(extra = case_when(id == lag(id) ~ lag(plusB), TRUE ~ 0),
         minusD = plusB + extra) %>%
  group_by(id) %>%
  mutate(running_balance = cumsum(plusA + plusB - minusC - minusD)) %>%
  ungroup() %>%
  select(-extra)
#     id plusA plusB minusC minusD running_balance
#  <dbl> <dbl> <dbl>  <dbl>  <dbl>           <dbl>
#1     1     3     5     10      5              -7
#2     2     4     5      9      5              -5
#3     3     8     5      8      5               0
#4     3     1     4      7      9             -11
#5     3     2     5      6      9             -19
#6     5     3     6      5      6              -2
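A possible variation on the chain above, offered as a minimal sketch rather than a definitive answer: if rows sharing an id are always adjacent (true for the sample data, but an assumption about the real data), grouping first lets lag(plusB, default = 0) supply the prior row's plusB within each id, so the temporary "extra" column is not needed. Assigning the result to data1 is also an assumption about what the later code in the question expects.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(id = c(1, 2, 3, 3, 3, 5),
                   plusA = c(3, 4, 8, 1, 2, 3),
                   plusB = c(5, 5, 5, 4, 5, 6),
                   minusC = c(10, 9, 8, 7, 6, 5))

data1 <- data %>%
  group_by(id) %>%
  # Within each id, lag() returns the prior row's plusB; the first row
  # of each id has no prior row, so default = 0 is used instead.
  mutate(minusD = plusB + lag(plusB, default = 0),
         running_balance = cumsum(plusA + plusB - minusC - minusD)) %>%
  ungroup()  # drop the grouping so later steps see a plain tibble

data1
```

Note that this compares rows within each id group rather than strictly against the previous row, so the two approaches would differ if the same id appeared in non-adjacent rows.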
