I have a simple data frame that looks like this:
ID, Type, a, b, c, d, e, f, etc.
ob1, 1, 1, 2, 3, 4, 5, 6, etc.
ob1, 2, 3, 4, 5, 6, 7, 1, etc.
I need to add up the values of every 3 columns to produce new columns holding the sums. This would give the following output:
ID, Type, sum1, sum2, etc.
ob1, 1, 6, 15, etc.
ob1, 2, 12, 14, etc.
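Using sorting, I can do this manually for individual columns, but since I have many columns, how can I run this sum automatically for every 3 columns (after a set starting point)?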
Answer from a uj5u.com user:
You can do it like this with mutate, where df is your original data frame:
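library(dplyr)

# Add each group of three columns explicitly, then keep the ID columns and the sums
df %>%
  mutate(sum1 = a + b + c,
         sum2 = d + e + f) %>%
  select(ID, Type, sum1, sum2)  # list any further sum columns here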
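As a quick check of the mutate approach, here is a minimal sketch that runs it on the two sample rows from the question. The data frame below is rebuilt by hand from the question's table, and only columns a to f are included since the "etc." columns are not given.

```r
library(dplyr)

# Sample rows copied from the question (columns beyond f are unknown and omitted)
df <- data.frame(
  ID   = c("ob1", "ob1"),
  Type = c(1, 2),
  a = c(1, 3), b = c(2, 4), c = c(3, 5),
  d = c(4, 6), e = c(5, 7), f = c(6, 1)
)

# Sum each group of three columns by name
result <- df %>%
  mutate(sum1 = a + b + c,
         sum2 = d + e + f) %>%
  select(ID, Type, sum1, sum2)

print(result)
# sum1 is 6 and 12, sum2 is 15 and 14, matching the output asked for in the question
```

This works, but every group has to be typed out by hand, so it does not scale to many columns.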
Answer from a uj5u.com user:
In base R, you can do the following:
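# Drop the ID and Type columns; the remaining columns are summed in groups of three
num_cols <- df[-c(1:2)]
cbind(df[1:2], do.call(cbind,
  lapply(setNames(seq(1, length(num_cols), 3),
                  paste0("sum", seq(length(num_cols) / 3))), \(a) {
    # Strip commas, convert to numeric, and sum columns a to a + 2 for each row
    apply(num_cols[a:(a + 2)], 1, \(b) sum(as.numeric(gsub(",", "", b))))
  })))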
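Because the values contain commas, I use gsub to remove them,
setNames to name each new column dynamically,
and apply inside lapply to sum each row. The \(a) lambda shorthand requires R 4.1.0 or later.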
ID. Type. sum1 sum2
1 ob1, 1, 6 15
2 ob1, 2, 12 14
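To make the gsub and setNames steps in this answer more concrete, here is a small sketch of each in isolation. The raw vector and the column count n_cols are made-up illustration values, assuming (as the answer does) that the data was read in with trailing commas.

```r
# Values as they might look if the commas were read in as part of the data
raw <- c("1,", "2,", "3,")

# gsub removes the commas; as.numeric turns the remaining text into numbers
clean <- as.numeric(gsub(",", "", raw))
sum(clean)
# 6

# setNames attaches the future column names to the start position of each group
n_cols <- 6  # hypothetical number of numeric columns
starts <- setNames(seq(1, n_cols, 3), paste0("sum", seq(n_cols / 3)))
starts
# sum1 sum2
#    1    4
```

Because lapply keeps the names of its input, the list it returns is already named sum1, sum2, and so on, and cbind then uses those names as column names.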
Answer from a uj5u.com user:
library(tidyverse)

n_split <- 3
n_start <- 3

seq(n_split, ncol(df1), n_split) %>%
  map(~ select(df1[, -(1:(n_start - 1))], (. - (n_split - 1)):.)) %>%
  map_dfc(rowSums) %>%
  set_names(., nm = paste0("sum", seq(ncol(.)))) %>%
  bind_cols(df1[, 1:(n_start - 1)], .)
#> # A tibble: 5 x 5
# ID type sum1 sum2 sum3
# 1 ob1 1 6 15 24
# 2 ob2 2 36 45 54
# 3 ob3 3 66 75 84
# 4 ob4 4 96 105 114
# 5 ob5 5 126 135 144
Data:
df1 <- data.frame(ID = c("ob1", "ob2", "ob3", "ob4", "ob5"),
type = c(1, 2, 3, 4, 5),
a = c(1, 11, 21, 31, 41),
b = c(2, 12, 22, 32, 42),
c = c(3, 13, 23, 33, 43),
d = c(4, 14, 24, 34, 44),
e = c(5, 15, 25, 35, 45),
f = c(6, 16, 26, 36, 46),
g = c(7, 17, 27, 37, 47),
h = c(8, 18, 28, 38, 48),
i = c(9, 19, 29, 39, 49))
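As a cross-check that needs no packages, here is a rough base R sketch of the same grouping idea using rowsum on the transposed numeric columns. It is an alternative to the pipeline, not part of the original answer. It reuses the names n_split and n_start, and the group vector is a helper introduced here for illustration.

```r
# Same sample data as above, repeated so the block runs on its own
df1 <- data.frame(ID = c("ob1", "ob2", "ob3", "ob4", "ob5"),
                  type = c(1, 2, 3, 4, 5),
                  a = c(1, 11, 21, 31, 41), b = c(2, 12, 22, 32, 42),
                  c = c(3, 13, 23, 33, 43), d = c(4, 14, 24, 34, 44),
                  e = c(5, 15, 25, 35, 45), f = c(6, 16, 26, 36, 46),
                  g = c(7, 17, 27, 37, 47), h = c(8, 18, 28, 38, 48),
                  i = c(9, 19, 29, 39, 49))

n_split <- 3  # number of columns per group
n_start <- 3  # first column to include in the sums

# Numeric part of the data as a matrix (columns a to i here)
num_mat <- as.matrix(df1[n_start:ncol(df1)])

# Group label per column: 1, 1, 1, 2, 2, 2, 3, 3, 3
# If the column count is not a multiple of n_split, the last group is simply shorter
group <- ceiling(seq_len(ncol(num_mat)) / n_split)

# rowsum adds up rows by group, so transpose first and transpose back afterwards
sums <- t(rowsum(t(num_mat), group))
colnames(sums) <- paste0("sum", seq_len(ncol(sums)))

check <- cbind(df1[seq_len(n_start - 1)], sums)
print(check)
```

For df1 this gives sum1 = 6, 36, 66, 96, 126, sum2 = 15, 45, 75, 105, 135 and sum3 = 24, 54, 84, 114, 144.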
