I have two datasets. The first is
issue_1_t1 <- c(10, 20, 30, 40)
issue_2_t1 <- c(10, 20, 30, 10)
issue_1_t2 <- c(10, 20, 30, 40)
issue_2_t2 <- c(10, 20, 30, 10)
issue_1_t3 <- c(10, 20, 30, 40)
issue_2_t3 <- c(10, 20, 30, 10)
area <- c(area1, area2, area3, area4)
area2 <- c(area10, area20, area30, area40)
df <- data.frame(issue_1_t1, issue_2_t1, issue_1_t2, issue_2_t2, issue_1_t3, issue_2_t3)
I would like to reshape these so that they form the following:
area area2 issue1 issue2
area1 area10 10 10
area2 area20 20 20
area3 area30 30 30
area4 area40 40 40
area1 area10 10 10
area2 area20 20 20
area3 area30 30 30
area4 area40 40 40
area1 area10 10 10
area2 area20 20 20
area3 area30 30 30
area4 area40 40 40
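So far, the only thing I have managed is to split the dataset into two datasets by time period and then stack them on top of each other. I would like to know whether there is a more efficient way that needs only one line of code.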
Answer from a uj5u.com user:
You could try
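library(dplyr)
library(reshape2)
library(stringr)  # provides str_sub() and str_split()
library(tidyr)
df %>%
  melt %>%
  # time = last character of the column name; issue = "issue1" / "issue2"
  mutate(time = str_sub(variable, -1),
         issue = paste0("issue", str_split(variable, "_", simplify = TRUE)[, 2])) %>%
  select(time, issue, value) %>%
  group_by(issue) %>%
  arrange(issue) %>%
  # running row number within each issue
  mutate(n = 1, n = cumsum(n)) %>%
  pivot_wider(values_from = value, names_from = issue) %>%
  # area and area2 are vectors defined outside df, repeated to fill every row
  mutate(area = rep(area, max(n)/length(area)), area2 = rep(area2, max(n)/length(area2))) %>%
  select(-time, -n)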
   issue1 issue2 area  area2
    <dbl>  <dbl> <chr> <chr>
 1     10     10 area1 area10
 2     20     20 area2 area20
 3     30     30 area3 area30
 4     40     10 area4 area40
 5     10     10 area1 area10
 6     20     20 area2 area20
 7     30     30 area3 area30
 8     40     10 area4 area40
 9     10     10 area1 area10
10     20     20 area2 area20
11     30     30 area3 area30
12     40     10 area4 area40
Answer from a uj5u.com user:
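I think the data you shared is incomplete and contains syntax errors. I don't understand what you mean by two datasets (when there is only one, df), but I think what you have is something like this: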
issue_1_t1 <- c(10, 20, 30, 40)
issue_2_t1 <- c(10, 20, 30, 10)
issue_1_t2 <- c(10, 20, 30, 40)
issue_2_t2 <- c(10, 20, 30, 10)
issue_1_t3 <- c(10, 20, 30, 40)
issue_2_t3 <- c(10, 20, 30, 10)
area <- c("area1", "area2", "area3", "area4")
area2 <- c("area10", "area20", "area30", "area40")
df <- data.frame(area, area2, issue_1_t1, issue_2_t1, issue_1_t2,
                 issue_2_t2, issue_1_t3, issue_2_t3)
df
#   area  area2 issue_1_t1 issue_2_t1 issue_1_t2 issue_2_t2 issue_1_t3 issue_2_t3
#1 area1 area10         10         10         10         10         10         10
#2 area2 area20         20         20         20         20         20         20
#3 area3 area30         30         30         30         30         30         30
#4 area4 area40         40         10         40         10         40         10
You can use pivot_longer from tidyr to get it into the desired shape.
tidyr::pivot_longer(df,
                    cols = starts_with('issue'),
                    names_to = '.value',
                    names_pattern = '(issue_\\d+)')
#   area  area2  issue_1 issue_2
#   <chr> <chr>    <dbl>   <dbl>
# 1 area1 area10      10      10
# 2 area1 area10      10      10
# 3 area1 area10      10      10
# 4 area2 area20      20      20
# 5 area2 area20      20      20
# 6 area2 area20      20      20
# 7 area3 area30      30      30
# 8 area3 area30      30      30
# 9 area3 area30      30      30
#10 area4 area40      40      10
#11 area4 area40      40      10
#12 area4 area40      40      10
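If the row order from the question matters (all areas for t1, then all areas for t2, and so on), one possible variation is to capture the time suffix in its own column, sort on it, and then drop it. This is only a sketch, assuming the column names always follow the issue_<n>_t<k> pattern; the names long and time are my own and do not come from the question.

```r
library(tidyr)
library(dplyr)

# Sample data as reconstructed in this answer.
df <- data.frame(
  area  = c("area1", "area2", "area3", "area4"),
  area2 = c("area10", "area20", "area30", "area40"),
  issue_1_t1 = c(10, 20, 30, 40), issue_2_t1 = c(10, 20, 30, 10),
  issue_1_t2 = c(10, 20, 30, 40), issue_2_t2 = c(10, 20, 30, 10),
  issue_1_t3 = c(10, 20, 30, 40), issue_2_t3 = c(10, 20, 30, 10)
)

long <- df %>%
  pivot_longer(
    cols = starts_with("issue"),
    # first capture group becomes the output column name (.value),
    # second capture group is stored in a helper column called time
    names_to = c(".value", "time"),
    names_pattern = "(issue_\\d+)_(t\\d+)"
  ) %>%
  # sort by time period first, then by area within each period
  arrange(time, area) %>%
  select(-time)

print(long)
```

Keeping time instead of dropping it in the last step may be preferable if you later need to tell the periods apart.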
