我正在將下表1匯入到 R 中,但在格式方面很掙扎,因為每列都有兩個標題。我想要的輸出是第二個表2。我打算使用 tidyr 來收集資料。
我遇到的另一個障礙是合并的單元格。我一直在使用 fillMergedCells=TRUE 來復制這個。
read.xlsx(xlsxFile ="C:/Users/X/X/Desktop/X.xlsx",fillMergedCells = TRUE)
uj5u.com熱心網友回復:
一種選擇是
- 使用選項讀取您的 excel 檔案
colNames = FALSE - 將前兩行粘貼在一起,并將結果用作列名。在這里,我使用下劃線作為分隔符,以便稍后輕松拆分名稱。
- 去掉前兩行
- 使用
tidyr::pivot_longer轉換為長格式。
# df <- openxlsx::read.xlsx(xlsxFile ="data/test2.xlsx", fillMergedCells = TRUE, colNames = FALSE)
# Use first two rows as names
names(df) <- paste(df[1, ], df[2, ], sep = "_")
names(df)[1] <- "category"
# Get rid of first two rows and columns containing year average
df <- df[-c(1:2), ]
df <- df[, !grepl("^Year", names(df))]
library(tidyr)
library(dplyr)
df %>%
pivot_longer(-category, names_to = c("Time", ".value"), names_pattern = "^(.*?)_(.*)$") %>%
arrange(Time)
#> # A tibble: 16 × 4
#> category Time Y Z
#> <chr> <chr> <chr> <chr>
#> 1 Total Feb-21 1 1
#> 2 A Feb-21 2 2
#> 3 B Feb-21 3 3
#> 4 C Feb-21 4 4
#> 5 D Feb-21 5 5
#> 6 E Feb-21 6 6
#> 7 F Feb-21 7 7
#> 8 G Feb-21 8 8
#> 9 Total Jan-21 1 1
#> 10 A Jan-21 2 2
#> 11 B Jan-21 3 3
#> 12 C Jan-21 4 4
#> 13 D Jan-21 5 5
#> 14 E Jan-21 6 6
#> 15 F Jan-21 7 7
#> 16 G Jan-21 8 8
資料
df <- structure(list(X1 = c(
NA, NA, "Total", "A", "B", "C", "D", "E",
"F", "G"
), X2 = c(
"Year Rolling Avg.", "Share", NA, "1", "1",
"1", "1", "1", "1", "1"
), X3 = c(
"Year Rolling Avg.", "Y", "1",
"2", "3", "4", "5", "6", "7", "8"
), X4 = c(
"Year Rolling Avg.",
"Z", "1", "2", "3", "4", "5", "6", "7", "8"
), X5 = c(
"Jan-21",
"Y", "1", "2", "3", "4", "5", "6", "7", "8"
), X6 = c(
"Jan-21",
"Z", "1", "2", "3", "4", "5", "6", "7", "8"
), X7 = c(
"Feb-21",
"Y", "1", "2", "3", "4", "5", "6", "7", "8"
), X8 = c(
"Feb-21",
"Z", "1", "2", "3", "4", "5", "6", "7", "8"
)), row.names = c(
NA,
10L
), class = "data.frame")
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/397036.html
