Follow-up: putting a given missing column from a data.frame back into a list of data.frames

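I am following up on this question. My LIST of data.frames below is made from my data. However, this LIST is missing the paper column (the name(s) of the missing column(s) are always provided), which is available in the original data.

I was wondering how to put the missing paper column back into LIST to achieve my DESIRED_LIST below?

I tried the solution suggested in this answer ( lapply(LIST, function(x) data[do.call(paste, data[names(x)]) %in% do.call(paste, x), ]) ), but it does not produce my DESIRED_LIST.

A Base R or tidyverse solution is appreciated.

Reproducible data and code are below.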

m2="
paper     study sample    comp ES bar
1         1     1         1    1  7
1         2     2         2    2  6
1         2     3         3    3  5
2         3     4         4    4  4
2         3     4         4    5  3
2         3     4         5    6  2
2         3     4         5    7  1"
data <- read.table(text = m2, header = TRUE)

LIST <- list(data.frame(study=1       ,sample=1       ,comp=1),
             data.frame(study=rep(3,4),sample=rep(4,4),comp=c(4,4,5,5)),
             data.frame(study=c(2,2)  ,sample=c(2,3)  ,comp=c(2,3)))

DESIRED_LIST <- list(data.frame(paper=1       ,study=1       ,sample=1       ,comp=1),
                     data.frame(paper=rep(2,4),study=rep(3,4),sample=rep(4,4),comp=c(4,4,5,5)),
                     data.frame(paper=rep(1,2),study=c(2,2)  ,sample=c(2,3)  ,comp=c(2,3)))

A uj5u.com user replied:

Please find a solution with the data.table package. Is this what you are looking for?

Reprex 1

library(data.table)

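# ES and bar are not part of the desired output, so drop both
cols_to_remove <- c("ES", "bar")

# Note: setDT() converts `data` by reference and `:=` removes the columns in place
split(setDT(data)[, (cols_to_remove) := NULL], by = c("paper", "study"))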
#> $`1.1`
#>    paper study sample comp
#> 1:     1     1      1    1
#> 
#> $`1.2`
#>    paper study sample comp
#> 1:     1     2      2    2
#> 2:     1     2      3    3
#> 
#> $`2.3`
#>    paper study sample comp
#> 1:     2     3      4    4
#> 2:     2     3      4    4
#> 3:     2     3      4    5
#> 4:     2     3      4    5

Created on 2021-11-06 by the reprex package (v2.0.1)
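For readers who prefer to avoid an extra package, the same group-wise split can be sketched in base R. This is only an illustration of the idea in the answer above, not the answerer's code; the names `keep` and `by_group` are made up here, and it assumes that `paper` and `study` together identify each group.

```r
# Rebuild the example data from the question
m2 <- "
paper study sample comp ES bar
1 1 1 1 1 7
1 2 2 2 2 6
1 2 3 3 3 5
2 3 4 4 4 4
2 3 4 4 5 3
2 3 4 5 6 2
2 3 4 5 7 1"
data <- read.table(text = m2, header = TRUE)

# Columns wanted in the output (hypothetical helper name)
keep <- c("paper", "study", "sample", "comp")

# Split on the paper/study combination; drop = TRUE discards
# combinations that never occur in the data (e.g. paper 2, study 1)
by_group <- split(data[keep], data[c("paper", "study")], drop = TRUE)

names(by_group)
```

Unlike the data.table version, this leaves the original `data` untouched. Note that the list comes back ordered by group, which is not the order of the elements in DESIRED_LIST.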

Edit

Please find solution 2 with the dplyr package

Reprex 2

library(dplyr)

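# ES and bar are not part of the desired output, so drop both
drop.cols <- c("ES", "bar")

data %>% 
  group_by(paper, study) %>% 
  select(-all_of(drop.cols)) %>%   # all_of() is the supported way to pass a character vector
  group_split()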

#> <list_of<
#>   tbl_df<
#>     paper : integer
#>     study : integer
#>     sample: integer
#>     comp  : integer
#>   >
#> >[3]>
#> [[1]]
#> # A tibble: 1 x 4
#>   paper study sample  comp
#>   <int> <int>  <int> <int>
#> 1     1     1      1     1
#> 
#> [[2]]
#> # A tibble: 2 x 4
#>   paper study sample  comp
#>   <int> <int>  <int> <int>
#> 1     1     2      2     2
#> 2     1     2      3     3
#> 
#> [[3]]
#> # A tibble: 4 x 4
#>   paper study sample  comp
#>   <int> <int>  <int> <int>
#> 1     2     3      4     4
#> 2     2     3      4     4
#> 3     2     3      4     5
#> 4     2     3      4     5

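Created on 2021-11-07 by the reprex package (v2.0.1)

A uj5u.com user replied:

Consider using ave to create a grouping column (needed because of the duplicated rows), then run an iterative merge.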

DESIRED_LIST_SO <- lapply(
  LIST,
  function(df) merge(
      transform(data, grp = ave(paper, paper, study, sample, comp, FUN=seq_along)),
      transform(df, grp = ave(study, study, sample, comp, FUN=seq_along)),
      by=c("study", "sample", "comp", "grp")
  )[c("paper", "study", "sample", "comp")]
)

all.equal(DESIRED_LIST, DESIRED_LIST_SO)
[1] TRUE

(Consider keeping the unique identifiers, ES and bar, in the desired list to avoid duplicated rows.)
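To see why the grouping column helps, here is a small sketch of what `ave(..., FUN = seq_along)` returns for a data frame with repeated rows. The data frame `dup` is a hypothetical stand-in for the second element of LIST.

```r
# Hypothetical stand-in for LIST[[2]]: two pairs of identical rows
dup <- data.frame(study = rep(3, 4), sample = rep(4, 4), comp = c(4, 4, 5, 5))

# Number the rows within each study/sample/comp combination:
# the first duplicate gets 1, the second gets 2, and so on
grp <- ave(dup$study, dup$study, dup$sample, dup$comp, FUN = seq_along)

cbind(dup, grp)
```

With `grp` added on both sides, each duplicated row carries its own counter, so the merge pairs every row in the list element with exactly one row in `data` instead of with every duplicate.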

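A uj5u.com user replied:

A tidyverse solution. First, create a lookup table, data2, that contains the four target columns. mutate(across(everything(), as.numeric)) makes the column types consistent; it may not be needed. Second, use map to apply left_join to each element of LIST. LIST2 and DESIRED_LIST are exactly the same.

library(dplyr)
library(purrr)

data2 <- data %>%
  distinct(paper, study, sample, comp) %>%
  mutate(across(everything(), as.numeric))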

LIST2 <- map(LIST, function(x){
  x2 <- x %>%
    left_join(data2, by = names(x)) %>%
    select(all_of(names(data2)))
  return(x2)
})

# Check if the results are the same
identical(DESIRED_LIST, LIST2)
# [1] TRUE
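The de-duplication step matters because both the list elements and `data` contain repeated key combinations. The following base R sketch uses `merge` in place of `left_join` purely for illustration; the names `x`, `key_cols`, and `lookup` are made up here.

```r
# Rebuild the example data from the question
m2 <- "
paper study sample comp ES bar
1 1 1 1 1 7
1 2 2 2 2 6
1 2 3 3 3 5
2 3 4 4 4 4
2 3 4 4 5 3
2 3 4 5 6 2
2 3 4 5 7 1"
data <- read.table(text = m2, header = TRUE)

# Hypothetical stand-in for LIST[[2]]
x <- data.frame(study = rep(3, 4), sample = rep(4, 4), comp = c(4, 4, 5, 5))
key_cols <- names(x)

# Joining against the raw data pairs every duplicate with every duplicate
n_raw <- nrow(merge(x, data[c("paper", key_cols)], by = key_cols))    # 8 rows

# Joining against a de-duplicated lookup keeps one match per row of x
lookup <- unique(data[c("paper", key_cols)])
n_lookup <- nrow(merge(x, lookup, by = key_cols))                     # 4 rows
```

This only works as long as each study/sample/comp combination belongs to a single paper, which holds for the example data but is an assumption for other data sets.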
