Adding a row after every row in an R data frame

I have data that looks like this:

X            snp_id        is_severe encoding_1 encoding_2 encoding_0
1     0  GL000191.1-37698         0          0          1          7
3     2  GL000191.1-37922         1          1          0         12

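What I want to do is add a new row after each existing row, carrying the same snp_id as the row before it. In the new row, is_severe is flipped: it is 1 if the previous row has 0, and 0 if the previous row has 1. The goal is that every snp_id has both a zero and a one in the is_severe column rather than only one of them, so each snp_id appears twice, once with is_severe = 0 and once with is_severe = 1 (all snp_id values in the data are unique). In addition, encoding_1 and encoding_2 are 0 in the new row, and encoding_0 follows this rule: if is_severe in the new row is 0, then encoding_0 = 8; if is_severe in the new row is 1, then encoding_0 = 13.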

Example of the desired output:

X            snp_id            is_severe encoding_1 encoding_2 encoding_0
    1     0  GL000191.1-37698         0          0          1          7
    2     1  GL000191.1-37698         1          0          0          13  <- new row 
    3     2  GL000191.1-37922         1          1          0         12
    4     3  GL000191.1-37922         0          0          0          8  <- new row

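I found a similar Q&A here: "How to add a row to an R data frame every other row?", but I need to do more data manipulation, and unfortunately that solution does not solve my problem. Thank you :)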

Answer from a uj5u.com user:

Here are two options: 1) split and map, 2) copy and bind.

library(tidyverse)

dat <- read_table("snp_id        is_severe encoding_1 encoding_2 encoding_0
GL000191.1-37698         0          0          1          7
GL000191.1-37922         1          1          0         12")

dat |>
  group_split(snp_id) |>
  map_dfr(~add_row(.x, 
                   snp_id = .x$snp_id,
                   is_severe = 1 - (.x$is_severe == 1),
                   encoding_1 = 0, 
                   encoding_2 = 0,
                   encoding_0 = ifelse(.x$is_severe == 1, 8, 13)))
#> # A tibble: 4 x 5
#>   snp_id           is_severe encoding_1 encoding_2 encoding_0
#>   <chr>                <dbl>      <dbl>      <dbl>      <dbl>
#> 1 GL000191.1-37698         0          0          1          7
#> 2 GL000191.1-37698         1          0          0         13
#> 3 GL000191.1-37922         1          1          0         12
#> 4 GL000191.1-37922         0          0          0          8
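
For readers without the tidyverse, the same split-and-map idea can be sketched in base R. This is only an illustration under stated assumptions: the helper name `add_complement` is hypothetical (it is not part of the answer above), the sample data is rebuilt by hand from the question, and each snp_id is assumed to occur exactly once. Note that `split()` orders the groups by sorted snp_id, much as `group_split()` does.

```r
# Sample data rebuilt from the question (two unique snp_id values).
dat <- data.frame(
  snp_id     = c("GL000191.1-37698", "GL000191.1-37922"),
  is_severe  = c(0, 1),
  encoding_1 = c(0, 1),
  encoding_2 = c(1, 0),
  encoding_0 = c(7, 12)
)

# Hypothetical helper: takes a one-row group and stacks its complementary row under it.
add_complement <- function(g) {
  new_row <- g
  new_row$is_severe  <- 1 - g$is_severe          # flip 0 <-> 1
  new_row$encoding_1 <- 0
  new_row$encoding_2 <- 0
  new_row$encoding_0 <- ifelse(new_row$is_severe == 1, 13, 8)
  rbind(g, new_row)
}

# Split by snp_id, extend each group, and bind the pieces back together.
pieces <- lapply(split(dat, dat$snp_id), add_complement)
result <- do.call(rbind, pieces)
rownames(result) <- NULL
result
```

Keeping the rule inside one small function makes it easy to change the encoding_0 values later without touching the split-and-bind plumbing.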

Or:

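library(tidyverse)

# Copy the data, flip is_severe, zero the two encodings, derive encoding_0
# from the flipped value, then bind to the original and sort by snp_id.
bind_rows(dat,
          dat |>
            mutate(is_severe = 1 - (is_severe == 1),
                   across(c(encoding_1, encoding_2), ~.*0),
                   encoding_0 = ifelse(is_severe == 1, 13, 8))) |>
  arrange(snp_id)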
#> # A tibble: 4 x 5
#>   snp_id           is_severe encoding_1 encoding_2 encoding_0
#>   <chr>                <dbl>      <dbl>      <dbl>      <dbl>
#> 1 GL000191.1-37698         0          0          1          7
#> 2 GL000191.1-37698         1          0          0         13
#> 3 GL000191.1-37922         1          1          0         12
#> 4 GL000191.1-37922         0          0          0          8
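
Whichever option is used, the result can be checked against the requirements in the question. The following base R sketch is only an illustration: the `result` data frame is typed in by hand from the printed output, and the check on the added rows assumes they sit in the even-numbered positions, as they do in that output.

```r
# Expected result from the question, rebuilt by hand so the check is self-contained.
result <- data.frame(
  snp_id     = rep(c("GL000191.1-37698", "GL000191.1-37922"), each = 2),
  is_severe  = c(0, 1, 1, 0),
  encoding_1 = c(0, 0, 1, 0),
  encoding_2 = c(1, 0, 0, 0),
  encoding_0 = c(7, 13, 12, 8)
)

# Every snp_id should appear exactly once with is_severe = 0 and once with 1.
counts <- table(result$snp_id, result$is_severe)
each_id_has_both <- all(counts == 1)

# Added rows are the second row of each pair (assumes the interleaved order above).
added <- result[seq(2, nrow(result), by = 2), ]
rule_holds <- all(added$encoding_1 == 0,
                  added$encoding_2 == 0,
                  added$encoding_0 == ifelse(added$is_severe == 1, 13, 8))

each_id_has_both
rule_holds
```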

Answer from a uj5u.com user:

Dummy data:

df <- data.frame(
  a = letters[1:4], 
  is_severe = sample(c(0,1), 4, TRUE),
  encoding1 = sample(c(0,1), 4, TRUE),
  encoding2 = sample(c(0,1), 4, TRUE),
  encoding0 = 1:4
)

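You can copy the data, do the calculations on the copy, and bind it to the original data (then arrange the rows in the required order):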

df_copy <- df
df_copy$is_severe <- 1 - df_copy$is_severe                   # flip 0 <-> 1
df_copy[, c("encoding1", "encoding2")] <- 0
df_copy$encoding0 <- ifelse(df_copy$is_severe == 0, 8, 13)   # based on the flipped value

# Interleave: original row i is followed by its copy at position i + nrow(df)
rbind(df, df_copy)[rep(seq_len(nrow(df)), each = 2) + rep(c(0, nrow(df)), times = nrow(df)), ]
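
The row index used in the `rbind()` call is the least obvious part, so here is a small worked sketch of how it interleaves the two halves. The value `n = 3` is chosen purely for illustration; in the answer it would be `nrow(df)`.

```r
n <- 3  # illustrative number of original rows

# Each original position twice: 1 1 2 2 3 3
base_pos <- rep(seq_len(n), each = 2)

# Alternating offsets: 0 for the original row, n for its copy in the stacked data
offset <- rep(c(0, n), times = n)

# Sum gives the interleaved order: row 1, its copy (row 4), row 2, its copy (row 5), ...
idx <- base_pos + offset
idx
#> [1] 1 4 2 5 3 6
```

Because the copies occupy rows `n + 1` to `2 * n` after `rbind()`, adding `n` to every second index picks each copy immediately after its original.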
