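This is somewhat similar to my earlier question, "Split data frame string column and count items (dplyr and R)", but this time I want to know how to split the column entries and get the result back as a vector rather than a list.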
library("tidyverse")
dat <- data.frame(ID = c("A", "B"),
gene_ids = c(
"101739/20382/13006/212377/114714/66622/140917",
"75717/103573/14852/18141/12567/26429/20842/17975/12545"
)
)
tmp <- dat %>% mutate(ids = str_split(gene_ids, "/"))
tmp$ids
#> [[1]]
#> [1] "101739" "20382" "13006" "212377" "114714" "66622" "140917"
#>
#> [[2]]
#> [1] "75717" "103573" "14852" "18141" "12567" "26429" "20842" "17975"
#> [9] "12545"
tmp
#> ID gene_ids
#> 1 A 101739/20382/13006/212377/114714/66622/140917
#> 2 B 75717/103573/14852/18141/12567/26429/20842/17975/12545
#> ids
#> 1 101739, 20382, 13006, 212377, 114714, 66622, 140917
#> 2 75717, 103573, 14852, 18141, 12567, 26429, 20842, 17975, 12545
dat %>% mutate(please_be_vector = str_split(gene_ids, "/") %>% unlist())
#> Error: Problem with `mutate()` input `please_be_vector`.
#> x Input `please_be_vector` can't be recycled to size 2.
#> ℹ Input `please_be_vector` is `str_split(gene_ids, "/") %>% unlist()`.
#> ℹ Input `please_be_vector` must be size 2 or 1, not 16.
I want tmp$ids to give me vectors rather than a list, as shown below. Can this be done with dplyr?
tmp$ids[1]
"101739" "20382" "13006" "212377" "114714" "66622" "140917"
tmp$ids[2]
"75717" "103573" "14852" "18141" "12567" "26429" "20842" "17975" "12545"
Is this possible?
Answer from a uj5u.com user:
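We can simply use unclass on the reshaped data to get a named list that holds the vectors. Note that separate_rows() and pivot_wider() come from tidyr, so it has to be loaded as well.
library(dplyr)
library(tidyr)
dat %>%
  separate_rows(everything(), sep = "/") %>%
  pivot_wider(names_from = ID, values_from = gene_ids) %>%
  unclass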
$A
$A[[1]]
[1] "101739" "20382" "13006" "212377" "114714" "66622" "140917"
$B
$B[[1]]
[1] "75717" "103573" "14852" "18141" "12567" "26429" "20842" "17975" "12545"
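As a rough alternative to the reshaping approach above, a named list of plain character vectors can also be built directly in base R. This is only a sketch, not part of the original answer; the name ids_by_id is made up for illustration, and stringsAsFactors = FALSE is added so the example also behaves the same on older R versions.

```r
# Same data as in the question
dat <- data.frame(
  ID = c("A", "B"),
  gene_ids = c(
    "101739/20382/13006/212377/114714/66622/140917",
    "75717/103573/14852/18141/12567/26429/20842/17975/12545"
  ),
  stringsAsFactors = FALSE
)

# strsplit() returns one character vector per row;
# setNames() labels each vector with its ID
ids_by_id <- setNames(strsplit(dat$gene_ids, "/", fixed = TRUE), dat$ID)

ids_by_id$A       # character vector with the 7 gene IDs of row A
ids_by_id[["B"]]  # character vector with the 9 gene IDs of row B
```

Here each element is the vector itself, so no extra [[1]] is needed to reach it.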
Answer from a uj5u.com user:
tmp$ids is a list of two character vectors, one for each row of the data. When you subset a list with [, you get a list back. Use [[ instead:
> tmp$ids[[1]]
[1] "101739" "20382" "13006" "212377" "114714" "66622" "140917"
A good resource for understanding this better is the chapter on subsetting in Advanced R.
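The difference between [ and [[ described in this answer can be checked with a small sketch. The list ids below is a shortened, hypothetical stand-in for tmp$ids, built with base strsplit() so the example runs without any packages.

```r
# Hypothetical stand-in for tmp$ids: a list with one character vector per row
ids <- strsplit(
  c("101739/20382/13006", "75717/103573"),
  "/",
  fixed = TRUE
)

class(ids[1])     # "list": single brackets keep the list wrapper
class(ids[[1]])   # "character": double brackets extract the element itself
length(ids[1])    # 1, a list holding one vector
length(ids[[1]])  # 3, the elements of the vector
```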
Answer from a uj5u.com user:
Update: perhaps this:
library(tidyverse)
dat %>%
  separate_rows(gene_ids) %>%
  arrange(ID, gene_ids) %>%
  group_by(ID) %>%
  mutate(id = row_number()) %>%
  pivot_wider(
    names_from = ID,
    values_from = gene_ids
  ) %>%
  pull(A) # or pull(B); A is padded with NA because B has more IDs
[1] "101739" "114714" "13006" "140917" "20382" "212377" "66622" NA
[9] NA
First answer:
library(tidyverse)
dat %>% mutate(ids = str_split(gene_ids, "/")) %>%
unnest(ids) %>%
pull(ids)
Output:
[1] "101739" "20382" "13006" "212377" "114714" "66622" "140917" "75717"
[9] "103573" "14852" "18141" "12567" "26429" "20842" "17975" "12545"
Or:
tmp <- dat %>% mutate(ids = str_split(gene_ids, "/"))
unlist(tmp$ids)
Output:
[1] "101739" "20382" "13006" "212377" "114714" "66622" "140917" "75717"
[9] "103573" "14852" "18141" "12567" "26429" "20842" "17975" "12545"
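Flattening with unlist() as above gives a single vector of 16 IDs, which no longer records the row each ID came from. If that matters, one possible sketch (an assumption about what is wanted, not part of the original answer) is to name every element with its row's ID. The names ids and flat are made up for illustration.

```r
# Same data as in the question
dat <- data.frame(
  ID = c("A", "B"),
  gene_ids = c(
    "101739/20382/13006/212377/114714/66622/140917",
    "75717/103573/14852/18141/12567/26429/20842/17975/12545"
  ),
  stringsAsFactors = FALSE
)

ids <- strsplit(dat$gene_ids, "/", fixed = TRUE)

# Flatten, then repeat each ID once per gene so the origin stays attached
flat <- unlist(ids)
names(flat) <- rep(dat$ID, lengths(ids))

flat[names(flat) == "A"]  # only the gene IDs that came from row A
```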
