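I have a list of gene loci containing alleles coded as three-digit numbers, stored as class character. I have a few lines of code that loop over the list and convert every entry to a nucleotide base letter (i.e. A, C, G, T).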
my_allele_list = list(loc1 = c("001", "002"),
loc2 = c("001", "003"),
loc3 = c("004", "001"),
loc4 = c("003", "003"),
loc5 = c("001", "002"),
loc6 = c("002", "004"))
a = c("001", "002", "003", "004")
b = c("A", "C", "G", "T")
for(i in seq_along(a)) my_allele_list <-
lapply(my_allele_list, function(x) gsub(a[i], b[i], x))
my_allele_list
So far so good, but to keep things tidy I would like to wrap these lines in a function.
convert_alleles <- function(x){
a = c("001", "002", "003", "004")
b = c("A", "C", "G", "T")
for(i in seq_along(a)) x <-
lapply(x, function(x) gsub(a[i], b[i], x))
}
convert_alleles(my_allele_list)
my_allele_list
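However, as you can see from the second attempt, the function does not work: there is no error, but the list object is not changed at all. I suspect the problem is a conflict with the anonymous function inside the for loop. Can someone explain what is going wrong and suggest a solution?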
Reply from a uj5u.com user:
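It may be easier to use a vectorized function such as str_replace_all from stringr, which accepts a named vector of pattern = replacement pairs: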
library(dplyr)
library(purrr)
library(stringr)
map(my_allele_list, ~ str_replace_all(.x, setNames(b, a)))
Output:
$loc1
[1] "A" "C"
$loc2
[1] "A" "G"
$loc3
[1] "T" "A"
$loc4
[1] "G" "G"
$loc5
[1] "A" "C"
$loc6
[1] "C" "T"
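Also, if the match is meant to be exact (the whole string) rather than a partial match as in the example, use setNames to create a named vector, then look up each code by name to replace it: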
map(my_allele_list, ~ unname(setNames(b, a)[.x]))
$loc1
[1] "A" "C"
$loc2
[1] "A" "G"
$loc3
[1] "T" "A"
$loc4
[1] "G" "G"
$loc5
[1] "A" "C"
$loc6
[1] "C" "T"
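The two approaches behave differently when a code is missing from the lookup table or contains a known code as a substring. The following base R sketch illustrates this; the inputs "1001" and "005" are made-up values for illustration and do not appear in the original data.

```r
a <- c("001", "002", "003", "004")
b <- c("A", "C", "G", "T")
lookup <- setNames(b, a)

# Hypothetical input: "1001" contains "001" as a substring,
# and "005" is not in the lookup table at all
x <- c("001", "1001", "005")

# Pattern replacement (same idea as the gsub loop in the question):
# any occurrence of a code inside a string is replaced
partial <- x
for (i in seq_along(a)) partial <- gsub(a[i], b[i], partial, fixed = TRUE)
partial
# [1] "A"   "1A"  "005"

# Exact lookup by name: anything that is not an exact key becomes NA
exact <- unname(lookup[x])
exact
# [1] "A" NA  NA
```

Which behaviour is preferable depends on the data: the lookup makes unexpected codes visible as NA, while pattern replacement leaves them untouched.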
The same can be done in base R with lapply (the \(x) lambda shorthand requires R 4.1 or later):
lapply(my_allele_list, \(x) unname(setNames(b, a)[x]))
$loc1
[1] "A" "C"
$loc2
[1] "A" "G"
$loc3
[1] "T" "A"
$loc4
[1] "G" "G"
$loc5
[1] "A" "C"
$loc6
[1] "C" "T"
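Since the goal in the question was to wrap the conversion in a function, the lookup approach can be packaged the same way. This is only a sketch: the function name convert_alleles_lookup and its codes and bases arguments are hypothetical and not part of the original post.

```r
# Hypothetical wrapper around the named-vector lookup
convert_alleles_lookup <- function(x,
                                   codes = c("001", "002", "003", "004"),
                                   bases = c("A", "C", "G", "T")) {
  lookup <- setNames(bases, codes)
  # The lapply() call is the last expression, so its result
  # is what the function returns
  lapply(x, function(alleles) unname(lookup[alleles]))
}

# Shortened version of the example list
my_allele_list <- list(loc1 = c("001", "002"),
                       loc3 = c("004", "001"))

converted <- convert_alleles_lookup(my_allele_list)
converted
```

Passing the codes and bases as arguments with defaults keeps the mapping in one place and lets it be changed without editing the function body.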
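In the OP's function, the anonymous function inside the loop is not the problem. The last expression in the function body is the for loop, which evaluates to NULL (invisibly), so the converted list is never returned. The return value should be x: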
convert_alleles <- function(x){
a = c("001", "002", "003", "004")
b = c("A", "C", "G", "T")
for(i in seq_along(a)) x <-
lapply(x, function(x) gsub(a[i], b[i], x))
x
}
convert_alleles(my_allele_list)
$loc1
[1] "A" "C"
$loc2
[1] "A" "G"
$loc3
[1] "T" "A"
$loc4
[1] "G" "G"
$loc5
[1] "A" "C"
$loc6
[1] "C" "T"
Note: running the function does not change the object my_allele_list itself. To update it, assign (<-) the result back to it:
my_allele_list <- convert_alleles(my_allele_list)
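Both points (the for loop yielding NULL, and the caller's object staying unchanged) can be checked on a much smaller example. The function names f_no_return and f_with_return below are made up for this illustration.

```r
# The for loop is the last expression, so the function returns NULL (invisibly)
f_no_return <- function(x) {
  for (i in 1:3) x <- x + 1
}

# Same loop, but the modified value is returned explicitly
f_with_return <- function(x) {
  for (i in 1:3) x <- x + 1
  x
}

v <- 10
res_no_return <- f_no_return(v)
res_with_return <- f_with_return(v)

is.null(res_no_return)  # TRUE
res_with_return         # 13
v                       # still 10: the function only changed its local copy
```

Because the NULL is returned invisibly, calling such a function at the console prints nothing, which is why the original version appeared to silently do nothing.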
