I have defined a custom function and tested it to make sure it works, but I can't apply it to a list to get a distance matrix.
My code is:
library(Biostrings)
library(proxy)
#import the sequences using Biostrings
indf<-readAAStringSet("C:/Users/jamie/OneDrive/Documents/Junk/SAMPLEFASTA.fasta")
#Assign the names and sequences to different variables
seqAAname<-names(indf)
seqz<-paste(indf)
#Put just the sequences into a dataframe
indf2<-data.frame(seqz)
#Convert the sequences into a list
indf3<-as.list(indf2)
#Define a custom function to return the alignment score between two sequences (pairwise)
customalnfunc <- function(X, Y){
  pairwiseAlignment(X, Y,
                    substitutionMatrix = "BLOSUM45", gapOpening = 1, gapExtension = 3)
}
#Test the function but not as a function (This works fine)
testfreefunc <- pairwiseAlignment(AAString("PEHQRSTVE"), AAString("PQHQRETVE"),
                                  substitutionMatrix = "BLOSUM45", gapOpening = 1, gapExtension = 3)
print(testfreefunc@score)
#Test the function as a function to make sure it works (This works fine)
testfuncout <- customalnfunc(AAString("PEHQRSTVE"),AAString("PQHQRETVE"))
print(testfuncout@score)
#Apply the custom function to all possible pairs using proxy::dist with the custom function (This does not work, it returns 0)
outalnmatrix <- proxy::dist(indf3, method = customalnfunc)
outalnmatrix
The SAMPLEFASTA.fasta file contains:
>SeqA
PEHQRSTVE
>SeqB
PQHQRETVE
>SeqC
RQHERSEVE
The expected output for outalnmatrix is:

I have tried passing the input data to proxy::dist both as a list and as a matrix.
How can I make this work?
Answer from a uj5u.com user:
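You don't need the proxy package, because proxy::dist is designed to compare the rows of a matrix/data frame with one another. Since you want to compare strings, you can use outer instead. However, you need to adjust your customalnfunc function so that it returns only a single number, the alignment score, rather than a full alignment object (scoreOnly = TRUE).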
library(Biostrings)
seqz <- c("PEHQRSTVE", "PQHQRETVE", "RQHERSEVE")
customalnfunc <- function(X, Y){
  pairwiseAlignment(X, Y,
                    substitutionMatrix = "BLOSUM45",
                    gapOpening = 1,
                    gapExtension = 3,
                    scoreOnly = TRUE)
}
outer(seqz, seqz, customalnfunc)
#>      [,1] [,2] [,3]
#> [1,]   58   50   33
#> [2,]   50   60   33
#> [3,]   33   33   57
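One thing to keep in mind: outer passes the function two full-length vectors in a single call and expects a vector of the same length back, so this approach relies on customalnfunc being able to handle vectors of sequences. If a scoring function only handles one pair at a time, wrapping it in base R's Vectorize is one way around that. Below is a minimal sketch using only base R; match_count is a hypothetical stand-in for an alignment score (it is not part of Biostrings), and the row/column labels are taken from the FASTA headers in the question.

```r
# Hypothetical stand-in for an alignment score: counts the positions at which
# two equal-length strings carry the same character. Handles ONE pair per call.
match_count <- function(x, y) {
  sum(strsplit(x, "")[[1]] == strsplit(y, "")[[1]])
}

# Sequences from the question, named after their FASTA headers
seqz <- c(SeqA = "PEHQRSTVE", SeqB = "PQHQRETVE", SeqC = "RQHERSEVE")

# outer() calls FUN once with two long vectors, so the scalar function is
# wrapped with Vectorize() to apply it pair by pair.
scores <- outer(seqz, seqz, Vectorize(match_count))

# Label rows and columns so the matrix can be indexed by sequence name
dimnames(scores) <- list(names(seqz), names(seqz))
scores
```

The same dimnames line can be applied to the matrix returned by outer(seqz, seqz, customalnfunc), for example with the seqAAname vector from the question, so that entries can be looked up as scores["SeqA", "SeqB"].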
