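I'm new to R and have a data frame with two columns. I need to count how many times each element of the first column appears in the second column. The second column can contain multiple elements per row, which is why the code below doesn't give the correct answer. Please tell me how to modify it, or which function I could use here. The count for A1 should be 3, but because A1 also appears inside "A1;A2" and "A3;A1", which this code doesn't recognize, it comes out as 1. Thanks.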
library(dplyr)  # filter() below is dplyr's

df0 <- data.frame(
  ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1")
)
n1 <- nrow(df0)
df1 <- data.frame(matrix(
  vector(), 0, 2, dimnames = list(c(), c("ID", "Count"))),
  stringsAsFactors = FALSE)
for (i in 1:n1) {
  id <- df0$ID[i]
  df2 <- filter(df0, Refer == id)  # This assumes only a single ID can be there in Refer
  n2 <- nrow(df2)
  df1[i, 1] <- id
  df1[i, 2] <- n2
}
Answer from a uj5u.com user:
You're almost there. However, you should use grepl() instead of filtering for an exact match with Refer == id.
library(dplyr)

df0 <- data.frame(
  ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1")
)

# For each ID, count the rows whose Refer string contains it
result <- lapply(df0$ID, function(x) {
  n <- df0 %>% filter(grepl(x, Refer)) %>% nrow()
  data.frame(ID = x, count = n)
}) %>%
  bind_rows()
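One caveat with grepl(): it matches substrings, so an ID such as "A1" would also be counted inside a hypothetical "A10" or "BA1". The following is a minimal base-R sketch that compares whole entries instead, assuming the entries in Refer are separated only by ";" (the names tokens and counts are illustrative and not part of the answer above):

```r
# Same data as in the question
df0 <- data.frame(
  ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"),
  stringsAsFactors = FALSE
)

# Split each Refer value on ";" and trim whitespace: one character vector per row
tokens <- lapply(strsplit(df0$Refer, ";", fixed = TRUE), trimws)

# For each ID, count the rows whose entries contain it as an exact match
counts <- vapply(df0$ID, function(id) {
  sum(vapply(tokens, function(tk) id %in% tk, logical(1)))
}, integer(1))

result <- data.frame(ID = df0$ID, Count = unname(counts))
```

On the sample data this should give the same counts as the grepl() version; the two only differ when one ID is a substring of another.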
Answer from a uj5u.com user:
You could strsplit "Refer" on ";" and unlist the result. Then turn it into a factor with "ID" as the levels and simply call table on it.
table(factor(unlist(strsplit(df0$Refer, ';')), levels=df0$ID))
# A1 A2 A3 A4 B1 C1 D1
#  3  3  1  0  0  1  0
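If the counts are needed as a data frame with ID and Count columns, as in the question, the table can be converted. This is a small sketch rather than part of the original answer; the names tab and counts_df are illustrative:

```r
# Same data as in the question
df0 <- data.frame(
  ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"),
  stringsAsFactors = FALSE
)

# Count every entry of Refer; values that are not a known ID become NA and are dropped
tab <- table(factor(unlist(strsplit(df0$Refer, ";")), levels = df0$ID))

# Convert the 1-d table to a data frame and give the columns the names used in the question
counts_df <- setNames(
  as.data.frame(tab, stringsAsFactors = FALSE),
  c("ID", "Count")
)
```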
Answer from a uj5u.com user:
Here is a tidyverse solution:
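library(dplyr)
library(tidyr)

df0 %>%
  separate_rows(Refer, sep = ";") %>%  # one row per entry in Refer
  # `pattern` was not defined in the original post; keeping only entries
  # that are known IDs is an assumed stand-in that also drops the blank rows
  filter(Refer %in% df0$ID) %>%
  count(Refer)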
  Refer     n
  <chr> <int>
1 A1        3
2 A2        3
3 A3        1
4 C1        1
