I have two data frames:
dists:
label1 label2 dist sameCol ID1 ID2
193 194 0.7219847 NA N53 <NA>
193 195 0.5996300 FALSE N53 N43
193 196 0.2038451 FALSE N5 N45
194 195 0.2190454 NA <NA> N43
194 196 0.8894645 NA <NA> N45
195 196 0.7910169 TRUE N38 N5
networkDistances:
ID1 ID2 colony value networkDist
N38 N5 10 0.05 1
N36 N5 10 0.03 1
N4 N3 12 10.00 1
N4 N5 12 10.00 1
N4 N15 12 5.00 1
N15 N14 12 5.00 1
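I am trying to join them so that, where dists$sameCol == TRUE and both ID1 and ID2 match, the columns from networkDistances are added (all other rows should get NA). The result should look like this: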
label1 label2 dist sameCol ID1 ID2 colony value networkDist
193 194 0.7219847 NA N53 <NA> NA NA NA
193 195 0.5996300 FALSE N53 N43 NA NA NA
193 196 0.2038451 FALSE N5 N45 NA NA NA
194 195 0.2190454 NA <NA> N43 NA NA NA
194 196 0.8894645 NA <NA> N45 NA NA NA
195 196 0.7910169 TRUE N38 N5 10 0.05 1
I have tried the following, but none of them work: they also paste some information into rows where dists$sameCol == FALSE.
r <- left_join(dists, networkDistances, by = c("ID1" = "ID1", "ID2" = "ID2"))
r <- left_join(dists, networkDistances, by = c("ID1" = "ID1", "ID2" = "ID2")) %>%
  mutate(networkDist = case_when(sameCol %in% T ~ networkDist))
r <- dists %>%
  left_join(networkDistances, by = c("ID1", "ID2")) %>%
  mutate(networkDist = case_when(sameCol == T ~ networkDist))
Reply from a uj5u.com user:
Before merging, add a sameCol column containing only TRUE values to networkDistances and use it as an additional join key:
library(dplyr)
left_join(
  dists,
  mutate(networkDistances, sameCol = TRUE),
  by = c("ID1", "ID2", "sameCol")
)
# A tibble: 6 × 9
label1 label2 dist sameCol ID1 ID2 colony value networkDist
<dbl> <dbl> <dbl> <lgl> <chr> <chr> <dbl> <dbl> <dbl>
1 193 194 0.722 NA N53 <NA> NA NA NA
2 193 195 0.600 FALSE N53 N43 NA NA NA
3 193 196 0.204 FALSE N5 N45 NA NA NA
4 194 195 0.219 NA <NA> N43 NA NA NA
5 194 196 0.889 NA <NA> N45 NA NA NA
6 195 196 0.791 TRUE N38 N5 10 0.05 1
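The sample data in the question has no row where the IDs match but sameCol is FALSE, so the effect of the extra key is easy to miss. The following is a minimal sketch on hypothetical miniature data (dists_demo and net_demo are made up for illustration, not taken from the question) comparing a join on the IDs alone with a join that also uses sameCol:

```r
library(dplyr)

# Hypothetical data: the same ID pair occurs once with sameCol TRUE and once with FALSE.
dists_demo <- data.frame(
  sameCol = c(TRUE, FALSE, NA),
  ID1 = c("N38", "N38", "N53"),
  ID2 = c("N5", "N5", NA)
)
net_demo <- data.frame(
  ID1 = "N38", ID2 = "N5",
  colony = 10L, value = 0.05, networkDist = 1L
)

# Joining on the IDs alone also fills the row where sameCol is FALSE.
plain <- left_join(dists_demo, net_demo, by = c("ID1", "ID2"))

# With sameCol = TRUE added to the lookup table, only TRUE rows can match.
keyed <- left_join(
  dists_demo,
  mutate(net_demo, sameCol = TRUE),
  by = c("ID1", "ID2", "sameCol")
)

plain$networkDist  # 1 1 NA
keyed$networkDist  # 1 NA NA
```

Rows whose sameCol is FALSE or NA cannot match the constant TRUE key, so they keep NA in every column coming from the lookup table.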
Reply from a uj5u.com user:
r <- left_join(dists, networkDistances, by = c("ID1", "ID2"))
r[r$sameCol != TRUE | is.na(r$sameCol), c("colony", "value", "networkDist")] <- NA
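The first line performs the join (which, for your example, already produces the desired output). The second line sets those columns to NA for every row whose sameCol is not TRUE, including rows where sameCol is NA.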
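The same join-then-mask idea can probably be kept inside a single dplyr pipeline. This is a sketch only, assuming dplyr 1.0.0 or later (for across()) and using hypothetical demo data; `sameCol %in% TRUE` is used because it evaluates to FALSE for NA rather than NA:

```r
library(dplyr)

# Hypothetical demo data, not the data from the question.
dists_demo <- data.frame(
  sameCol = c(TRUE, FALSE, NA),
  ID1 = c("N38", "N38", "N53"),
  ID2 = c("N5", "N5", "N43")
)
net_demo <- data.frame(
  ID1 = "N38", ID2 = "N5",
  colony = 10L, value = 0.05, networkDist = 1L
)

r_demo <- dists_demo %>%
  left_join(net_demo, by = c("ID1", "ID2")) %>%
  # Blank out all three joined columns wherever sameCol is FALSE or NA.
  mutate(across(
    c(colony, value, networkDist),
    ~ replace(.x, !(sameCol %in% TRUE), NA)
  ))

r_demo$networkDist  # 1 NA NA
```

Masking all three joined columns, rather than networkDist alone, is what keeps colony and value from leaking into the FALSE rows.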
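Reply from a uj5u.com user:
Check the condition first, then merge. I added a few more rows to the example to show that the negative case (matching IDs, but sameCol not TRUE) is left out.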
dists
label1 label2 dist sameCol ID1 ID2
1 193 194 0.7219847 NA N53 <NA>
2 193 195 0.5996300 FALSE N53 N43
3 193 196 0.2038451 FALSE N5 N45
4 194 195 0.2190454 NA <NA> N43
5 194 196 0.8894645 NA <NA> N45
6 195 196 0.7910169 TRUE N38 N5
7 195 196 0.7910169 FALSE N38 N5
8 195 196 0.7910169 TRUE N36 N5
Get the subset
dists_flt <- dists[c(with(dists, which(!(ID1 %in% networkDistances$ID1 & ID2 %in% networkDistances$ID2))),
with(dists, which((ID1 %in% networkDistances$ID1 & ID2 %in% networkDistances$ID2 & sameCol == T)))),]
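Note that the subset above tests ID1 and ID2 membership separately, so a row can count as "matching" even when its two IDs never occur together in one row of networkDistances. If that matters for your data, a pairwise test may be safer. The following is a sketch on hypothetical data; pair_key is an illustrative helper, not part of any package, and since paste() turns NA into the string "NA", missing IDs would need separate handling:

```r
# Hypothetical lookup table and query rows, made up for illustration.
net_demo <- data.frame(ID1 = c("N4", "N15"), ID2 = c("N15", "N14"))
dists_demo <- data.frame(
  sameCol = c(FALSE, TRUE),
  ID1 = c("N4", "N4"),
  ID2 = c("N14", "N15")
)

# Separate membership: N4 occurs in ID1 and N14 in ID2, so row 1 looks like a match.
separate <- with(dists_demo, ID1 %in% net_demo$ID1 & ID2 %in% net_demo$ID2)

# Pairwise membership: build one key per (ID1, ID2) pair and compare the keys.
pair_key <- function(a, b) paste(a, b, sep = "\r")
pairwise <- pair_key(dists_demo$ID1, dists_demo$ID2) %in%
  pair_key(net_demo$ID1, net_demo$ID2)

# Keep rows with no matching pair, plus matching pairs whose sameCol is TRUE.
keep <- !pairwise | (pairwise & dists_demo$sameCol %in% TRUE)

separate  # TRUE TRUE
pairwise  # FALSE TRUE
keep      # TRUE TRUE
```

With the separate test, row 1 (N4, N14, sameCol FALSE) would be treated as a match and dropped; with the pairwise test it is kept and simply receives NA after the merge.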
With base R
merge(dists_flt, networkDistances, c("ID1", "ID2"), all.x = T)
ID1 ID2 label1 label2 dist sameCol colony value networkDist
1 <NA> N43 194 195 0.2190454 NA NA NA NA
2 <NA> N45 194 196 0.8894645 NA NA NA NA
3 N36 N5 195 196 0.7910169 TRUE 10 0.03 1
4 N38 N5 195 196 0.7910169 TRUE 10 0.05 1
5 N5 N45 193 196 0.2038451 FALSE NA NA NA
6 N53 <NA> 193 194 0.7219847 NA NA NA NA
7 N53 N43 193 195 0.5996300 FALSE NA NA NA
Or with dplyr
library(dplyr)
left_join(dists_flt, networkDistances, c("ID1", "ID2"))
label1 label2 dist sameCol ID1 ID2 colony value networkDist
1 193 194 0.7219847 NA N53 <NA> NA NA NA
2 193 195 0.5996300 FALSE N53 N43 NA NA NA
3 193 196 0.2038451 FALSE N5 N45 NA NA NA
4 194 195 0.2190454 NA <NA> N43 NA NA NA
5 194 196 0.8894645 NA <NA> N45 NA NA NA
6 195 196 0.7910169 TRUE N38 N5 10 0.05 1
7 195 196 0.7910169 TRUE N36 N5 10 0.03 1
Extended data
dists <- structure(list(label1 = c(193L, 193L, 193L, 194L, 194L, 195L,
195L, 195L), label2 = c(194L, 195L, 196L, 195L, 196L, 196L, 196L,
196L), dist = c(0.7219847, 0.59963, 0.2038451, 0.2190454, 0.8894645,
0.7910169, 0.7910169, 0.7910169), sameCol = c(NA, FALSE, FALSE,
NA, NA, TRUE, FALSE, TRUE), ID1 = c("N53", "N53", "N5", "<NA>",
"<NA>", "N38", "N38", "N36"), ID2 = c("<NA>", "N43", "N45", "N43",
"N45", "N5", "N5", "N5")), row.names = c(NA, 8L), class = "data.frame")
networkDistances <- structure(list(ID1 = c("N38", "N36", "N4", "N4", "N4", "N15"),
ID2 = c("N5", "N5", "N3", "N5", "N15", "N14"), colony = c(10L,
10L, 12L, 12L, 12L, 12L), value = c(0.05, 0.03, 10, 10, 5,
5), networkDist = c(1L, 1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-6L))
