R：基于“OR”陳述句的連接-有解無憂

我正在使用 R 編程語言。假設我有以下兩個表：

table_1 = data.frame(id = c("123", "123", "125", "125"), id2 = c("11", "12", "14", "13"),
date_1 = c("2010-01-31","2010-01-31", "2015-01-31", "2018-01-31" ))

table_1$id = as.factor(table_1$id)
table_1$id2 = as.factor(table_1$id2)
table_1$date_1 = as.factor(table_1$date_1)

table_2 = data.frame(id = c("123", "123", "125", "125"), id2 = c("111", "112", "14", "113"),
date_2 = c("2009-01-31","2010-01-31", "2010-01-31", "2010-01-31" ),
date_3 = c("2011-01-31","2010-01-31", "2020-01-31", "2020-01-31" ))


table_2$id = as.factor(table_2$id)
table_2$id2 = as.factor(table_2$id2)
table_2$date_2 = as.factor(table_2$date_2)
table_2$date_3 = as.factor(table_2$date_3)

如果滿足以下兩個條件中的任何一個（即，如果 Condition_1 = TRUE OR Condition_2 = TRUE，則“加入” )

條件_1

如果 table_1$id = table_2$id

和

如果 table_1$date BETWEEN(table_2$date_2,table_2$date_3)

條件_2

如果 table_1$id2 = table_2$id2

和

如果 table_1$date BETWEEN(table_2$date_2,table_2$date_3)

我已經嘗試過的：我知道如何分別執行這兩個連接，例如：

library(sqldf)

#Condition_1


final = sqldf("select a.*, b.*
           
       from table_1 a left join table_2 b
       on a.id = b.id and 
          a.date_1 between 
              b.date_2 and
              b.date_3") 


#Condition_2


final_2 = sqldf("select a.*, b.*
           
       from table_1 a left join table_2 b
       on a.id2 = b.id2 and 
          a.date_1 between 
              b.date_2 and
              b.date_3")

然后我可以將這些檔案（“final_3”）“系結”在一起（并洗掉完全重復的行）：

final_3 = rbind(final, final_2)

final_3 = final_3[!duplicated(final_3[c(1,2,3,4,5,6,7)]),]

我的問題：有沒有辦法在一個步驟中將這兩個表合并在一起，而不是兩個單獨的步驟？這可以使用 Base R 或 DPLYR 來完成嗎？

謝謝！

uj5u.com熱心網友回復：

這可以在單個 SQL 陳述句中完成，如下所示。

library(sqldf)

sqldf("select distinct * 
  from table_1 a left join table_2 b
  on (a.date_1 between b.date_2 and b.date_3) and 
     (a.id = b.id or a.id2 = b.id2)")

轉載請註明出處，本文鏈接：https://www.uj5u.com/yidong/374313.html

標籤：sql r 加入 dplyr 数据处理

上一篇：在R中，當ID列存在但不相關時如何應用VLOOKUP？

下一篇：R：使用BaseR合并“Between”