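I want to join these two data frames, but with specific requirements. The inner join must match on the letter column (two in a, three in b) and on the date column, but the dates are not equal: the date in data.frame a must come before the matched date in data.frame b. The result I want is shown below as well. Can this be done in R?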
a <- data.frame(
  one = as.Date(c("2020-08-24", "2020-08-27", "2020-08-31", "2020-09-01")),
  two = c("a", "b", "b", "a")
)
b <- data.frame(
  two   = as.Date(c("2020-08-25", "2020-08-30", "2020-09-05", "2020-09-11")),
  three = c("a", "b", "a", "b")
)
# desired result
result <- data.frame(
  one   = as.Date(c("2020-08-24", "2020-08-27", "2020-08-31", "2020-09-01")),
  two   = c("a", "b", "b", "a"),
  three = as.Date(c("2020-08-25", "2020-08-30", "2020-09-11", "2020-09-05"))
)
Answer from a uj5u.com user:
With dplyr:
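library(dplyr)

# Pair each row of a with every row of b that has the same letter (a$two == b$three).
# b's date column shows up as two.y because a already has a column named two.
left_join(a, b, by = c("two" = "three")) %>%
  filter(two.y > one) %>%   # keep only b dates strictly after the a date
  group_by(one) %>%         # one group per a date
  slice_min(two.y)          # within each group, keep the earliest remaining b date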
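To make the logic of that pipeline explicit, here is a minimal base R sketch that needs no packages: for each row of a it picks the earliest date in b that has the same letter and is strictly later. The names nearest_later and check are made up for this illustration and are not part of the original answer.

```r
# Same sample data as in the question.
a <- data.frame(
  one = as.Date(c("2020-08-24", "2020-08-27", "2020-08-31", "2020-09-01")),
  two = c("a", "b", "b", "a")
)
b <- data.frame(
  two   = as.Date(c("2020-08-25", "2020-08-30", "2020-09-05", "2020-09-11")),
  three = c("a", "b", "a", "b")
)

# nearest_later is a hypothetical helper vector, pre-filled with NA dates.
nearest_later <- rep(as.Date(NA), nrow(a))
for (i in seq_len(nrow(a))) {
  # Dates in b that share the letter and fall strictly after a$one[i].
  candidates <- b$two[b$three == a$two[i] & b$two > a$one[i]]
  if (length(candidates) > 0) {
    nearest_later[i] <- min(candidates)
  }
}

check <- a
check$three <- nearest_later
print(check)
```

One difference to keep in mind: this loop leaves NA for a row of a that has no later date in b, whereas the filter() step in the pipeline drops such a row entirely.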
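If you use the development version of dplyr (join_by() has since been released in dplyr 1.1.0), there is a more direct answer, using join_by() in left_join():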
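# Skip the install line if your installed dplyr already provides join_by().
devtools::install_github("tidyverse/dplyr")
library(dplyr)

# closest() keeps, for each row of a, the nearest b date on or after one (use < for strictly after).
left_join(a, b, join_by(closest(one <= two), two == three))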
Output of the first pipeline:
# A tibble: 4 × 3
# Groups:   one [4]
  one        two   two.y
  <date>     <chr> <date>
1 2020-08-24 a     2020-08-25
2 2020-08-27 b     2020-08-30
3 2020-08-31 b     2020-09-11
4 2020-09-01 a     2020-09-05
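Since the question asks for an inner join with the date in a strictly before the date in b, a variant along the following lines may be closer to the stated requirement. This is only a sketch, assuming dplyr 1.1.0 or later. The names b2, date_b and out are invented here so that the two columns called two do not collide.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, where join_by() and closest() exist

a <- data.frame(
  one = as.Date(c("2020-08-24", "2020-08-27", "2020-08-31", "2020-09-01")),
  two = c("a", "b", "b", "a")
)
b <- data.frame(
  two   = as.Date(c("2020-08-25", "2020-08-30", "2020-09-05", "2020-09-11")),
  three = c("a", "b", "a", "b")
)

# Rename b's date column first; date_b is a hypothetical name for this sketch.
b2 <- rename(b, date_b = two)

# Match on the letter, then keep only the nearest b date strictly after a$one.
# inner_join() drops rows of a that have no later date in b.
out <- inner_join(
  a, b2,
  join_by(two == three, closest(one < date_b))
)
print(out)
```

With left_join() in place of inner_join(), rows of a without a later match would be kept, with NA in date_b.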
