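I want to extract, from a column, the names that match entries in a large character vector. In some cases the extracted string is incomplete because of the spaces.
Here is a reproducible example: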
library(stringr)
library(dplyr)
library(tidyr)
library(stringi)
data <- data.frame(address = c("to New York street", "to New cafe", "to Paris avenue", "to London hostel"))
search_string <- c("London", "Paris", "New", "New York") %>% paste(collapse = " |to ")
data %>% dplyr::mutate(temp_com = str_extract_all(paste(address), search_string))
This is the result:
             address temp_com
1 to New York street   to New
2        to New cafe   to New
3    to Paris avenue to Paris
4   to London hostel   London
This is what I want:
             address    temp_com
1 to New York street to New York
2        to New cafe      to New
3    to Paris avenue    to Paris
4   to London hostel      London
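Thank you very much for your help.
Answer from a uj5u.com user:
Change the order of the search strings so they run from longest to shortest. Regex alternation tries the alternatives from left to right and stops at the first one that matches, so "to New" wins over "to New York" whenever it appears earlier in the pattern. (Also, I infer that you intended to have "to " before your first search string; it is omitted in your current example.)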
search_string <- c("London","Paris", "New", "New York")
search_string <- paste(paste("to", search_string[order(-nchar(search_string))]), collapse = "|")
search_string
# [1] "to New York|to London|to Paris|to New"
data %>%
dplyr::mutate(temp_com = str_extract_all(paste(address), search_string))
#              address    temp_com
# 1 to New York street to New York
# 2        to New cafe      to New
# 3    to Paris avenue    to Paris
# 4   to London hostel   to London
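To see the ordering effect in isolation, here is a minimal sketch that goes slightly beyond the answer above. The helper name build_pattern and the trailing \\b word boundary are my own additions for illustration, not part of the original answer. It also uses str_extract rather than str_extract_all, on the assumption that at most one name is wanted per address, which gives a plain character vector instead of a list column.

```r
library(stringr)

addresses <- c("to New York street", "to New cafe", "to Paris avenue", "to London hostel")
places <- c("London", "Paris", "New", "New York")

# Alternation is tried left to right, so the shorter alternative wins
# whenever it is listed first.
short_first <- str_extract("to New York street", "to New|to New York")
long_first  <- str_extract("to New York street", "to New York|to New")

# Hypothetical helper (not from the original answer): sort the names from
# longest to shortest, share a single "to " prefix, and add a trailing \\b
# so a name is not matched inside a longer word.
build_pattern <- function(place_names) {
  ordered <- place_names[order(-nchar(place_names))]
  paste0("to (?:", paste(ordered, collapse = "|"), ")\\b")
}

pattern <- build_pattern(places)
result <- str_extract(addresses, pattern)
```

Here short_first should come back as "to New" and long_first as "to New York", which is the behaviour the sorting step works around. Names that contain regex metacharacters (a dot or parentheses, for example) would need to be escaped before being pasted into the pattern; the sketch assumes plain alphabetic names.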
Tags: r
