I have this table:
library(rvest)
library(tidyverse)
tables_team_pl <- read_html('https://www.win-or-lose.com/football-team-colours/')
color_table <- tables_team_pl %>% html_table() %>% pluck(1) %>% select(-Away)
And this one:
table_1 <- structure(list(Team = c("Arsenal", "Aston Villa", "Blackburn",
"Bolton", "Chelsea", "Everton", "Fulham", "Liverpool", "Manchester City",
"Manchester Utd", "Newcastle Utd", "Norwich City", "QPR", "Stoke City",
"Sunderland", "Swansea City", "Tottenham", "West Brom", "Wigan Athletic",
"Wolves")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-20L))
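As you can see, the names in the second table are incomplete. For example, Manchester Utd should be Manchester United, as in the first table.
So I just need to complete the second table using the matching names from the first table.
That is, I want to correct table_1: Manchester Utd should become Manchester United, Blackburn should become Blackburn Rovers, and so on. The full names should come from the first table.
The second table also has QPR, which should be "Queens Park Rangers".
Any help?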
A uj5u.com user replied:
We can use a stringdist join from the fuzzyjoin package:
library(fuzzyjoin)
library(dplyr)
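# Fuzzy-join on Team; "soundex" matches names with the same phonetic code
stringdist_left_join(table_1, color_table, by = "Team", method = "soundex") %>%
  # Prefer the full name from color_table, fall back to the original name
  transmute(Team = coalesce(Team.y, Team.x)) %>%
  distinct()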
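One caveat worth checking on the real data: Soundex keeps only the first letter plus the next three consonant codes, so names that share a long prefix (for example the two Manchester clubs) may be matched to each other, and an abbreviation such as QPR will probably not match "Queens Park Rangers" at all. A minimal sketch of a manual fallback follows, assuming a small hand-written alias table. The names `aliases`, `short_names`, and `Full` are hypothetical and not part of the original answer; the full names are the ones given in the question.

```r
library(dplyr)

# Hypothetical alias table (an assumption, written by hand): short names
# that the fuzzy join may miss or confuse, mapped to the full names
# mentioned in the question.
aliases <- tibble(
  Team = c("QPR", "Manchester Utd", "Blackburn"),
  Full = c("Queens Park Rangers", "Manchester United", "Blackburn Rovers")
)

# Small stand-in for table_1 so the sketch runs on its own.
short_names <- tibble(Team = c("Arsenal", "QPR", "Manchester Utd", "Blackburn"))

fixed <- short_names %>%
  # Exact join on the short name; rows without an alias get NA in Full.
  left_join(aliases, by = "Team") %>%
  # Use the hand-written full name when there is one, else keep the original.
  transmute(Team = coalesce(Full, Team))

print(fixed)
```

The same `left_join()` plus `coalesce()` step could be applied to table_1 before or after the fuzzy join, so that only the names not covered by the alias table rely on Soundex.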
