I have a large list containing terms extracted from a corpus.
mylist <- list(c("flower"),
               c("plant", "animal", "cats", "doggy"),
               c("tree", "trees", "cat", "dog"))
The extracted terms come from a data frame, where each appears as a main word or a similar word and has a category:
ref <- data.frame(id = c(1:5),
                  main = c("tree", "plant", "flower", "dog", "cat"),
                  similar = c("trees", "plantlike", "flowery", "doggy", "cats"),
                  category = c("plant", "plant", "plant", "animal", "animal"))
I need to change the list so that it holds the categories instead of the words, and ideally drop the duplicates as well, like this:
needed <- list("plant",
               c("plant", "animal", "animal", "animal"),
               c("plant", "plant", "animal", "animal"))
orbetter <- list("plant",
                 c("plant", "animal"),
                 c("plant", "animal"))
But I don't know how to apply this to each element of the list. I'd appreciate your help.
Answer from a uj5u.com user:
mylist <- list(c("flower"),
c("plant", "animal", "cats", "doggy"),
c("tree", "trees", "cat", "dog"))
ref <- data.frame(id = c(1:5),
main = c("tree", "plant", "flower", "dog", "cat"),
similar = c("trees","plantlike", "flowery", "doggy", "cats"),
category = c("plant", "plant", "plant", "animal", "animal"))
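library(tidyr)  # provides pivot_longer() and re-exports the %>% pipe
# Stack `main` and `similar` into a single `value` column, keeping `category` alongside
ref_long <- ref %>%
  pivot_longer(-c(id, category))
# Look up the category of each term, then drop duplicates within each list element
lapply(mylist, function(x) unique(ref_long$category[match(x, table = ref_long$value)]))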
#> [[1]]
#> [1] "plant"
#>
#> [[2]]
#> [1] "plant" NA "animal"
#>
#> [[3]]
#> [1] "plant" "animal"
Created on 2022-01-14 by the reprex package (v2.0.1)
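Note that the second element of the output above contains an NA: "animal" is itself a category name, not a term listed in `main` or `similar`, so `match()` finds nothing for it. A minimal base R sketch that works around this is shown below. It assumes that a term which is already a category name should map to itself, and that terms with no match at all should be dropped. Neither assumption is stated in the question, and `lookup` and `to_categories` are hypothetical names introduced only for illustration.

```r
# Same example data as in the question
mylist <- list(c("flower"),
               c("plant", "animal", "cats", "doggy"),
               c("tree", "trees", "cat", "dog"))
ref <- data.frame(id = 1:5,
                  main = c("tree", "plant", "flower", "dog", "cat"),
                  similar = c("trees", "plantlike", "flowery", "doggy", "cats"),
                  category = c("plant", "plant", "plant", "animal", "animal"),
                  stringsAsFactors = FALSE)

# Named character vector: names are terms, values are their categories.
# Assumption: category names are also valid terms and map to themselves.
cats <- unique(ref$category)
lookup <- c(setNames(ref$category, ref$main),
            setNames(ref$category, ref$similar),
            setNames(cats, cats))

# Hypothetical helper: translate terms to categories,
# drop terms with no match (NA), and remove duplicates.
to_categories <- function(terms) {
  out <- unname(lookup[terms])
  unique(out[!is.na(out)])
}

result <- lapply(mylist, to_categories)
print(result)
```

With the example data this should reproduce the `orbetter` list from the question. Removing the `unique()` call in the helper keeps the duplicates, which corresponds to the `needed` variant.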
