Suppose I have a data table dt.recipes with a column that holds lists of various items, for example:
recipe_id  ingredients
1          apple, banana, cucumber, water
2          apple, meat, water
3          water
How can I create a table that counts how often each unique item occurs in dt.recipes$ingredients? In other words, I am looking for a result similar to this:
ingredient  count
water       3
apple       2
banana      1
cucumber    1
meat        1
Any pointers would be much appreciated. Thanks in advance!
uj5u.com user replied:
You can do:
as.data.frame(table(unlist(strsplit(df$ingredients, ", "))))
#> Var1 Freq
#> 1 apple 2
#> 2 banana 1
#> 3 cucumber 1
#> 4 meat 1
#> 5 water 3
Data
df <- structure(list(recipe_id = 1:3,
ingredients = c("apple, banana, cucumber, water",
"apple, meat, water",
"water")),
class = "data.frame", row.names = c(NA, -3L))
df
#> recipe_id ingredients
#> 1 1 apple, banana, cucumber, water
#> 2 2 apple, meat, water
#> 3 3 water
Created on 2022-03-07 by the reprex package (v2.0.1)
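The result above is ordered alphabetically and uses the default column names Var1 and Freq. Below is a minimal sketch of sorting by frequency and renaming the columns to match the table asked for in the question. It assumes the same df as in the Data section. The column names `ingredient` and `count` come from the question; `counts` and `result` are illustrative names only.

```r
# Same example data as in the Data section above
df <- data.frame(
  recipe_id = 1:3,
  ingredients = c("apple, banana, cucumber, water",
                  "apple, meat, water",
                  "water"),
  stringsAsFactors = FALSE
)

# Split each string on a comma followed by optional whitespace, then tabulate
counts <- table(unlist(strsplit(df$ingredients, ",\\s*")))

# Sort by frequency (most common first) and rename the default columns
result <- as.data.frame(sort(counts, decreasing = TRUE),
                        stringsAsFactors = FALSE)
names(result) <- c("ingredient", "count")
result
```

Splitting on `",\\s*"` rather than `", "` is a small design choice: it also tolerates entries written without a space after the comma.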
uj5u.com user replied:
With tidyverse functions:
library(tidyverse)
df %>%
separate_rows(ingredients) %>%
count(ingredients, name = "count") %>%
arrange(desc(count))
# A tibble: 5 x 2
# ingredients count
# <chr> <int>
#1 water 3
#2 apple 2
#3 banana 1
#4 cucumber 1
#5 meat 1
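One caveat: when `sep` is not given, `separate_rows()` splits on any run of non-alphanumeric characters (its default is `sep = "[^[:alnum:].]+"`), so a multi-word ingredient such as "olive oil" would be broken into two rows. The sketch below passes an explicit separator. The data and the names `recipes` and `ingredient_counts` are hypothetical and only there to show the difference.

```r
library(dplyr)
library(tidyr)

# Hypothetical data containing a multi-word ingredient
recipes <- tibble(
  recipe_id = 1:2,
  ingredients = c("olive oil, water", "olive oil, salt")
)

ingredient_counts <- recipes %>%
  # Split only on commas (plus any following whitespace), not on spaces
  separate_rows(ingredients, sep = ",\\s*") %>%
  # sort = TRUE orders the result by descending count
  count(ingredients, name = "count", sort = TRUE)

ingredient_counts
```

With the explicit separator, "olive oil" stays a single item and is counted twice.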
uj5u.com user replied:
A data.table approach could be
library(data.table)
dt[, .(table(unlist(ingredients)))]
# V1 N
#1: apple 2
#2: banana 1
#3: cucumber 1
#4: meat 1
#5: water 3
Data
dt <- data.table(
"recipe_id" = 1:3,
"ingredients" = list(
c("apple", "banana", "cucumber", "water"),
c("apple", "meat", "water"),
c("water")
)
)
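Note that this answer stores ingredients as a list column, whereas the other answers use a single comma-separated string per recipe. If the column is a character vector instead, a sketch along the following lines may work. The names `dt_chr` and `ingredient_counts` are illustrative, and the ", " separator is an assumption taken from the example data.

```r
library(data.table)

# Assumed variant of the data: one comma-separated string per recipe
dt_chr <- data.table(
  recipe_id = 1:3,
  ingredients = c("apple, banana, cucumber, water",
                  "apple, meat, water",
                  "water")
)

ingredient_counts <- dt_chr[
  # Split every string and flatten into one long column
  , .(ingredient = unlist(strsplit(ingredients, ", ", fixed = TRUE)))
][
  # Count rows per ingredient
  , .(count = .N), by = ingredient
][
  # Most frequent first
  order(-count)
]

ingredient_counts
```

Counting with `.N` by group keeps everything inside data.table and gives the `count` column its name directly, instead of the default V1 and N.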
uj5u.com user replied:
A base R option with scan, table, and as.data.frame
> with(df, as.data.frame(table(trimws(scan(text = ingredients, what = "", sep = ",", quiet = TRUE)))))
Var1 Freq
1 apple 2
2 banana 1
3 cucumber 1
4 meat 1
5 water 3
