I have a question.
I have this data frame:
#> # A tibble: 4 × 3
#> family name items
#> <chr> <chr> <list>
#> 1 Kelly Mark book, ring, necklace
#> 2 Kelly Scott axe, camera, watch
#> 3 Quin Tegan book, camera, watch
#> 4 Quin Sara sword, fork, book
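How can I count every item across the lists in this data frame to get totals, like this: count(book) = 3, count(camera) = 2, and so on?
Should I widen all the items into new columns? I'm sorry if my question is too basic; I'm really new to data wrangling.
Thank you.
#My Approach I tried pivot longer, but I ended up with too many columns. The lists contain hundreds of values, and handling data that large is giving me trouble. I haven't tried any other solutions yet.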
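Answer from a uj5u.com user:
In your question you mention that you don't want to widen the data because that would create a large number of columns. One alternative is to put the counts into a list: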
count <- as.list(table(unlist(df$items)))
count$book
[1] 3
Note: this is the count across all rows, which is what your post suggests you are looking for.
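If a table of counts is easier to work with than a list, the same base R idea can be sketched as follows. This is only an illustration: the data frame is rebuilt here under the assumption that `items` is a list-column of character vectors, and the names `counts`, `sorted`, and `counts_df` are made up for the example.

```r
# Assumed reconstruction of the question's data; `items` is a list-column
df <- data.frame(
  family = c("Kelly", "Kelly", "Quin", "Quin"),
  name   = c("Mark", "Scott", "Tegan", "Sara")
)
df$items <- list(
  c("book", "ring", "necklace"),
  c("axe", "camera", "watch"),
  c("book", "camera", "watch"),
  c("sword", "fork", "book")
)

# Flatten every list into one character vector and tabulate it
counts <- table(unlist(df$items))

# Most frequent items first
sorted <- sort(counts, decreasing = TRUE)

# Turn the table into a two-column data frame (item, count)
counts_df <- as.data.frame(sorted, stringsAsFactors = FALSE)
names(counts_df) <- c("item", "count")
counts_df
```

This stays long (one row per distinct item) rather than wide, so hundreds of distinct values should only add rows, not columns.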
Answer from a uj5u.com user:
Maybe something like this:
df %>%
  unnest(items) %>%
  unnest(items) %>%
  count(items, name = "count")
items count
<chr> <int>
1 axe 1
2 book 3
3 camera 2
4 fork 1
5 necklace 1
6 ring 1
7 sword 1
8 watch 2
Answer from a uj5u.com user:
library(dplyr)
library(tidyr)
# Here `items` is a comma-separated character column, not a list-column
df <- tibble(
  family = c("Kelly", "Kelly", "Quin", "Quin"),
  name = c("Mark", "Scott", "Tegan", "Sara"),
  items = c("book, ring, necklace",
            "axe, camera, watch",
            "book, camera, watch",
            "sword, fork, book"))
# Split each string into three columns, stack them, then count each item
df %>%
  separate(items, into = c("i1", "i2", "i3")) %>%
  pivot_longer(cols = i1:i3, names_to = "item_order", values_to = "item") %>%
  count(item, sort = TRUE)
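The `separate()` call above hard-codes three columns, which may not scale to lists holding hundreds of values. One possible variation, assuming the same comma-separated character column, is `tidyr::separate_rows()`, which splits straight into rows however many items each string holds. The result name `item_counts` is just for illustration.

```r
library(dplyr)
library(tidyr)

# Same assumed data as above: `items` is a comma-separated string
df <- tibble(
  family = c("Kelly", "Kelly", "Quin", "Quin"),
  name   = c("Mark", "Scott", "Tegan", "Sara"),
  items  = c("book, ring, necklace",
             "axe, camera, watch",
             "book, camera, watch",
             "sword, fork, book")
)

# Split on a comma plus optional whitespace, one row per item, then count
item_counts <- df %>%
  separate_rows(items, sep = ",\\s*") %>%
  count(items, sort = TRUE, name = "count")

item_counts
```

Because no intermediate `i1`, `i2`, ... columns are created, nothing has to be known in advance about the longest list.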
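Answer from a uj5u.com user:
The other answers so far use a character "items" column, whereas the poster specified a list-column. A list-column can be unnested with a tidyr function and then counted as follows: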
library(dplyr)
df <- tibble::tribble(
  ~family, ~name, ~items,
  "Kelly", "Mark", list("book", "ring", "necklace"),
  "Kelly", "Scott", list("axe", "camera", "watch"),
  "Quin", "Tegan", list("book", "camera", "watch"),
  "Quin", "Sara", list("sword", "fork", "book")
)
df %>%
  tidyr::unnest_longer(items) %>%
  count(items)
#> # A tibble: 8 × 2
#> items n
#> <chr> <int>
#> 1 axe 1
#> 2 book 3
#> 3 camera 2
#> 4 fork 1
#> 5 necklace 1
#> 6 ring 1
#> 7 sword 1
#> 8 watch 2
Created on 2022-10-26 with reprex v2.0.2
Tags: r, dataframe
