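I have a table of mostly categorical values, and I want to keep only the rows that have the most common values in a particular column. I am trying to use slice_max(), but it isn't working as I expected. I did find older advice on how to do this in base R or with the deprecated top_n(), but the top_n() documentation says to use slice_max() instead, and I can't find detailed information on how slice_max() works.
I'll use the starwars dataset as an example. The two most common homeworlds are Naboo, which appears 11 times, and Tatooine, which appears 10 times. So I want code that says "show me all rows for the two most common homeworlds", and I expect it to give me a 21-row tibble in which every homeworld is either Naboo or Tatooine.
I added a column called "worldcount" that simply counts the occurrences of each homeworld, so I can easily see how many times each one appears. I also selected only a few columns to keep things simple: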
starwars %>%
select(name, sex, homeworld, species) %>%
filter(!is.na(homeworld)) %>%
add_count(homeworld, name="worldcount") %>%
slice_max(worldcount, n=2)
# A tibble: 11 × 5
name sex homeworld species worldcount
<chr> <chr> <chr> <chr> <int>
1 R2-D2 none Naboo Droid 11
2 Palpatine male Naboo Human 11
3 Jar Jar Binks male Naboo Gungan 11
4 Roos Tarpals male Naboo Gungan 11
5 Rugor Nass male Naboo Gungan 11
6 Ric Olié NA Naboo NA 11
7 Quarsh Panaka NA Naboo NA 11
8 Gregar Typho male Naboo Human 11
9 Cordé female Naboo Human 11
10 Dormé female Naboo Human 11
11 Padmé Amidala female Naboo Human 11
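But this code returns only the rows where the homeworld is Naboo. When I set n=2 in slice_max(), I expected the top 2 homeworlds, so why isn't Tatooine here?
I also tried using slice_max() directly on the column containing the categorical data, but I think it may be computing the "maximum" alphabetically, because it returns the two homeworlds that come last in alphabetical order: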
starwars %>%
select(name, sex, homeworld, species) %>%
filter(!is.na(homeworld)) %>%
slice_max(homeworld, n=2)
# A tibble: 2 × 4
name sex homeworld species
<chr> <chr> <chr> <chr>
1 Zam Wesell female Zolan Clawdite
2 Dud Bolt male Vulpter Vulptereen
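Finally, I tried using slice_max() on numeric data that already exists in the starwars dataset, but that doesn't work the way I expected either.
If I ask for the 8 greatest heights, I get what I expect: 9 rows, because two Star Wars characters share the same height: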
starwars %>%
select(name, height) %>%
slice_max(height, n=8)
# A tibble: 9 × 2
name height
<chr> <int>
1 Yarael Poof 264
2 Tarfful 234
3 Lama Su 229
4 Chewbacca 228
5 Roos Tarpals 224
6 Grievous 216
7 Taun We 213
8 Rugor Nass 206
9 Tion Medon 206
So if I set n=9 and ask for the top 9 heights, I should get 10 rows for different characters, right? But no, this produces exactly the same result:
starwars %>%
select(name, height) %>%
slice_max(height, n=9)
# A tibble: 9 × 2
name height
<chr> <int>
1 Yarael Poof 264
2 Tarfful 234
3 Lama Su 229
4 Chewbacca 228
5 Roos Tarpals 224
6 Grievous 216
7 Taun We 213
8 Rugor Nass 206
9 Tion Medon 206
So am I misunderstanding how slice_max() works?
Or is there a different way to get only the rows for the two most common homeworlds?
Reply from a uj5u.com user:
starwars %>%
count(homeworld, sort = TRUE) %>%
slice(1:2) %>%
left_join(starwars)
Result
Joining, by = "homeworld"
# A tibble: 21 x 15
homeworld n name height mass hair_color skin_color eye_color birth_year sex gender species films vehicles starships
<chr> <int> <chr> <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <list> <list> <list>
1 Naboo 11 R2-D2 96 32 NA white, blue red 33 none masculine Droid <chr [7… <chr [0… <chr [0]>
2 Naboo 11 Palpatine 170 75 grey pale yellow 82 male masculine Human <chr [5… <chr [0… <chr [0]>
3 Naboo 11 Jar Jar Bin… 196 66 none orange orange 52 male masculine Gungan <chr [2… <chr [0… <chr [0]>
4 Naboo 11 Roos Tarpals 224 82 none grey orange NA male masculine Gungan <chr [1… <chr [0… <chr [0]>
5 Naboo 11 Rugor Nass 206 NA none green orange NA male masculine Gungan <chr [1… <chr [0… <chr [0]>
6 Naboo 11 Ric Olié 183 NA brown fair blue NA NA NA NA <chr [1… <chr [0… <chr [1]>
7 Naboo 11 Quarsh Pana… 183 NA black dark brown 62 NA NA NA <chr [1… <chr [0… <chr [0]>
8 Naboo 11 Gregar Typho 185 85 black dark brown NA male masculine Human <chr [1… <chr [0… <chr [1]>
9 Naboo 11 Cordé 157 NA brown light brown NA female feminine Human <chr [1… <chr [0… <chr [0]>
10 Naboo 11 Dormé 165 NA brown light brown NA female feminine Human <chr [1… <chr [0… <chr [0]>
# … with 11 more rows
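A variant of the same idea, offered as a sketch rather than as the answerer's method: a filtering join keeps the original column layout and needs no join message. The name top_worlds is introduced here for illustration, and missing homeworlds are dropped first so that NA cannot end up among the top values.

```r
library(dplyr)

# top_worlds (hypothetical name): the two most frequent homeworlds
top_worlds <- starwars %>%
  filter(!is.na(homeworld)) %>%      # drop missing homeworlds before counting
  count(homeworld, sort = TRUE) %>%  # one row per homeworld, most frequent first
  slice_head(n = 2)                  # first two rows of the sorted counts

# semi_join() keeps the rows of starwars whose homeworld appears in top_worlds,
# without adding the count column to the result
result <- starwars %>%
  semi_join(top_worlds, by = "homeworld")

result
```

Because semi_join() only filters, result has the same columns as starwars; use inner_join() instead if the count column should be carried along.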
Reply from a uj5u.com user:
Something like this?
starwars %>%
select(name, sex, homeworld, species) %>%
filter(!is.na(homeworld)) %>%
count(homeworld, name="worldcount", sort = TRUE) %>%
slice_max(n=2, order_by = worldcount, with_ties = FALSE)
homeworld worldcount
<chr> <int>
1 Naboo 11
2 Tatooine 10
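The with_ties argument is what separates this result from the one in the question. The sketch below uses made-up data (the tibble toy and its values are invented for illustration) to show why slice_max() on a per-row count column returns only the rows of the single most common value: every one of those rows is tied for first place.

```r
library(dplyr)

# Invented data: world "A" appears 3 times, "B" twice, "C" once
toy <- tibble(world = c("A", "A", "A", "B", "B", "C")) %>%
  add_count(world, name = "worldcount")

# Default with_ties = TRUE: the two largest values are both 3 (rows of "A"),
# and every row tied with them is kept, so all three "A" rows come back
with_ties_rows <- toy %>% slice_max(worldcount, n = 2)

# with_ties = FALSE: exactly two rows, both still from world "A"
no_ties_rows <- toy %>% slice_max(worldcount, n = 2, with_ties = FALSE)

nrow(with_ties_rows)  # 3
nrow(no_ties_rows)    # 2
```

Neither call reaches world "B", which is why the counting step has to collapse the data to one row per homeworld (as count() does above) before slice_max() can pick the top two.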
Reply from a uj5u.com user:
starwars %>%
select(name, sex, homeworld, species) %>%
filter(!is.na(homeworld)) %>%
group_by(homeworld) %>%
count(name="world") %>%
arrange(desc(world)) %>%
ungroup() %>%
slice_max(world, n=2)
Reply from a uj5u.com user:
slice_max() selects the rows with the largest values of a column; its n counts rows, not distinct homeworlds. Try this:
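# head(..., 10) keeps the ten most common homeworlds; use 2 for just the top two.
# table() drops NA by default, so missing homeworlds are not counted.
out <- starwars %>%
  filter(
    homeworld %in% head(names(sort(table(homeworld), decreasing = TRUE)), 10)
  )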
out
# # A tibble: 39 x 14
# name height mass hair_color skin_color eye_color birth_year sex gender homeworld species films vehicles starships
# <chr> <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <lis> <list> <list>
# 1 Luke Skywalker 172 77 blond fair blue 19 male mascu~ Tatooine Human <chr~ <chr [2~ <chr [2]>
# 2 C-3PO 167 75 NA gold yellow 112 none mascu~ Tatooine Droid <chr~ <chr [0~ <chr [0]>
# 3 R2-D2 96 32 NA white, bl~ red 33 none mascu~ Naboo Droid <chr~ <chr [0~ <chr [0]>
# 4 Darth Vader 202 136 none white yellow 41.9 male mascu~ Tatooine Human <chr~ <chr [0~ <chr [1]>
# 5 Leia Organa 150 49 brown light brown 19 fema~ femin~ Alderaan Human <chr~ <chr [1~ <chr [0]>
# 6 Owen Lars 178 120 brown, gr~ light blue 52 male mascu~ Tatooine Human <chr~ <chr [0~ <chr [0]>
# 7 Beru Whitesun lars 165 75 brown light blue 47 fema~ femin~ Tatooine Human <chr~ <chr [0~ <chr [0]>
# 8 R5-D4 97 32 NA white, red red NA none mascu~ Tatooine Droid <chr~ <chr [0~ <chr [0]>
# 9 Biggs Darklighter 183 84 black light brown 24 male mascu~ Tatooine Human <chr~ <chr [0~ <chr [1]>
# 10 Anakin Skywalker 188 84 blond fair blue 41.9 male mascu~ Tatooine Human <chr~ <chr [2~ <chr [3]>
# # ... with 29 more rows
table(out$homeworld)
# Alderaan Aleen Minor Corellia Coruscant Kamino Kashyyyk Mirial Naboo Ryloth Tatooine
# 3 1 2 3 3 2 2 11 2 10
Base R
subset(starwars,
homeworld %in% head(names(sort(table(homeworld), decreasing=TRUE)), 10))
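The row-based behaviour also accounts for the height example in the question. As far as I can tell, slice_max() with the default with_ties = TRUE behaves like keeping every row whose min_rank() on the descending values is at most n. The sketch below uses invented heights (the tibble heights is made up for illustration) with one tie, mirroring the two characters at 206.

```r
library(dplyr)

# Invented data: two rows share the third-largest height
heights <- tibble(
  name   = c("a", "b", "c", "d", "e"),
  height = c(264, 234, 206, 206, 198)
)

# min_rank(desc(height)) gives 1, 2, 3, 3, 5: there is no rank 4
top3 <- heights %>% slice_max(height, n = 3)  # 4 rows: the tie at rank 3 is kept
top4 <- heights %>% slice_max(height, n = 4)  # still 4 rows: the next rank is 5

# The same selection written as an explicit rank filter
by_rank <- heights %>%
  filter(min_rank(desc(height)) <= 4) %>%
  arrange(desc(height))
```

In the question, the two characters at 206 both have rank 8 and the next character has rank 10, so n = 8 and n = 9 select the same 9 rows.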
Reply from a uj5u.com user:
Another possible approach:
library(tidyverse)
sw <- starwars %>%
select(name, sex, homeworld, species) %>%
filter(!is.na(homeworld)) %>%
add_count(homeworld)
counts <- unique(sw$n) %>% sort(decreasing = TRUE)
sw %>%
filter(n %in% counts[1:2])
#> # A tibble: 21 × 5
#> name sex homeworld species n
#> <chr> <chr> <chr> <chr> <int>
#> 1 Luke Skywalker male Tatooine Human 10
#> 2 C-3PO none Tatooine Droid 10
#> 3 R2-D2 none Naboo Droid 11
#> 4 Darth Vader male Tatooine Human 10
#> 5 Owen Lars male Tatooine Human 10
#> 6 Beru Whitesun lars female Tatooine Human 10
#> 7 R5-D4 none Tatooine Droid 10
#> 8 Biggs Darklighter male Tatooine Human 10
#> 9 Anakin Skywalker male Tatooine Human 10
#> 10 Palpatine male Naboo Human 11
#> # … with 11 more rows
Created on 2022-01-23 by the reprex package (v2.0.1)
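One caveat with filtering on the count values, sketched on invented data (the tibble toy and the names below are made up for illustration): if several values share one of the top two counts, all of them are kept. That may or may not be what is wanted; selecting by value name instead keeps exactly two.

```r
library(dplyr)

# Invented data: "A" and "B" both appear 3 times, "C" twice, "D" once
toy <- tibble(world = c("A", "A", "A", "B", "B", "B", "C", "C", "D"))
toy_counted <- toy %>% add_count(world)

# Filtering on the two largest count values (3 and 2) keeps A, B and C
counts <- sort(unique(toy_counted$n), decreasing = TRUE)
by_count_value <- toy_counted %>% filter(n %in% counts[1:2])

# Filtering on the names of the two most frequent worlds keeps only A and B
top_two <- toy %>%
  count(world, sort = TRUE) %>%
  slice_head(n = 2) %>%
  pull(world)
by_world <- toy_counted %>% filter(world %in% top_two)

nrow(by_count_value)  # 8
nrow(by_world)        # 6
```

With the starwars data and missing homeworlds removed, no other homeworld shares a count of 11 or 10, so both approaches give the same 21 rows there.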
