I have a data frame that looks like this.
--- ------ ------ ------ ------
| | Name | col1 | col2 | col3 |
--- ------ ------ ------ ------
| 1 | A | 10 | 0 | 0 |
| 2 | B | 5 | 20 | 5 |
| 3 | C | 15 | 15 | 20 |
| 4 | D | 20 | 5 | 15 |
| 5 | F | 0 | 10 | 15 |
--- ------ ------ ------ ------
For each column, I want the Name of the row that holds the maximum value. The expected output should look like this:
--- ------ ------
| | col | MAX |
--- ------ ------
| 1 | col1 | D |
| 2 | col2 | B |
| 3 | col3 | C |
--- ------ ------
How can I code this?
Reply from a uj5u.com user:
data.table
library(data.table)
setDT(df)
df2 = melt(df, id.vars="Name", variable.name="col")
df2 = df2[, .SD[which.max(value)], by = col][, c("col", "Name")]
names(df2)[2] = "MAX"
Output:
df2
col MAX
1: col1 D
2: col2 B
3: col3 C
dplyr
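library(dplyr)
library(tidyr)  # gather() comes from tidyr, not dplyr
df2 = df %>%
  gather(key="col", value="Value", 2:4) %>%
  group_by(col) %>%        # take the maximum within each column, not overall
  top_n(1, Value) %>%
  ungroup() %>%
  rename_at(1, ~"MAX") %>% # the first column (Name) becomes MAX
  select(c("col", "MAX")) %>%
  as.data.frame()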
Output:
df2
col MAX
1 col1 D
2 col2 B
3 col3 C
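The same reshape-then-pick-the-maximum idea can also be sketched with the newer tidyr and dplyr verbs. This is only an illustration, assuming tidyr 1.0 or later (for pivot_longer) and dplyr 1.0 or later (for slice_max); the data frame df is rebuilt here from the table in the question so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Data rebuilt from the question's table (assumed to match the asker's df)
df <- data.frame(
  Name = c("A", "B", "C", "D", "F"),
  col1 = c(10L, 5L, 15L, 20L, 0L),
  col2 = c(0L, 20L, 15L, 5L, 10L),
  col3 = c(0L, 5L, 20L, 15L, 15L)
)

df2 <- df %>%
  # long format: one row per (Name, col) pair
  pivot_longer(-Name, names_to = "col", values_to = "Value") %>%
  group_by(col) %>%
  # keep the single largest Value per column; with_ties = FALSE keeps only one row on ties
  slice_max(Value, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(col, MAX = Name)

df2
```

Grouping by col before taking the maximum is what makes the result per-column; without it, only the rows holding the overall largest value would be kept. Note that the result here is a tibble rather than a plain data frame.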
Base R
This could probably be made simpler or prettier...
df2 = reshape(df, direction="long", varying=2:4, v.names="value")
df2 = df2[order(-df2$value), ]
df2 = df2[!duplicated(df2$time), c("time", "Name")]
names(df2) = c("col", "MAX")
df2$col = paste0("col", df2$col)
rownames(df2) = NULL
Output:
df2
col MAX
1 col1 D
2 col2 B
3 col3 C
Reply from a uj5u.com user:
In base R, we can do
stack(sapply(df1[-1], \(x) df1$Name[which.max(x)]))[2:1]
ind values
1 col1 D
2 col2 B
3 col3 C
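To see what the one-liner is doing, it can be unpacked into separate steps. The following is a sketch rather than the answerer's own code: the names idx, winners and res are made up for illustration, and the data frame is rebuilt from the question's table.

```r
# Data rebuilt from the question's table
df1 <- data.frame(
  Name = c("A", "B", "C", "D", "F"),
  col1 = c(10L, 5L, 15L, 20L, 0L),
  col2 = c(0L, 20L, 15L, 5L, 10L),
  col3 = c(0L, 5L, 20L, 15L, 15L)
)

# Step 1: for each value column, find the row position of its maximum.
# which.max() returns only the first position when there are ties.
idx <- sapply(df1[-1], which.max)

# Step 2: look up the Name at each of those positions
winners <- df1$Name[idx]

# Step 3: assemble the result with the column names asked for in the question
res <- data.frame(col = names(idx), MAX = winners)
res
```

Building the data frame directly also gives the col and MAX headers from the question, whereas stack() labels its columns ind and values.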
Data
df1 <- structure(list(Name = c("A", "B", "C", "D", "F"), col1 = c(10L,
5L, 15L, 20L, 0L), col2 = c(0L, 20L, 15L, 5L, 10L), col3 = c(0L,
5L, 20L, 15L, 15L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
