I have data that looks like this:
df <- structure(list(Mydf = c("TNFM00000001497", "TNFM00000001617",
"TNFM00000001617", "TNFM00000001617", "TNFM00000001617", "TNFM00000001626",
"TNFM00000001626", "TNFM00000001626", "TNFM00000001626", "TNFM00000001629",
"TNFM00000001629", "TNFM00000001630", "TNFM00000001630", "TNFM00000001630"
)), class = "data.frame", row.names = c(NA, -14L))
I want to count how many times each string is repeated and get output like this example:
String Number of repeat
TNFM00000001497 1
TNFM00000001617 4
TNFM00000001626 4
TNFM00000001629 2
TNFM00000001630 3
A uj5u.com user replied:
Try table:
> table(df)
df
TNFM00000001497 TNFM00000001617 TNFM00000001626 TNFM00000001629 TNFM00000001630
1 4 4 2 3
Or convert the result to a data frame:
> as.data.frame(table(df))
df Freq
1 TNFM00000001497 1
2 TNFM00000001617 4
3 TNFM00000001626 4
4 TNFM00000001629 2
5 TNFM00000001630 3
Or use stack:
> stack(table(df))
values ind
1 1 TNFM00000001497
2 4 TNFM00000001617
3 4 TNFM00000001626
4 2 TNFM00000001629
5 3 TNFM00000001630
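If the column headers from the question matter, the table result can be renamed after conversion. The following is a minimal sketch, assuming the same data as in the question; the names `counts`, `String`, and `Number_of_repeat` are illustrative choices, not part of the original answer.

```r
# Rebuild the example data from the question
df <- data.frame(Mydf = c(
  "TNFM00000001497",
  rep("TNFM00000001617", 4),
  rep("TNFM00000001626", 4),
  rep("TNFM00000001629", 2),
  rep("TNFM00000001630", 3)
))

# Tabulate the column itself, then convert the table to a data frame;
# stringsAsFactors = FALSE keeps the strings as character values
counts <- as.data.frame(table(df$Mydf), stringsAsFactors = FALSE)

# Replace the default column names (Var1, Freq) with the requested headers
names(counts) <- c("String", "Number_of_repeat")
counts
```

Tabulating `df$Mydf` rather than the whole data frame makes the default column names predictable (`Var1` and `Freq`), so the renaming step does not depend on how `table` labels a data frame argument.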
A uj5u.com user replied:
Another approach is aggregate:
aggregate(df$Mydf, list(freq = df$Mydf), length)
# freq x
# 1 TNFM00000001497 1
# 2 TNFM00000001617 4
# 3 TNFM00000001626 4
# 4 TNFM00000001629 2
# 5 TNFM00000001630 3
tapply also works here:
tapply(df$Mydf, df$Mydf, length)
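The tapply call returns a named one-dimensional array rather than a data frame. As a rough sketch, assuming the same data as in the question, the result could be reshaped into two columns like this; `res`, `String`, and `n` are hypothetical names chosen for illustration.

```r
# Rebuild the example data from the question
df <- data.frame(Mydf = c(
  "TNFM00000001497",
  rep("TNFM00000001617", 4),
  rep("TNFM00000001626", 4),
  rep("TNFM00000001629", 2),
  rep("TNFM00000001630", 3)
))

# Count the elements in each group; the names of the result are the strings
res <- tapply(df$Mydf, df$Mydf, length)

# Move the names into a column and drop the array attributes from the counts
out <- data.frame(String = names(res), n = as.vector(res))
out
```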
A uj5u.com user replied:
Use group_by and count:
library(dplyr)
df %>%
  group_by(String = Mydf) %>%
  count()
# A tibble: 5 x 2
# Groups: String [5]
String n
<chr> <int>
1 TNFM00000001497 1
2 TNFM00000001617 4
3 TNFM00000001626 4
4 TNFM00000001629 2
5 TNFM00000001630 3
A uj5u.com user replied:
Another possible solution:
library(tidyverse)
df %>% add_count(Mydf) %>% distinct
#> Mydf n
#> 1 TNFM00000001497 1
#> 2 TNFM00000001617 4
#> 3 TNFM00000001626 4
#> 4 TNFM00000001629 2
#> 5 TNFM00000001630 3
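The dplyr answers above can probably be shortened further, since `count` groups and tallies in a single call. This is a sketch assuming dplyr is installed and the same data as in the question; the output column names `String` and `Number_of_repeat` are illustrative choices.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(Mydf = c(
  "TNFM00000001497",
  rep("TNFM00000001617", 4),
  rep("TNFM00000001626", 4),
  rep("TNFM00000001629", 2),
  rep("TNFM00000001630", 3)
))

# count() groups by the given column, tallies the rows, and ungroups;
# `name` sets the name of the count column
result <- count(df, String = Mydf, name = "Number_of_repeat")
result
```

Unlike the group_by version, the result here is not grouped, which can avoid surprises in later steps of a pipeline.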
