I am trying to count the number of "A" values within each id group.
df <- data.frame(id = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6), value = c(NA, NA, "A", "A", NA, NA, "A", "A", "B", "A", NA, NA, "B", "A", "B", "A", NA, NA, "B", NA, NA, NA, NA))
Expected output:
id value number_A
1 NA 2
1 NA 2
1 A 2
1 A 2
2 NA 3
2 NA 3
2 A 3
2 A 3
2 B 3
2 A 3
3 NA 2
3 NA 2
3 B 2
3 A 2
3 B 2
3 A 2
4 NA 0
4 NA 0
4 B 0
5 NA 0
5 NA 0
6 NA 0
6 NA 0
I tried the following code:
library(dplyr)
df1 <- df %>% group_by(id) %>%
mutate(count = row_number(value=="A"))
Answer from a uj5u.com user:
You can use
library(dplyr)
df %>%
group_by(id) %>%
mutate(number_A = sum(value == "A", na.rm = TRUE)) %>%
ungroup()
This returns
# A tibble: 23 x 3
id value number_A
<dbl> <chr> <int>
1 1 NA 2
2 1 NA 2
3 1 A 2
4 1 A 2
5 2 NA 3
6 2 NA 3
7 2 A 3
8 2 A 3
9 2 B 3
10 2 A 3
# ... with 13 more rows
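The step doing the work in the pipeline above is sum(value == "A", na.rm = TRUE). A minimal base R sketch of why that expression counts the "A" values; the vector v is made up for illustration and is not taken from the question:

```r
# Hypothetical values for a single id group (illustration only)
v <- c(NA, NA, "A", "A", "B", "A")

# Comparing with "A" gives TRUE, FALSE, or NA (NA wherever v is NA)
is_a <- v == "A"

# Without na.rm the NAs propagate, so the sum itself is NA
sum_with_na <- sum(is_a)

# With na.rm = TRUE the NAs are dropped and each TRUE counts as 1
n_a <- sum(is_a, na.rm = TRUE)
print(n_a)
```

Inside mutate() after group_by(id), the same expression is evaluated once per group and the single result is recycled to every row of that group.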
Answer from a uj5u.com user:
A base R solution using the aggregate function.
df<- data.frame( id= c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3,3 ,3,3,4,4,4, 5,5,6,6), value= c(NA, NA,"A", "A", NA,NA,"A","A","B","A",NA,NA,"B","A","B","A", NA, NA,"B",NA, NA, NA,NA))
# Calculate the number of A for each group id:
countA = aggregate(value ~ id, data=df, FUN=function(x){sum(x=="A", na.rm=TRUE)}, na.action=na.pass)
countA
# id value
# 1 1 2
# 2 2 3
# 3 3 2
# 4 4 0
# 5 5 0
# 6 6 0
# Rename the value column in countA to "countA" and merge with df
names(countA)[2] = "countA"
merge(df, countA, by="id")
# id value countA
#1 1 <NA> 2
#2 1 <NA> 2
#3 1 A 2
#4 1 A 2
#5 2 <NA> 3
#6 2 <NA> 3
# ...
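Explanation:
aggregate computes a summary function for the groups defined by the by variable. Alternatively, the grouping can be given as a formula, as is done here.
The function(x){sum(x=="A", na.rm=TRUE)} simply counts the "A" values; na.rm=TRUE drops the NAs, which would otherwise propagate into the result.
Finally, by default the formula method of aggregate drops rows containing NAs, which leaves some groups unrepresented. Passing na.action=na.pass suppresses that behavior.
After that, we just rename the value column of the summary and merge the two data.frames by the id column.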
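To make the effect of the NA handling concrete, here is a small sketch comparing the default behavior of the formula method with na.action = na.pass. It reuses the data from the question; the helper name count_a and the variables dropped and kept are introduced only for this illustration:

```r
# Same data as in the question
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6),
  value = c(NA, NA, "A", "A", NA, NA, "A", "A", "B", "A", NA, NA, "B", "A",
            "B", "A", NA, NA, "B", NA, NA, NA, NA)
)

# Hypothetical helper: count the "A" values, ignoring NAs
count_a <- function(x) sum(x == "A", na.rm = TRUE)

# Default na.action (na.omit): rows with NA are removed before grouping,
# so ids whose values are all NA do not appear in the result
dropped <- aggregate(value ~ id, data = df, FUN = count_a)

# na.action = na.pass keeps those rows, so every id gets a count
kept <- aggregate(value ~ id, data = df, FUN = count_a, na.action = na.pass)

print(dropped$id)
print(kept$id)
```

With the default, ids 5 and 6 are missing from the summary, and merging that summary back with merge(df, ..., by="id") would then drop their rows from the final result as well.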
Answer from a uj5u.com user:
Another solution:
df %>%
add_count(id, wt = value=="A", name = "number_A")
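For readers who want a one-step result like add_count without loading dplyr, a possible base R counterpart is ave(). This is a different technique from the answers above, offered only as a sketch under the assumption that the column should be attached directly to df:

```r
# Same data as in the question
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6),
  value = c(NA, NA, "A", "A", NA, NA, "A", "A", "B", "A", NA, NA, "B", "A",
            "B", "A", NA, NA, "B", NA, NA, NA, NA)
)

# ave() applies FUN within each id group and returns one value per row,
# so no separate merge step is needed
df$number_A <- ave(df$value == "A", df$id,
                   FUN = function(x) sum(x, na.rm = TRUE))

print(df)
```

Unlike merge, ave() keeps the original row order of df.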
