我需要生成一個表格,計算每個站點的新因子水平。我的代碼是這樣的
# Data creation
f = c("red", "green", "blue", "orange", "yellow")
f = factor(f)
d = data.frame(
site = 1:10,
color1= c(
"red", "red", "green", "green", "green",
"blue","green", "blue", "orange", "yellow"
),
color2= c(
"green", "green", "green", "blue","green",
"blue", "orange", "yellow","red", "red"
)
)
d$color1 = factor( d$color1 , levels = levels(f) )
d$color2 = factor( d$color2 , levels = levels(f) )
d
它向我展示了這張表

我需要計算每個新站點中有多少新顏色。只算第一次出現,不重復。結果是這樣的一張桌子。

計算每個站點的不重復顏色在此圖中。

有沒有dplyr方法來找到這個輸出?
uj5u.com熱心網友回復:
你可以做:
library(tidyverse)
d %>%
pivot_longer(cols = -site) %>%
mutate(newColors = duplicated(value)) %>%
group_by(site) %>%
mutate(newColors = sum(!newColors)) %>%
ungroup() %>%
pivot_wider()
這使:
# A tibble: 10 x 4
site newColors color1 color2
<int> <int> <fct> <fct>
1 1 2 red green
2 2 0 red green
3 3 0 green green
4 4 1 green blue
5 5 0 green green
6 6 0 blue blue
7 7 1 green orange
8 8 1 blue yellow
9 9 0 orange red
10 10 0 yellow red
請注意,這與第 9 行不同,其中您有1,但兩種顏色(橙色和紅色)都已出現在前幾行中。
轉載請註明出處,本文鏈接:https://www.uj5u.com/yidong/343022.html
上一篇:R資料框中的重復字串
