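I have a dataset like the one below. I want to identify all observations that have more than one value across the color columns and replace those values with "multicolor".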
ID color1 color2
23 red NA
44 blue purple
51 yellow NA
59 green orange
Like this:
ID color
23 red
44 multicolor
51 yellow
59 multicolor
Any ideas would be much appreciated, thanks!
Reply from a uj5u.com user:
Here is one way to do this in the tidyverse.
library(dplyr)
library(tidyr)
df %>%
  pivot_longer(cols = starts_with("color"), values_to = "color", values_drop_na = TRUE) %>%
  group_by(ID) %>%
  summarize(n = n(),
            color = toString(color), .groups = "drop") %>%
  mutate(color = if_else(n > 1, "multicolor", color)) %>%
  select(-n)
# # A tibble: 4 x 2
# ID color
# <int> <chr>
# 1 23 red
# 2 44 multicolor
# 3 51 yellow
# 4 59 multicolor
I did it this way on purpose. Note that if you stop after the summarize() line, you get the actual colors.
# # A tibble: 4 x 3
# ID n color
# <int> <int> <chr>
# 1 23 1 red
# 2 44 2 blue, purple
# 3 51 1 yellow
# 4 59 2 green, orange
This scales if you have many color columns rather than just two. From here, there are plenty of ways to adapt it.
Data
df <- read.table(textConnection("ID color1 color2
23 red NA
44 blue purple
51 yellow NA
59 green orange"), header = TRUE)
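To illustrate the point about scaling, here is a small sketch that runs the same pipeline on a data frame with a third color column. The data frame df3 and its color3 column are invented for this illustration and are not part of the original question.

```r
library(dplyr)
library(tidyr)

# Hypothetical data with three color columns (not from the question)
df3 <- data.frame(
  ID     = c(23, 44, 51, 59),
  color1 = c("red", "blue", NA, "green"),
  color2 = c(NA, "purple", "yellow", "orange"),
  color3 = c(NA, NA, NA, "pink")
)

# Same steps as above: reshape to long form, drop NAs, count colors per ID
result <- df3 %>%
  pivot_longer(cols = starts_with("color"), values_to = "color",
               values_drop_na = TRUE) %>%
  group_by(ID) %>%
  summarize(n = n(), color = toString(color), .groups = "drop") %>%
  mutate(color = if_else(n > 1, "multicolor", color)) %>%
  select(-n)

print(result)
```

One thing to watch for: because values_drop_na = TRUE removes every NA row, an ID whose color columns are all NA would disappear from the result entirely.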
Reply from a uj5u.com user:
You can do it like this, assuming data is your dataset.
library(dplyr)
data <- data.frame(ID = c(23, 44, 51, 59),
color1 = c("red", "blue", "yellow", "green"),
color2 = c(NA, "purple", NA, "orange"))
data %>%
mutate(color = ifelse(is.na(color2), color1, "multicolor")) %>%
select(ID, color)
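The ifelse() call above relies on there being exactly two color columns, with color1 always filled in. If that does not hold, one possible generalization is to count the non-missing colors per row. This is a base R sketch rather than the answerer's code; wide, color_cols, n_colors and first_color are names chosen here for illustration.

```r
# Hypothetical data: three color columns, and color1 is missing for ID 51
wide <- data.frame(
  ID     = c(23, 44, 51, 59),
  color1 = c("red", "blue", NA, "green"),
  color2 = c(NA, "purple", "yellow", "orange"),
  color3 = c(NA, NA, NA, "pink"),
  stringsAsFactors = FALSE
)

# Every column whose name starts with "color"
color_cols <- grep("^color", names(wide), value = TRUE)

# Number of non-missing colors in each row
n_colors <- unname(rowSums(!is.na(wide[color_cols])))

# First non-missing color in each row (NA if the row has none)
first_color <- unname(apply(wide[color_cols], 1, function(x) x[!is.na(x)][1]))

result <- data.frame(
  ID    = wide$ID,
  color = ifelse(n_colors > 1, "multicolor", first_color),
  stringsAsFactors = FALSE
)
print(result)
```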
Reply from a uj5u.com user:
Here is a seemingly simple solution:
library(dplyr)
library(stringr)
data %>%
  mutate(
    # step 1 - paste `color1` and `color2` together and remove " NA":
    color = gsub("\\sNA", "", paste(color1, color2)),
    # step 2 - count the number of white space characters:
    color = str_count(color, " "),
    # step 3 - label `color` as "multicolor" where `color` != 0:
    color = ifelse(color == 0, color1, "multicolor")) %>%
  # remove the obsolete color columns:
  select(-matches("\\d$"))
ID color
1 23 red
2 44 multicolor
3 51 yellow
4 59 multicolor
Data:
data <- data.frame(ID = c(23, 44, 51, 59),
color1 = c("red", "blue", "yellow", "green"),
color2 = c(NA, "purple", NA, "orange"))
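To make the three steps easier to follow, the sketch below prints the intermediate values on plain vectors. It uses base R throughout, with lengths(regmatches(...)) standing in for stringr's str_count(), and it adds a made-up "light blue" value to show a limitation of counting spaces: a single color name that itself contains a space is counted as two colors.

```r
# Illustrative vectors; "light blue" is a made-up value, not from the question
color1 <- c("red", "blue", "light blue")
color2 <- c(NA, "purple", NA)

# Step 1: paste() turns NA into the text "NA", which gsub() then strips
pasted  <- paste(color1, color2)       # "red NA" "blue purple" "light blue NA"
cleaned <- gsub("\\sNA", "", pasted)   # "red" "blue purple" "light blue"

# Step 2: count the spaces in each string (base R stand-in for str_count)
n_spaces <- lengths(regmatches(cleaned, gregexpr(" ", cleaned)))   # 0 1 1

# Step 3: zero spaces means a single color
label <- ifelse(n_spaces == 0, color1, "multicolor")
print(label)   # "red" "multicolor" "multicolor"
```

The third element is labelled "multicolor" even though it holds only one color, so this approach is safest when color names never contain spaces.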
Reply from a uj5u.com user:
A base R approach
# get colors from columns named color*
colo <- paste(names(table(unlist(df1[,grep("color",colnames(df1))]))), collapse="|")
colo
[1] "blue|green|orange|purple|red|yellow"
# match the colors and do the conversion
data.frame(
  ID = df1$ID,
  color = apply(df1, 1, function(x) {
    y <- x[grep(colo, x)]
    if (length(y) > 1) y <- "multicolor"
    y
  })
)
ID color
1 23 red
2 44 multicolor
3 51 yellow
4 59 multicolor
Data
df1 <- structure(list(ID = c(23L, 44L, 51L, 59L), color1 = c("red",
"blue", "yellow", "green"), color2 = c(NA, "purple", NA, "orange"
)), class = "data.frame", row.names = c(NA, -4L))
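As a rough illustration of what happens inside apply(), each row reaches the function as a named character vector, and grep() keeps only the elements that match one of the color names. The vectors row_one and row_two and the helper pick_colors below are hypothetical stand-ins written for this sketch.

```r
# Pattern of known color names, as built by the paste() call above
colo <- "blue|green|orange|purple|red|yellow"

# Rows as apply() would pass them: everything coerced to character
row_one <- c(ID = "23", color1 = "red", color2 = NA)
row_two <- c(ID = "44", color1 = "blue", color2 = "purple")

# Keep the elements that match a color name; NA never matches
pick_colors <- function(x) x[grep(colo, x)]

print(pick_colors(row_one))   # one element, so the color is kept as-is
print(pick_colors(row_two))   # two elements, so the row becomes "multicolor"
```

Because grep() matches substrings, a value that merely contains a color name (for example "reddish") would also be picked up, so the pattern may need anchoring for messier data.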
Tags: r
