Using ggplot2, I want to plot two vectors (vec1_num, vec2_num) in two dimensions and colour the points by a grouping variable (vec3_char). Some of the data points overlap.
library(ggplot2)
vec1_num = c(1,2,3,4,1,3,4,5,5,5)
vec2_num = c(1,2,3,4,1,3,4,5,5,5)
vec3_char = c("A", "B", "C", "A", "B", "C", "C", "A", "B", "C")
# plot 1
ggplot(data = NULL) +
  geom_point(aes(x=vec1_num, y=vec2_num, colour=vec3_char), alpha=0.4, size=4) +
  scale_colour_manual(values=c("A"="darkblue", "B"="darkred", "C"="orange")) +
  theme(panel.grid = element_blank())
I know I can make the overlap less of a problem by lowering alpha or by adding a little noise with geom_jitter, like this:
# plot 2
ggplot(data = NULL) +
  geom_jitter(aes(x=vec1_num, y=vec2_num, colour=vec3_char), alpha=0.4, size=4, width = 0.1) +
  scale_colour_manual(values=c("A"="darkblue", "B"="darkred", "C"="orange")) +
  theme(panel.grid = element_blank())
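However, is it possible to use plot 1 but give overlapping points a different colour? For example: "A" = "darkblue", "AB" = "black", "ABC" = "grey", "B" = "darkred", "BC" = "pink", "C" = "orange". Could I also add a small Venn diagram as a legend to visualize the colour choice for overlapping points?
Thanks!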
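Answer from a uj5u.com user:
The way I would do this is to convert the letters to numbers, add them up, and convert the sums back to letters.
Note one complication: the letters have to be A, B, D, H, ... (positions 1, 2, 4, 8, ... in the alphabet) so that each sum can be produced by only one combination of letters. There may be a way to start from A, B, C and encode them as unique values instead.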
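On that last point, one possible way to start from A, B, C is to give each group its own power of two, so that every sum decodes to exactly one combination. This is a minimal base-R sketch and not part of the original answer; the names groups, weights, encode and decode are made up for illustration.

```r
# Data from the question
vec1_num  <- c(1,2,3,4,1,3,4,5,5,5)
vec2_num  <- c(1,2,3,4,1,3,4,5,5,5)
vec3_char <- c("A", "B", "C", "A", "B", "C", "C", "A", "B", "C")

# Hypothetical helpers: one power of two per group (A = 1, B = 2, C = 4)
groups  <- c("A", "B", "C")
weights <- setNames(2^(seq_along(groups) - 1), groups)

# Sum the weights of the distinct groups found at one position
encode <- function(chars) sum(weights[unique(chars)])

# Recover the group letters from a sum by testing each binary digit
decode <- function(code) paste(groups[(code %/% weights) %% 2 == 1], collapse = "")

# One code per x-y position, then back to a label such as "AB" or "ABC"
codes  <- tapply(vec3_char, paste(vec1_num, vec2_num), encode)
combos <- vapply(codes, decode, character(1))
print(combos)
```

Because every group occupies its own binary digit, no case_when lookup table is needed, and adding a fourth group only means extending groups.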
library(tidyverse)
vec1_num = c(1,2,3,4,1,3,4,5,5,5)
vec2_num = c(1,2,3,4,1,3,4,5,5,5)
vec3_char = c("A", "B", "D", "A", "B", "D", "D", "A", "B", "D")
# Collapse runs of consecutive repeated characters in one string, e.g. "44" -> "4"
removeDup <- function(str) paste(rle(strsplit(str, "")[[1]])$values, collapse="")
# match() returns each letter's position in LETTERS: A = 1, B = 2, D = 4
data <- data.frame(x = vec1_num, y = vec2_num, col = match(vec3_char, LETTERS))
data <- data %>%
  group_by(x, y) %>%  # one group per plotted position (the original grouped by x only, which works here because x equals y)
  mutate(colour = glue::glue_collapse(col, sep = "")) %>%  # e.g. "124" where A, B and D overlap
  select(-col) %>%
  distinct(x, y, .keep_all = TRUE) %>%  # keep one row per position
  mutate(colour = removeDup(colour)) %>%
  # add up the digits of the collapsed string
  mutate(colour = sapply(str_extract_all(colour, '\\d'), function(x) sum(as.integer(x)))) %>%
  # translate each sum back to its combination of letters
  mutate(colour = case_when(
    colour == 1 ~ "A",
    colour == 2 ~ "B",
    colour == 3 ~ "AB",
    colour == 4 ~ "D",
    colour == 5 ~ "AD",
    colour == 6 ~ "BD",
    colour == 7 ~ "ABD"
  ))
# plot 1
ggplot(data) +
  geom_point(aes(x=x, y=y, colour = as_factor(colour)), alpha=0.4, size=4) +
  geom_text(aes(x = x, y = y, label = colour), vjust = 2) +
  scale_colour_manual(values=c("A"="darkblue", "B"="darkred", "AB"="orange", "D" = "green", "AD" = "black", "BD" = "orange", "ABD" = "purple"), name = "Colour") +
  theme(panel.grid = element_blank())

Answer from a uj5u.com user:
I would first create a data frame. Then, for each x-y combination (list(df$vec1_num, df$vec2_num)), I would extract which characters are present (...unique(xy_i$vec3_char)...). Like this:
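# Uses vec1_num, vec2_num and vec3_char as defined in the question (groups A, B, C)
df <- data.frame(vec1_num, vec2_num, vec3_char)
# by() splits df into one piece per x-y position; the pieces are then stacked again with rbind
df_new <- do.call("rbind.data.frame", by(df, list(df$vec1_num, df$vec2_num), function(xy_i){
  # sorted, de-duplicated group letters present at this position, e.g. "AB"
  chars_i <- paste0(sort(unique(xy_i$vec3_char)), collapse = "")
  xy_i$chars_comb <- factor(chars_i, levels = c("A", "AB", "AC", "ABC", "B", "BC", "C"))
  xy_i
}))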
If you now draw the plot, it shows which characters overlap at which point.
ggplot(data = df_new) +
  geom_point(aes(x=vec1_num, y=vec2_num, colour=chars_comb), alpha=0.4, size=4) +
  scale_colour_manual(values=c("AB" = "black", "ABC" = "grey", "B" = "darkred", "C"="orange", "AC"= "red")) +
  theme(panel.grid = element_blank())
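For comparison, the same per-position label can probably be built in a single step with base R's ave(), which applies a function within each x-y group and writes the result back to every row of that group. This is only a sketch, assuming the A/B/C vectors from the question; it reuses the column name chars_comb from the code above and leaves the result as a plain character column rather than a factor.

```r
# Data from the question
vec1_num  <- c(1,2,3,4,1,3,4,5,5,5)
vec2_num  <- c(1,2,3,4,1,3,4,5,5,5)
vec3_char <- c("A", "B", "C", "A", "B", "C", "C", "A", "B", "C")
df <- data.frame(vec1_num, vec2_num, vec3_char)

# For each x-y position, paste the sorted distinct group letters together
df$chars_comb <- ave(
  df$vec3_char, df$vec1_num, df$vec2_num,
  FUN = function(ch) paste(sort(unique(ch)), collapse = "")
)
print(df)
```

Unlike the by() version, the rows stay in their original order, and the column can be passed to colour= in the same ggplot call.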
