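I use ggplot's geom_vline together with a custom function to draw certain values on top of a histogram.
The example function below, for instance, returns a vector of three values (the mean, and the values x standard deviations below and above the mean). I can then pass these values to geom_vline(xintercept = ...) and see them in my plot.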
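#example function
library(tidyverse)  # provides tibble(), pivot_longer(), pull(), %>% and ggplot2

sds_around_the_mean <- function(x, multiplier = 1) {
  mean <- mean(x, na.rm = TRUE)
  sd <- sd(x, na.rm = TRUE)
  # low / mean / high, returned as a plain numeric vector
  tibble(low = mean - multiplier * sd,
         mean = mean,
         high = mean + multiplier * sd) %>%
    pivot_longer(cols = everything()) %>%
    pull(value)
}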
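As a quick check of what the function returns, here is a minimal base-R sketch that computes the same three values without the tidyverse helpers. The name sds_around_the_mean_base and the sample vector x are made up for illustration and are not part of the original code.

```r
# Hypothetical base-R equivalent of sds_around_the_mean(), for illustration only
sds_around_the_mean_base <- function(x, multiplier = 1) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  # unnamed numeric vector in the order: low, mean, high
  c(m - multiplier * s, m, m + multiplier * s)
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up sample values
res <- sds_around_the_mean_base(x, multiplier = 2)
res
```

The middle element is the mean, and the outer two sit multiplier * sd(x) below and above it, which is exactly what geom_vline(xintercept = ...) needs.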
Reproducible data:
#data
set.seed(123)
normal <- tibble(data = rnorm(1000, mean = 100, sd = 5))
outliers <- tibble(data = runif(5, min = 150, max = 200))
df <- bind_rows(lst(normal, outliers), .id = "type")
df %>%
  ggplot(aes(x = data)) +
  geom_histogram(bins = 100) +
  geom_vline(xintercept = sds_around_the_mean(df$data, multiplier = 3),
             linetype = "dashed", color = "red") +
  geom_vline(xintercept = sds_around_the_mean(df$data, multiplier = 2),
             linetype = "dashed")
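The problem, as you can see, is that I have to specify df$data in several places. This becomes more error-prone whenever I change the original df that is piped into ggplot, for example by filtering out outliers before plotting. I would then have to apply the same change again in multiple places.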
E.g.
df %>% filter(type == "normal")
#also requires
df$data
#to be changed to
df$data[df$type == "normal"]
#in geom_vline to obtain the correct input values for the xintercept.
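So, instead, how can I replace the df$data argument with the corresponding column of whatever was piped into ggplot() in the first place? Something similar to the "." operator, I assume. I also tried stat_summary with geom = "vline" to achieve this, but it did not give the desired result.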
Answer:
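You can wrap the ggplot part in curly braces and refer to the incoming dataset with the "." symbol, both in the ggplot call and when computing sds_around_the_mean. The braces are needed because %>% binds more tightly than +; without them only the ggplot() call would be the right-hand side of the pipe, and "." would not be defined inside geom_vline(). With the braces, the whole plot follows whatever data is piped in.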
df %>%
  {ggplot(data = ., aes(x = data)) +
    geom_histogram(bins = 100) +
    geom_vline(xintercept = sds_around_the_mean(.$data, multiplier = 3),
               linetype = "dashed", color = "red") +
    geom_vline(xintercept = sds_around_the_mean(.$data, multiplier = 2),
               linetype = "dashed")}
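To illustrate why this helps, the sketch below puts the filter step from the question in front of the braces. Because both geom_vline() calls read from ".", the filter only has to be written once. It assumes the dplyr, tidyr, tibble and ggplot2 packages are installed, and it repeats the question's data and function (with the local variables renamed to m and s) so that it runs on its own. The object name p is only for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)

# Same helper as in the question, repeated so the example is self-contained
sds_around_the_mean <- function(x, multiplier = 1) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  tibble(low = m - multiplier * s, mean = m, high = m + multiplier * s) %>%
    pivot_longer(cols = everything()) %>%
    pull(value)
}

set.seed(123)
normal <- tibble(data = rnorm(1000, mean = 100, sd = 5))
outliers <- tibble(data = runif(5, min = 150, max = 200))
df <- bind_rows(lst(normal, outliers), .id = "type")

p <- df %>%
  filter(type == "normal") %>%   # the only place the filter is written
  {
    ggplot(data = ., aes(x = data)) +
      geom_histogram(bins = 100) +
      # "." is the filtered data, so the lines match the histogram
      geom_vline(xintercept = sds_around_the_mean(.$data, multiplier = 3),
                 linetype = "dashed", color = "red") +
      geom_vline(xintercept = sds_around_the_mean(.$data, multiplier = 2),
                 linetype = "dashed")
  }
p
```

Removing or changing the filter() line changes the histogram and both sets of vertical lines together, with no edits needed inside geom_vline().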