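I'm trying to make a histogram with ggplot in which more than 95% of the data are 0 and the rest lie between 1 and 55. I don't want to show the zeros on the histogram, but I do want them to count toward the total, so that the other percentages stay low. I have tried two approaches, but in both the percentages for the remaining data come out wrong because the zeros are left out of the calculation.
My first approach: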
set1 %>% filter(total>0) %>%
  ggplot(aes(x=total, fill=lowcost)) +
  geom_histogram(binwidth=1, aes(y = (..count..)/sum(..count..)), col=I("black")) +
  scale_color_grey() +
  scale_fill_grey(start = .85, end = .85) +
  theme_linedraw() +
  guides(fill = "none", cols='none') +
  geom_vline(aes(xintercept=10, size='Low target'),
             color="black", linetype=5) +
  geom_vline(aes(xintercept=50, size='High target'),
             color="black", linetype="dotted") +
  scale_size_manual(values = c(.5, 0.5), guide=guide_legend(title = "Target", override.aes = list(linetype=c(3,5), color=c('black', 'black')))) +
  scale_y_continuous(labels=scales::percent) +
  scale_x_continuous(breaks = c(seq(0,50,10), 55), labels = c(seq(0, 50, 10), '>55'), limits = c(0, 60)) +
  facet_grid(cols = vars(lowcost)) +
  ggtitle("Ask Set 1 ") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("Total donation ($)") +
  ylab("Percent")
My second approach does not filter out the zeros but instead restricts the x-axis so that it excludes them. This does not work either:
set1 %>%
  ggplot(aes(x=total, fill=lowcost)) +
  geom_histogram(binwidth=1, aes(y = (..count..)/sum(..count..)), col=I("black")) +
  scale_color_grey() +
  scale_fill_grey(start = .85, end = .85) +
  theme_linedraw() +
  guides(fill = "none", cols='none') +
  geom_vline(aes(xintercept=10, size='Low target'),
             color="black", linetype=5) +
  geom_vline(aes(xintercept=50, size='High target'),
             color="black", linetype="dotted") +
  scale_size_manual(values = c(.5, 0.5), guide=guide_legend(title = "Target", override.aes = list(linetype=c(3,5), color=c('black', 'black')))) +
  scale_y_continuous(labels=scales::percent) +
  scale_x_continuous(breaks = c(seq(0,50,10), 55), labels = c(seq(0, 50, 10), '>55'), limits = c(0.01, 60)) +
  facet_grid(cols = vars(lowcost)) +
  ggtitle("Ask Set 1 ") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("Total donation ($)") +
  ylab("Percent")
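Both produce a histogram like the one below. The tallest bar in the left-hand histogram should actually be 1.19%.

The percentages in the left-hand histogram should be as follows:

The percentages in the right-hand histogram should be as follows: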

Answer from a uj5u.com user:
I think you can get what you want by "clipping" with coord_cartesian. Try this (untested):
set1 %>%
  # filter(total>0) %>% # comment this out, do not filter
  ggplot(aes(x=total, fill=lowcost)) +
  coord_cartesian(xlim = c(1, NA)) + # start at 1, extend to the normal limit
  geom_histogram(binwidth=1, aes(y = (..count..)/sum(..count..)), col=I("black")) +
  ... # rest unchanged
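A note on why the second approach in the question fails while coord_cartesian can help: limits set through scale_x_continuous() discard out-of-range rows before the counts are computed, so the zeros disappear from the denominator, whereas coord_cartesian() only zooms the view. Below is a minimal sketch on made-up data that compares the two. The names toy and bar_height_at are hypothetical, after_stat() assumes ggplot2 3.3.0 or later, and the lower limit is set to 0.5 rather than 1 on the assumption that binwidth = 1 bins are centred on the integers, so the bar at 1 is not cut in half.

```r
library(ggplot2)

# Hypothetical toy data: 10 zeros, 5 ones, 5 twos
toy <- data.frame(total = c(rep(0, 10), rep(1, 5), rep(2, 5)))

base <- ggplot(toy, aes(x = total)) +
  geom_histogram(binwidth = 1, aes(y = after_stat(count / sum(count))))

# Scale limits: rows outside the limits are dropped before binning,
# so the zeros no longer contribute to sum(count)
p_scale <- base + scale_x_continuous(limits = c(0.5, 2.5))

# Coord limits: all rows are binned, the plot is only zoomed afterwards
p_coord <- base + coord_cartesian(xlim = c(0.5, 2.5))

# Hypothetical helper: height of the bar whose centre is closest to x0
bar_height_at <- function(plot, x0) {
  d <- ggplot_build(plot)$data[[1]]
  d$y[which.min(abs(d$x - x0))]
}

h_scale <- bar_height_at(p_scale, 1)  # share of ones among non-zero rows only
h_coord <- bar_height_at(p_coord, 1)  # share of ones among all rows
```

Comparing h_scale with h_coord shows whether the zeros were counted. Building p_scale is expected to emit a warning about removed rows, which is the dropped zeros.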
Answer from a uj5u.com user:
Maybe try something like this:
library(dplyr)
library(ggplot2)

# Test data and expected outcome
set1 <- tibble(total=c(rep(0,10), rep(1,5), rep(2,5)))
set1 %>% count(total) %>% mutate(percent = n/sum(n))

# First, compute the percentage and store it in a temporary variable
# Then, use the percentage variable with the "identity" option for the histogram
# You can then either filter out the zeros first, or change the limits
set1 %>%
  count(total) %>%
  mutate(percent = n/sum(n)) %>%
  filter(total>0) %>%
  ggplot(aes(x=total, y=percent)) +
  geom_histogram(stat="identity") +
  scale_x_continuous(limits = c(0, 3)) +
  scale_y_continuous(labels=scales::percent) +
  ylab("Percent")
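Since the plot in the question is faceted by lowcost, the same pre-computation may need to be done per group, so that each panel's percentages use that panel's own total with the zeros included. Here is a minimal sketch under that assumption. The toy data and the lowcost values "A" and "B" are invented for illustration, and geom_col() is swapped in for geom_histogram(stat = "identity"). If the percentages should instead be relative to the whole data set, drop the group_by() step.

```r
library(dplyr)
library(ggplot2)

# Hypothetical data: group A has 8 zeros, group B has 6 zeros
toy <- tibble(
  total   = c(rep(0, 8), 1, 2, rep(0, 6), 1, 1, 2, 2),
  lowcost = c(rep("A", 10), rep("B", 10))
)

pct <- toy %>%
  count(lowcost, total) %>%
  group_by(lowcost) %>%
  mutate(percent = n / sum(n)) %>%  # denominator still includes each group's zeros
  ungroup() %>%
  filter(total > 0)                 # hide the zero bar only after computing percent

p <- ggplot(pct, aes(x = total, y = percent)) +
  geom_col(width = 1, colour = "black", fill = "grey85") +
  scale_y_continuous(labels = scales::percent) +
  facet_grid(cols = vars(lowcost)) +
  xlab("Total donation ($)") +
  ylab("Percent")
```

Printing p draws one panel per lowcost value, and the pct table can be inspected directly to check the bar heights before plotting.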

