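I have a dataset of reddit posts in which each row shows the post, its date, the sentiment an ML model predicted from the post's content, and which politician (if any) the post is directed at.
Here is a sample of the data: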
post date mood directed_to_whom
Cartman 2012-09-03. negative Romney
Cartman 2012-09-06. negative Romney
Cartman 2012-09-13. negative Romney
Cartman 2012-09-15. neutral Bush
Mackey 2012-09-03. negative Bush
Mackey 2012-09-08. neutral Bush
Mackey 2012-09-13. neutral post
Garrison 2012-09-03. negative Romney
Garrison 2012-09-04. negative pre
Garrison 2012-09-04. negative pre
Garrison 2012-09-05. negative Obama
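I created a chart showing the monthly share of negative, neutral, and positive posts over the whole period (see below). What I would like now is a variable that measures the number/share of negative posts directed at Obama, or the number/share of positive posts directed at Romney, but I am not sure whether that is possible.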
ggplot(both_group, aes(x = as.Date(month_year), fill = sentiment, y = sentiment_percentage)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
  xlab("Sentiment") +
  theme(plot.title = element_text(size = 18, face = "bold")) +
  scale_y_continuous(name = "Sentiment share") +
  theme_classic() +
  theme(plot.title = element_text(size = 5, face = "bold"),
        axis.text.x = element_text(angle = 90, vjust = 0.5))
Here is the output:

A uj5u.com user replied:
How about something like this? P.S. I edited your data so that the plot in the example is more interesting.
library(tidyverse)

dat |>
  # extract the month number from each date
  mutate(month = lubridate::ymd(date) |>
           lubridate::month()) |>
  # count posts per month, mood, and target
  count(month, mood, directed_to_whom) |>
  # share of each mood within a month/target pair
  group_by(month, directed_to_whom) |>
  mutate(freq = n / sum(n)) |>
  # keep only the two combinations of interest
  filter((mood == "negative" & directed_to_whom == "Obama") |
           (mood == "positive" & directed_to_whom == "Romney")) |>
  unite(grp, mood, directed_to_whom, sep = " toward ") |>
  ggplot(aes(month, freq, color = grp)) +
  geom_point() +
  geom_line()

Sample data:
dat <- read_table("post date mood directed_to_whom
Cartman 2012-09-03. negative Romney
Cartman 2012-09-06. positive Romney
Cartman 2012-09-13. negative Romney
Cartman 2012-09-15. neutral Bush
Mackey 2012-09-03. negative Obama
Mackey 2012-09-08. neutral Obama
Mackey 2012-09-13. neutral Obama
Garrison 2012-09-03. positive Romney
Garrison 2012-09-04. negative Obama
Garrison 2012-09-04. negative Obama
Garrison 2012-1010-04. negative Obama
Garrison 2012-10-04. positive Obama
Garrison 2012-09-04. positive Obama
Garrison 2012-09-04. negative Obama
Garrison 2012-11-04. negative Obama
Cartman 2012-09-06. positive Romney
Cartman 2012-10-06. positive Romney
Cartman 2012-10-06. positive Romney
Cartman 2012-10-06. neutral Romney
Cartman 2012-11-06. negative Romney
Cartman 2012-12-06. positive Romney
Garrison 2012-11-04. positive Obama
Garrison 2012-12-05. negative Obama")
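The question asks for a variable rather than a plot, so it may help to see the counts and shares as columns of a monthly summary table. The following is only a minimal sketch: `posts` is hypothetical toy data with the same columns as the sample, and the column names `n_neg_obama`, `share_neg_obama`, and so on are invented for illustration. Note that the shares here are taken over all posts in a month, which differs from `freq` above, where the denominator is the posts directed at the same politician in that month.

```r
library(dplyr)

# Hypothetical toy data with the same columns as the question's data set
posts <- tibble::tribble(
  ~post,      ~date,        ~mood,      ~directed_to_whom,
  "Cartman",  "2012-09-03", "negative", "Romney",
  "Cartman",  "2012-09-06", "positive", "Romney",
  "Mackey",   "2012-09-03", "negative", "Obama",
  "Mackey",   "2012-09-08", "neutral",  "Obama",
  "Garrison", "2012-10-04", "negative", "Obama",
  "Garrison", "2012-10-04", "positive", "Obama",
  "Cartman",  "2012-10-06", "positive", "Romney"
)

monthly <- posts %>%
  # "2012-09-03" -> "2012-09"
  mutate(month_year = format(as.Date(date), "%Y-%m")) %>%
  group_by(month_year) %>%
  summarise(
    n_posts          = n(),
    # counts of the two combinations of interest
    n_neg_obama      = sum(mood == "negative" & directed_to_whom == "Obama"),
    n_pos_romney     = sum(mood == "positive" & directed_to_whom == "Romney"),
    # shares relative to all posts in that month (an assumption)
    share_neg_obama  = n_neg_obama / n_posts,
    share_pos_romney = n_pos_romney / n_posts,
    .groups = "drop"
  )

print(monthly)
```

If the share should instead be relative to posts about the same politician, one option would be to group by `month_year` and `directed_to_whom` before summarising.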
A uj5u.com user replied:
Something like this might also be useful:
library(tidyverse)
dat <- read_table("post date mood directed_to_whom
Cartman 2012-09-03. negative Romney
Cartman 2012-09-06. positive Romney
Cartman 2012-09-13. negative Romney
Cartman 2012-09-15. neutral Bush
Mackey 2012-09-03. negative Obama
Mackey 2012-09-08. neutral Obama
Mackey 2012-09-13. neutral Obama
Garrison 2012-09-03. positive Romney
Garrison 2012-09-04. negative Obama
Garrison 2012-09-04. negative Obama
Garrison 2012-1010-04. negative Obama
Garrison 2012-10-04. positive Obama
Garrison 2012-09-04. positive Obama
Garrison 2012-09-04. negative Obama
Garrison 2012-11-04. negative Obama
Cartman 2012-09-06. positive Romney
Cartman 2012-10-06. positive Romney
Cartman 2012-10-06. positive Romney
Cartman 2012-10-06. neutral Romney
Cartman 2012-11-06. negative Romney
Cartman 2012-12-06. positive Romney
Garrison 2012-11-04. positive Obama
Garrison 2012-12-05. negative Obama")
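data_new <- dat %>%
  # month_year is the first 7 characters of the date, e.g. "2012-09"
  mutate(month_year = substr(date, 1, 7)) %>%
  group_by(month_year, mood, directed_to_whom)

# one panel per poster; points are coloured by mood
ggplot(data_new, mapping = aes(x = month_year, y = directed_to_whom, color = mood)) +
  geom_jitter() +
  facet_wrap(~post)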
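The dates in the sample data carry a trailing period (for example `2012-09-03.`), and one entry (`2012-1010-04.`) does not look like a valid date at all. If real dates are needed rather than a text prefix, a cleaning step along these lines might help. This is a sketch in base R under the assumption that valid dates follow `YYYY-MM-DD`; the vector `raw_dates` is a hypothetical stand-in for the `date` column.

```r
# Hypothetical stand-in for the date column, including one malformed entry
raw_dates <- c("2012-09-03.", "2012-10-04.", "2012-1010-04.")

# drop a single trailing period, if present
clean <- sub("\\.$", "", raw_dates)

# accept only strings shaped exactly like YYYY-MM-DD
ok <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", clean)

# malformed entries become NA instead of being silently truncated
parsed <- as.Date(ifelse(ok, clean, NA_character_))
month_year <- format(parsed, "%Y-%m")

print(data.frame(raw_dates, parsed, month_year))
```

Rows whose `parsed` value is `NA` can then be inspected and corrected by hand before plotting.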

