Trying to make a ggplot with two lines

The data can be found at: https://www.kaggle.com/tovarischsukhov/southparklines

SP = read.csv("/Users/michael/Desktop/stat 479 proj data/All-seasons.csv")
SP$Season = as.numeric(SP$Season)
SP$Episode = as.numeric(SP$Episode)

Clean.Boys = SP  %>% select(Season, Episode, Character) %>% 
  arrange(Season, Episode, Character) %>% 
  filter(Character == "Kenny"   | Character == "Cartman") %>% 
  group_by(Season, Episode) 

count = table(Clean.Boys)
count = as.data.frame(count)
Clean = count %>% pivot_wider(names_from = Character, values_from = Freq) %>% group_by(Episode)

  Season Episode Cartman Kenny
  <fct>  <fct>     <int> <int>
1 1      1            85     5
2 2      1             1     0
3 3      1            43    19
4 4      1            83     6
5 5      1            37     3
6 6      1            67     0

I am trying to use ggplot to make a plot with two lines, one for the Cartman variable and one for the Kenny variable. My two questions are:

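Is my data in the right format to draw the plot with geom_line(), or do I have to pivot it longer?
I want to plot the x scale as a continuous variable, similar to a date, except that it is season and episode. For example, the first plotted point is season 1 episode 1, then season 1 episode 2, and so on. I am stuck on how to do this with season and episode in separate columns, and even if I combine them I am not sure what the right format would be.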

Answer from a uj5u.com user:

In this example I have used readr::read_csv to read the file and set the variable types in the call, which saves doing it in separate lines of code.

The frequency count can be done with dplyr::summarise inside the pipe workflow.
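As a rough illustration of that counting step, here is a small sketch on made-up rows (the `toy` data frame is hypothetical and only mimics the Season, Episode and Character columns of the dataset). It also shows `dplyr::count()`, which is assumed here as an equivalent shorthand rather than something the answer itself uses.

```r
library(dplyr)

# Hypothetical toy rows: one row per spoken line
toy <- tibble(
  Season    = c(1, 1, 1, 1, 1),
  Episode   = c(1, 1, 1, 2, 2),
  Character = c("Cartman", "Cartman", "Kenny", "Cartman", "Kenny")
)

# Count lines per character per episode with group_by() + summarise()
by_summarise <- toy %>%
  group_by(Season, Episode, Character) %>%
  summarise(count = n(), .groups = "drop")

# The same count written with the count() shorthand
by_count <- count(toy, Season, Episode, Character, name = "count")

print(by_summarise)
```

Either form leaves the data in long format, one row per season, episode and character, which is the shape ggplot works with most easily.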

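I am not sure what you really mean by wanting to keep the season and episode data as a continuous variable; you would have to be more explicit about how you want it to look. The approach I have taken is to show season and episode with minimal text: by default, seasons and episodes sort in numeric order, but once they are combined into a character string they have to be coerced back into numeric order with factor. Another option might be to facet by season.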
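To make the ordering point concrete, here is a minimal sketch on hypothetical labels. It uses "-" as the separator purely for readability (the answer's code uses a newline), and it sets the levels explicitly with `levels = unique(...)`, which is one possible way to pin the order and not necessarily what the answer relies on.

```r
# Hypothetical labels, listed in the order of data sorted by season and episode
labels <- c("1-1", "1-2", "1-10", "2-1")

# Default factor(): levels are sorted as text, so "1-10" comes before "1-2"
alphabetical <- factor(labels)

# Explicit levels: keep the order of first appearance, i.e. the sorted data order
by_appearance <- factor(labels, levels = unique(labels))

print(levels(alphabetical))
print(levels(by_appearance))
```

With explicit levels the axis order no longer depends on how the labels happen to sort as text, provided the data were arranged by season and episode first.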

ggplot likes data in long format, so there is no need to convert the data to wide format.
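If the data are already wide, as in the table in the question, a sketch like the following could bring them back to long format. The `wide` tibble holds illustrative values only, and the column names `Character` and `count` are chosen here to match the answer's naming.

```r
library(tibble)
library(tidyr)
library(ggplot2)

# Illustrative wide table: one column per character (values are made up)
wide <- tibble(
  Season  = c(1, 1, 1),
  Episode = c(1, 2, 3),
  Cartman = c(85, 43, 37),
  Kenny   = c(5, 19, 3)
)

# One row per episode and character, with the counts in a single column
long <- pivot_longer(
  wide,
  cols      = c(Cartman, Kenny),
  names_to  = "Character",
  values_to = "count"
)

# A single geom_line() layer then draws one line per character
p <- ggplot(long, aes(Episode, count, colour = Character)) +
  geom_line()
```

Mapping `colour = Character` is what splits the single layer into two lines, so no second `geom_line()` call is needed.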

To keep the chart readable, only the first 80 observations are shown.

library(readr)
library(dplyr)
library(ggplot2)

# Read Season and Episode as numbers, Character and Line as text
SP <- read_csv("...your file path.../All-seasons.csv", col_types = "nncc")

Clean.Boys <- 
  SP  %>% 
  select(-Line) %>% 
  arrange(Season, Episode, Character) %>% 
  filter(Character == "Kenny"  | Character == "Cartman") %>% 
  group_by(Season, Episode, Character) %>% 
  # Count the lines per character per episode
  summarise(count = n(), .groups = "keep") %>%
  # Two-line axis label: season on top, episode underneath
  mutate(x_lab = factor(paste(Season, Episode, sep = "\n"))) %>% 
  head(n = 80)

ggplot(Clean.Boys) +
  geom_line(aes(x_lab, count, group = Character, colour = Character)) +
  labs(x = "Season and episode")


Created on 2022-02-20 by the reprex package (v2.0.1)
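The faceting alternative mentioned in this answer is not shown in its code. A minimal sketch of what it might look like follows, using made-up counts in a hypothetical `toy` data frame; the axis titles are also just placeholders.

```r
library(ggplot2)

# Made-up counts for two seasons, already in long format
toy <- data.frame(
  Season    = rep(c(1, 2), each = 6),
  Episode   = rep(rep(1:3, each = 2), times = 2),
  Character = rep(c("Cartman", "Kenny"), times = 6),
  count     = c(85, 5, 43, 19, 37, 3, 60, 8, 51, 2, 70, 11)
)

# One panel per season; the episode number stays a plain numeric x axis
p <- ggplot(toy, aes(Episode, count, colour = Character)) +
  geom_line() +
  facet_wrap(~ Season, labeller = label_both) +
  labs(x = "Episode", y = "Lines spoken")
```

Faceting avoids the combined season-and-episode label altogether, at the cost of splitting the timeline into separate panels.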

Answer from a uj5u.com user:

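The trick is to gather the columns you want to map into a single variable. I did not know how you wanted to lay out the plot, that is, what goes on the x and y axes, so I made a dummy plot. For the continuous-variable part, you can convert your values to integer or numeric with as.integer() or as.numeric() and then use them on a continuous scale. You can check the structure of your data by calling str(df), which shows the class of each variable; if any are factor or character, convert them to numeric.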
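One caveat worth keeping in mind when converting: calling as.numeric() directly on a factor returns the internal level codes, not the values printed on screen. The sketch below uses a hypothetical factor `f` to show the usual workaround of going through as.character() first.

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "2", "30"))

# Inspect the class and levels, as str(df) would for a whole data frame
str(f)

# Direct conversion gives the integer level codes
codes <- as.numeric(f)

# Converting via character recovers the labelled values
values <- as.numeric(as.character(f))

print(codes)
print(values)  # 10 2 30
```

The two results only coincide when the levels happen to be 1, 2, 3, ... in order, so the character route is the safer default for columns that came out of table().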

#libraries
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.0.5
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
#> Warning: package 'tidyr' was built under R version 4.0.3

#your code
SP <- read.csv("C:/Users/saura/Desktop/All-seasons.csv")
SP$Season = as.numeric(SP$Season)
#> Warning: NAs introduced by coercion
SP$Episode = as.numeric(SP$Episode)
#> Warning: NAs introduced by coercion

Clean.Boys = SP  %>% select(Season, Episode, Character) %>% 
  arrange(Season, Episode, Character) %>% 
  filter(Character == "Kenny"   | Character == "Cartman") %>% 
  group_by(Season, Episode) 
count = table(Clean.Boys)
count = as.data.frame(count)
Clean = count %>% pivot_wider(names_from = Character, values_from = Freq) %>% group_by(Episode)

# Above is your code; as I do not know what you want on your axes, this is a dummy plot
new_df <- Clean %>%
  gather(-Season,-Episode, key = "Views", value = "numbers")

ggplot(data = new_df, aes(
  as.numeric(Episode),
  numbers,
  color = Views,
  group = Views
)) +
  geom_path()


Created on 2022-02-19 by the reprex package (v2.0.1)
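Neither answer builds the single date-like x axis the question asks about. One possible approach, sketched here on hypothetical data, is to number the distinct season and episode pairs in order and plot against that running index; the name `episode_index` is invented for this example.

```r
library(dplyr)

# Hypothetical long-format counts for two seasons of two episodes each
counts <- tibble(
  Season    = c(1, 1, 1, 1, 2, 2, 2, 2),
  Episode   = c(1, 1, 2, 2, 1, 1, 2, 2),
  Character = rep(c("Cartman", "Kenny"), times = 4),
  count     = c(85, 5, 43, 19, 60, 8, 51, 2)
)

# Number each distinct season/episode pair in chronological order
episodes <- counts %>%
  distinct(Season, Episode) %>%
  arrange(Season, Episode) %>%
  mutate(episode_index = row_number())

# Attach the running index to every row of the counts
indexed <- left_join(counts, episodes, by = c("Season", "Episode"))

print(indexed)
```

Mapping `episode_index` to x then gives a continuous axis that runs from season 1 episode 1 onward, and the `episodes` table can supply readable break labels if they are wanted.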
