我正在嘗試用 ggplot 和一個真實的資料集繪制一條二元回歸線以供練習。問題:以公里為單位的距離是否是選擇汽車作為前往足球場的交通工具的預測因素。
變數 A2 被二分化(1 = Auto(汽車)和 0 = kein Auto(無汽車)),現在稱為 A2_auto
dataset %>%
mutate(A2_auto = car::recode(.$A2,
"1 = 1; 2:9 = 0",
as.factor = FALSE)) -> dataset
dataset$A2_auto <- factor(dataset$A2_auto, labels = c("kein Auto",
"Auto"))
在我計算了決定系數(顯著但非常低的奇數比)之后,我想用 ggplot 繪制回歸曲線:
ggplot(data=dataset, aes(x=A21, y=A2_auto))
geom_point(alpha=.5)
stat_smooth(method="glm.fit", se=FALSE, method.args = list(family=binomial))
但我收到一條警告訊息:
>`geom_smooth()` using formula 'y ~ x'
Warnmeldung:
Computation failed in `stat_smooth()`:
Unused Argument (data = data)
散點圖中沒有回歸線。想不通為什么:

這是資料框的結構:
'data.frame': 689 obs. of 3 variables:
$ A2 : dbl lbl [1:689] 1, 1, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 3, 6, 6, 6, 6, 6, 6, 6...
..@ label : chr "Mit welchem Verkehrsmittel legen Sie die gr??te Distanz zum Stadion zurück, wenn Sie ein Bundesliga-Heimspiel b"| __truncated__
..@ format.spss : chr "F40.0"
..@ display_width: int 0
..@ labels : Named num 1 2 3 4 5 6 7 8 9
.. ..- attr(*, "names")= chr [1:9] "PKW" "Bahn (Fernverkehr)" "Bahn (Nahverkehr)" "Fernbus" ...
$ A21 : num 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "label")= chr "Distanz in km"
..- attr(*, "format.spss")= chr "F8.2"
..- attr(*, "display_width")= int 0
$ A2_auto: Factor w/ 2 levels "kein Auto","Auto": 2 2 1 1 1 1 1 1 1 1 ...
謝謝您的幫助!
Edit1: 這是 dput(head(dataset,50)) 的輸出:
structure(list(A2 = structure(c(1, 1, 6, 6, 6, 7, 7, 7, 7, 7,
7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 6, 6, 6, 6, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 3, 6, 6, 6, 6, 6, 6), label = "Mit welchem Verkehrsmittel legen Sie die gr??te Distanz zum Stadion zurück, wenn Sie ein Bundesliga-Heimspiel besuchen? - Selected Choice", format.spss = "F40.0", display_width = 0L, labels = c(PKW = 1,
`Bahn (Fernverkehr)` = 2, `Bahn (Nahverkehr)` = 3, Fernbus = 4,
`Fan-/Reisebus` = 5, ?PNV = 6, Fahrrad = 7, `Zu Fu?` = 8, Sonstige = 9
), class = c("haven_labelled", "vctrs_vctr", "double")), A21 = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4,
6, 6, 6, 6, 6, 6, 6), A2_auto = structure(c(2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("kein Auto",
"Auto"), class = "factor")), row.names = c(NA, 50L), class = "data.frame")
When I change the model from glm.fit to glm another Warning Message is coming up:
ggplot(data=dataset, aes(x=A21, y=A2_auto))
geom_point(alpha=.5)
stat_smooth(method="glm", se=FALSE, method.args = list(family=binomial))
Output:
`geom_smooth()` using formula 'y ~ x'
Warnmeldungen:
1: glm.fit: algorithm did not converge
2: Computation failed in `stat_smooth()`:
y values must be 0 <= y <= 1
I also dichotomized the variable to 0 and 1 (without factors) and the same error occurs:
dataset %>%
mutate(A2_auto = car::recode(.$A2,
"1 = 1; 2:9 = 0",
as.factor = TRUE)) -> dataset
`geom_smooth()` using formula 'y ~ x'
Warnmeldungen:
1: glm.fit: algorithm did not converge
2: Computation failed in `stat_smooth()`:
y values must be 0 <= y <= 1

I`ll try to make my example reproducible with mtcars as suggested in the comments.
uj5u.com熱心網友回復:
我想我已經找到了解決辦法。與資料集mtcars對比后,我仔細查看了資料集中的變數A2_auto,發現該變數畢竟不是數字。所以我再次轉換它并將其二分。此外,“glm”是評論中描述的正確方法。再次感謝評論中的建議!它現在起作用了。
轉載請註明出處,本文鏈接:https://www.uj5u.com/houduan/335804.html
標籤:r ggplot2 logistic-regression geom
下一篇:ifelse按行排序值
