Problem calculating the area under the curve in R

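I have a dataset of 50 samples, which I split into a training set and a test set. I fit an SVM on the training set and used the model to make predictions.

Below you can find the Predicted column from the SVM training data and the svm column from the test data.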

data <- structure(list(Samples = c("Sample1", "Sample2", "Sample3", "Sample4", 
"Sample5", "Sample6", "Sample7", "Sample8", "Sample9", "Sample10", 
"Sample11", "Sample12", "Sample13", "Sample14", "Sample15", "Sample16", 
"Sample17", "Sample18", "Sample19", "Sample20", "Sample21", "Sample22", 
"Sample23", "Sample24", "Sample25", "Sample26", "Sample27", "Sample28", 
"Sample29", "Sample30", "Sample31", "Sample32", "Sample33", "Sample34", 
"Sample35", "Sample36", "Sample37", "Sample38", "Sample39", "Sample40", 
"Sample41", "Sample42", "Sample43", "Sample44", "Sample45", "Sample46", 
"Sample47", "Sample48", "Sample49"), svm = c("typeA", "typeA", 
"typeA", "typeB", "typeB", "typeB", "typeB", "typeB", "typeA", 
"typeB", "typeA", "typeB", "typeA", "typeB", "typeA", "typeB", 
"typeB", "typeB", "typeA", "typeA", "typeB", "typeA", "typeB", 
"typeA", "typeB", "typeA", "typeA", "typeA", "typeA", "typeA", 
"typeA", "typeB", "typeB", "typeB", "typeB", "typeB", "typeB", 
"typeB", "typeA", "typeB", "typeA", "typeB", "typeB", "typeA", 
"typeA", "typeA", "typeA", "typeA", "typeB"), Predicted = c("typeA", 
"typeA", "typeA", "typeB", "typeB", "typeB", "typeB", "typeB", 
"typeA", "typeB", "typeA", "typeA", "typeA", "typeB", "typeA", 
"typeB", "typeB", "typeB", "typeA", "typeA", "typeB", "typeA", 
"typeB", "typeA", "typeB", "typeA", "typeA", "typeA", "typeA", 
"typeA", "typeA", "typeB", "typeB", "typeB", "typeB", "typeA", 
"typeB", "typeB", "typeA", "typeA", "typeB", "typeB", "typeB", 
"typeA", "typeA", "typeA", "typeA", "typeA", "typeB")), row.names = c(NA, 
-49L), class = "data.frame")

I added a pred2 column as follows:

data$pred2 <- ifelse(data$svm=="typeA", 1, 0)

I used the pROC package to get the AUC:

library(pROC)
res.roc <- roc(data$Predicted, data$pred2)
plot.roc(res.roc, print.auc = TRUE, main="")


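I have read several posts saying that AUC (area under the curve) describes a model's performance better than accuracy does.

I am confused about whether what I calculated is really the AUC or just the accuracy. Can anyone tell me whether this is correct? Is it enough to check the performance of the model?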

Answer from a uj5u.com user:

I think this question would be better asked on Cross Validated, but accuracy != AUC.
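To make the difference concrete, here is a minimal sketch in base R. The ten labels below are hypothetical and are not the data from the question. When the predictor passed to a ROC routine is itself a hard class label, the ROC curve has only one interior point, and a trapezoid-rule AUC reduces to the mean of sensitivity and specificity (balanced accuracy), which is generally not the same number as plain accuracy.

```r
# Hypothetical true classes and hard predictions (illustration only)
truth     <- c("typeA", "typeA", "typeA", "typeA", "typeA", "typeA",
               "typeB", "typeB", "typeB", "typeB")
predicted <- c("typeA", "typeA", "typeA", "typeA", "typeA", "typeB",
               "typeA", "typeA", "typeB", "typeB")

# Plain accuracy: share of samples classified correctly
accuracy <- mean(predicted == truth)

# Per-class rates, treating typeA as the positive class
sensitivity <- mean(predicted[truth == "typeA"] == "typeA")
specificity <- mean(predicted[truth == "typeB"] == "typeB")
balanced_accuracy <- (sensitivity + specificity) / 2

# ROC "curve" for a two-valued predictor: (0,0), one interior point, (1,1)
fpr <- c(0, 1 - specificity, 1)
tpr <- c(0, sensitivity, 1)

# Trapezoid rule over those three points
auc_labels <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

cat("accuracy:", accuracy, "\n")
cat("balanced accuracy:", balanced_accuracy, "\n")
cat("AUC from hard labels:", auc_labels, "\n")
```

In this toy case the accuracy is 0.7 while the label-based AUC equals the balanced accuracy of about 0.667, so the two metrics differ even on the same predictions.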

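Here is an article that describes the differences, along with some other metrics that may be better for evaluating the performance of machine learning algorithms: https://neptune.ai/blog/f1-score-accuracy-roc-auc-pr-auc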

In short, accuracy requires you to pick a cutoff, whereas AUC does not.
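A small sketch of what the cutoff dependence looks like, using only base R. The labels and continuous scores are made up for illustration, and accuracy_at is a hypothetical helper name. Accuracy changes when the cutoff moves, while the AUC is computed from the ordering of the scores alone, here as the share of positive/negative pairs in which the positive sample gets the higher score.

```r
# Hypothetical labels (1 = positive) and continuous classifier scores
truth  <- c(1, 1, 1, 1, 0, 0, 0, 0)
scores <- c(0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1)

# Accuracy needs a cutoff to turn scores into class labels
accuracy_at <- function(cutoff) {
  mean((scores >= cutoff) == (truth == 1))
}
acc_050 <- accuracy_at(0.50)
acc_065 <- accuracy_at(0.65)

# AUC needs no cutoff: compare every positive score with every negative score
pos <- scores[truth == 1]
neg <- scores[truth == 0]
auc <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))  # ties count half

cat("accuracy at cutoff 0.50:", acc_050, "\n")
cat("accuracy at cutoff 0.65:", acc_065, "\n")
cat("AUC:", auc, "\n")
```

With these numbers the accuracy is 0.75 at a cutoff of 0.50 and 0.625 at 0.65, while the AUC stays at 0.8125 because it never refers to a cutoff.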

The pROC package uses the trapezoid rule to calculate the AUC. Check the help page for the pROC::auc function; it has a lot of information and references.
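For readers who want to see the trapezoid rule spelled out, here is a rough base R sketch. It is not pROC's actual implementation, and the labels and scores are hypothetical. It sweeps the cutoff from high to low, records the true and false positive rates at each step, and sums the trapezoids under the resulting ROC points.

```r
# Hypothetical labels (1 = positive) and continuous classifier scores
truth  <- c(1, 1, 1, 1, 0, 0, 0, 0)
scores <- c(0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1)

# Candidate cutoffs, from "nothing is positive" down to "everything is positive"
cutoffs <- c(Inf, sort(unique(scores), decreasing = TRUE))

# True and false positive rates at each cutoff
tpr <- sapply(cutoffs, function(k) mean(scores[truth == 1] >= k))
fpr <- sapply(cutoffs, function(k) mean(scores[truth == 0] >= k))

# Trapezoid rule: width along the FPR axis times the average TPR height
auc_trap <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

cat("trapezoid AUC:", auc_trap, "\n")
```

If pROC is installed, the result can be compared against auc(roc(truth, scores)) as a sanity check.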

