ML with sklearn: a detailed guide to commonly used functions in sklearn.metrics (such as confusion_matrix), their parameters, and how to use them

Commonly used functions and their parameters in sklearn.metrics

confusion_matrix

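Recommended articles
ML: Evaluation metrics for classification problems (ER / confusion matrix, P-R-F1 / ROC-AUC / RP / mAP): introduction, usage, code implementation, and case studies
CNN performance metrics: commonly used performance metrics in convolutional neural networks (IOU/AP/mAP, confusion matrix): introduction and usage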

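Commonly used functions and their parameters in sklearn.metrics

The confusion_matrix function explained

Return value: a confusion matrix whose entry in row i and column j is the number of samples whose true label is class i and whose predicted label is class j. In the binary case the layout is: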

                 Predicted
                 0      1
Actual     0     TN     FP
           1     FN     TP
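
To make the row/column convention concrete, here is a minimal dependency-free sketch of the counting rule. The helper name `count_confusion` is hypothetical and this is not sklearn's implementation (which is listed further down); it only mirrors the definition, assuming rows are true labels and columns are predicted labels.

```python
def count_confusion(y_true, y_pred, labels=None):
    """Return a nested list C where C[i][j] counts the samples whose true
    label is labels[i] and whose predicted label is labels[j]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if labels is None:
        # Default: every label seen in either sequence, in sorted order.
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        # Pairs that involve a label outside `labels` are skipped.
        if t in index and p in index:
            matrix[index[t]][index[p]] += 1
    return matrix


# Multiclass data taken from the docstring examples.
multi = count_confusion([2, 0, 2, 2, 0, 1], [0, 0, 2, 2, 0, 2])
# multi == [[2, 0, 0], [0, 0, 1], [1, 0, 2]]

# Binary case: reading the 2x2 matrix row by row gives (tn, fp, fn, tp).
binary = count_confusion([0, 1, 0, 1], [1, 1, 1, 0])
(tn, fp), (fn, tp) = binary
# (tn, fp, fn, tp) == (0, 2, 1, 1)

# Passing `labels` reorders the matrix or restricts it to a subset.
subset = count_confusion(
    ["cat", "ant", "cat", "cat", "ant", "bird"],
    ["ant", "ant", "cat", "cat", "ant", "cat"],
    labels=["ant", "cat"],
)
# subset == [[2, 0], [1, 2]]; the ("bird", "cat") pair is dropped
```

Note that other references may put predictions on the rows instead, so it is worth checking the axis convention before reading off TN/FP/FN/TP.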

The definition of confusion_matrix, found in sklearn.metrics._classification:

@_deprecate_positional_args
def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
                     normalize=None):
    """Compute confusion matrix to evaluate the accuracy of a classification.

    By definition a confusion matrix :math:`C` is such that :math:`C_{i, j}`
    is equal to the number of observations known to be in group :math:`i` and
    predicted to be in group :math:`j`.

    Thus in binary classification, the count of true negatives is
    :math:`C_{0,0}`, false negatives is :math:`C_{1,0}`, true positives is
    :math:`C_{1,1}` and false positives is :math:`C_{0,1}`.

    Read more in the :ref:`User Guide <confusion_matrix>`.

    Parameters
    ----------
    y_true : array-like of shape (n_samples,)
        Ground truth (correct) target values.

    y_pred : array-like of shape (n_samples,)
        Estimated targets as returned by a classifier.

    labels : array-like of shape (n_classes), default=None
        List of labels to index the matrix. This may be used to reorder
        or select a subset of labels.
        If ``None`` is given, those that appear at least once
        in ``y_true`` or ``y_pred`` are used in sorted order.

    sample_weight : array-like of shape (n_samples,), default=None
        Sample weights.

        .. versionadded:: 0.18

    normalize : {'true', 'pred', 'all'}, default=None
        Normalizes confusion matrix over the true (rows), predicted (columns)
        conditions or all the population. If None, confusion matrix will not be
        normalized.

    Returns
    -------
    C : ndarray of shape (n_classes, n_classes)
        Confusion matrix whose i-th row and j-th column entry indicates
        the number of samples with true label being i-th class
        and predicted label being j-th class.

    References
    ----------
    .. [1] `Wikipedia entry for the Confusion matrix
           <https://en.wikipedia.org/wiki/Confusion_matrix>`_
           (Wikipedia and other references may use a different
           convention for axes)


    Examples
    --------
    >>> from sklearn.metrics import confusion_matrix
    >>> y_true = [2, 0, 2, 2, 0, 1]
    >>> y_pred = [0, 0, 2, 2, 0, 2]
    >>> confusion_matrix(y_true, y_pred)
    array([[2, 0, 0],
           [0, 0, 1],
           [1, 0, 2]])

    >>> y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
    >>> y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
    >>> confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
    array([[2, 0, 0],
           [0, 0, 1],
           [1, 0, 2]])

    In the binary case, we can extract true positives, etc as follows:

    >>> tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0]).ravel()
    >>> (tn, fp, fn, tp)
    (0, 2, 1, 1)

    """
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    if y_type not in ("binary", "multiclass"):
        raise ValueError("%s is not supported" % y_type)

    if labels is None:
        labels = unique_labels(y_true, y_pred)
    else:
        labels = np.asarray(labels)
        n_labels = labels.size
        if n_labels == 0:
            raise ValueError("'labels' should contains at least one label.")
        elif y_true.size == 0:
            # np.int was removed in NumPy 1.24; the builtin int is equivalent
            return np.zeros((n_labels, n_labels), dtype=int)
        elif np.all([l not in y_true for l in labels]):
            raise ValueError("At least one label specified must be in y_true")

    if sample_weight is None:
        sample_weight = np.ones(y_true.shape[0], dtype=np.int64)
    else:
        sample_weight = np.asarray(sample_weight)

    check_consistent_length(y_true, y_pred, sample_weight)

    if normalize not in ['true', 'pred', 'all', None]:
        raise ValueError("normalize must be one of {'true', 'pred', "
                         "'all', None}")

    n_labels = labels.size
    label_to_ind = {y: x for x, y in enumerate(labels)}
    # convert yt, yp into index
    y_pred = np.array([label_to_ind.get(x, n_labels + 1) for x in y_pred])
    y_true = np.array([label_to_ind.get(x, n_labels + 1) for x in y_true])

    # intersect y_pred, y_true with labels, eliminate items not in labels
    ind = np.logical_and(y_pred < n_labels, y_true < n_labels)
    y_pred = y_pred[ind]
    y_true = y_true[ind]
    # also eliminate weights of eliminated items
    sample_weight = sample_weight[ind]

    # Choose the accumulator dtype to always have high precision
    if sample_weight.dtype.kind in {'i', 'u', 'b'}:
        dtype = np.int64
    else:
        dtype = np.float64

    # coo_matrix sums the weights of duplicate (row, col) pairs, which
    # accumulates the per-cell counts
    cm = coo_matrix((sample_weight, (y_true, y_pred)),
                    shape=(n_labels, n_labels), dtype=dtype).toarray()

    with np.errstate(all='ignore'):
        if normalize == 'true':
            cm = cm / cm.sum(axis=1, keepdims=True)
        elif normalize == 'pred':
            cm = cm / cm.sum(axis=0, keepdims=True)
        elif normalize == 'all':
            cm = cm / cm.sum()
        cm = np.nan_to_num(cm)

    return cm
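
The last few lines of the listing implement the `normalize` option. As a rough illustration of what each mode does, the sketch below applies the same three divisions to a plain nested list. The helper `normalize_confusion` is a hypothetical name and a simplified stand-in for the NumPy code above; it assumes non-negative counts, so an empty row or column can only produce 0/0, which it maps to 0.0 in the same spirit as `np.nan_to_num`.

```python
def normalize_confusion(cm, normalize=None):
    """Normalize a confusion matrix given as a list of rows.

    'true' divides each row by its sum, 'pred' divides each column by its
    sum, 'all' divides every cell by the grand total, None returns a copy.
    """
    if normalize not in ("true", "pred", "all", None):
        raise ValueError(
            "normalize must be one of {'true', 'pred', 'all', None}")
    if normalize is None:
        return [row[:] for row in cm]

    def safe_div(value, total):
        # A zero total means the whole row/column is empty: report 0.0.
        return value / total if total else 0.0

    if normalize == "true":
        return [[safe_div(v, sum(row)) for v in row] for row in cm]
    if normalize == "pred":
        n_cols = len(cm[0]) if cm else 0
        col_sums = [sum(row[j] for row in cm) for j in range(n_cols)]
        return [[safe_div(v, col_sums[j]) for j, v in enumerate(row)]
                for row in cm]
    grand_total = sum(sum(row) for row in cm)
    return [[safe_div(v, grand_total) for v in row] for row in cm]


# The 3x3 matrix from the docstring examples.
cm = [[2, 0, 0],
      [0, 0, 1],
      [1, 0, 2]]

by_true = normalize_confusion(cm, "true")  # each non-empty row sums to 1
by_pred = normalize_confusion(cm, "pred")  # each non-empty column sums to 1
by_all = normalize_confusion(cm, "all")    # all cells together sum to 1
```

With `normalize="true"` the diagonal holds the per-class recall, and with `normalize="pred"` it holds the per-class precision. The middle column here has no predictions at all, so it stays at zero instead of becoming NaN.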

Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/128088.html
