Cost function calculation for a neural network

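I am in week 5 of Andrew Ng's Machine Learning course on Coursera. I am working through this week's programming assignment in Matlab, and I chose to compute the cost J with a for-loop implementation. Here is my function.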

function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
%   [J grad] = NNCOSTFUNCTION(nn_params, hidden_layer_size, num_labels, ...
%   X, y, lambda) computes the cost and gradient of the neural network. The
%   parameters for the neural network are "unrolled" into the vector
%   nn_params and need to be converted back into the weight matrices. 

% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network

Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));

Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));


% Setup some useful variables
m = size(X, 1);

% add bias to X to create 5000x401 matrix
X = [ones(m, 1) X];
         
% You need to return the following variables correctly 
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));


% initialize summing terms used in cost expression
sum_i = 0.0;

% loop through each sample to calculate the cost
for i = 1:m

    % logical vector output for 1 example
    y_i = zeros(num_labels, 1);
    class = y(m);
    y_i(class) = 1;
    
    % first layer just equals features in one example 1x401
    a1 = X(i, :);
    
    % compute z2, a 25x1 vector
    z2 = Theta1*a1';
    
    % compute activation of z2
    a2 = sigmoid(z2);
    
    % add bias to a2 to create a 26x1 vector
    a2 = [1; a2];
    
    % compute z3, a 10x1 vector
    z3 = Theta2*a2;
    
    %compute activation of z3. returns output vector of size 10x1
    a3 = sigmoid(z3);
    h = a3;
    
    % loop through each class k to sum cost over each class
    for k = 1:num_labels        
        
        % sum_i returns cost summed over each class
        sum_i = sum_i + ((-1*y_i(k) * log(h(k))) - ((1 - y_i(k)) * log(1 - h(k))));
        
    end
        
end

J = sum_i/m;

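I know a vectorized implementation would be easier, but I don't understand why this implementation is wrong. With num_labels = 10 the function outputs J = 8.47, but the expected cost is 0.287629. I am computing J from the course's cost formula. Am I misunderstanding the calculation? My understanding is that the cost is computed for each of the 10 classes on every training example, and the costs of all 10 classes are then summed for each example. Is that incorrect, or did I fail to implement it correctly in my code? Thanks in advance.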
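
For reference, the formula in question is presumably the unregularized cost from the course. This is an assumption, since the formula itself is not shown here, but it matches the double loop and the final J = sum_i/m in the code above. K stands for num_labels, and the subscript k picks the k-th output unit:

```latex
\[
J(\Theta) = \frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
\left[ -y_k^{(i)} \log\!\big( (h_\Theta(x^{(i)}))_k \big)
       - \big(1 - y_k^{(i)}\big) \log\!\big( 1 - (h_\Theta(x^{(i)}))_k \big) \right]
\]
```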

A uj5u.com user replied:

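The problem is in the formula you are implementing.

The expression ((-1*y_i(k) * log(h(k))) - ((1 - y_i(k)) * log(1 - h(k)))) represents the loss for binary classification, where there are only 2 classes, so either

y_i is 0 so (1 - y_i) = 1
y_i is 1 so (1 - y_i) = 0

So you are effectively only considering the probability of the target class.

With 10 labels, however, it is not necessarily the case that one of (y_i) and (1 - y_i) is 0 and the other is 1.

You should correct the loss function implementation so that it considers only the probability of the target class and none of the other classes.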

A uj5u.com user replied:

My problem was the indexing. Instead of class = y(m) it should be class = y(i), because i is the loop index and m is 5000, the number of rows in the training data.
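
The indexing fix can be sketched on a small made-up problem. Everything below is an illustrative assumption rather than the course data: the toy X, y, Theta1 and Theta2 are hypothetical, and the anonymous sigmoid stands in for the course's sigmoid.m. The loop indexes y with i, and a vectorized computation of the same cost serves as a cross-check.

```matlab
% Stand-in for the course's sigmoid.m (assumption: plain logistic function)
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical toy problem: 3 examples, 2 features, 2 hidden units, 2 classes
X = [0.1 0.2; 0.3 0.4; 0.5 0.6];
y = [1; 2; 1];
Theta1 = [0.1 0.2 0.3; -0.1 0.0 0.1];    % hidden_layer_size x (input_layer_size + 1)
Theta2 = [0.2 -0.3 0.1; 0.0 0.4 -0.2];   % num_labels x (hidden_layer_size + 1)
num_labels = 2;

m = size(X, 1);
X = [ones(m, 1) X];                      % add the bias column

% Loop version with the corrected index
J = 0;
for i = 1:m
    y_i = zeros(num_labels, 1);
    y_i(y(i)) = 1;                       % y(i), not y(m): label of the current example
    a2 = [1; sigmoid(Theta1 * X(i, :)')];
    h  = sigmoid(Theta2 * a2);
    J  = J + sum(-y_i .* log(h) - (1 - y_i) .* log(1 - h));
end
J = J / m;

% Vectorized cross-check of the same cost
I = eye(num_labels);
Y = I(y, :);                             % one-hot rows, m x num_labels
A2 = [ones(m, 1) sigmoid(X * Theta1')];
H  = sigmoid(A2 * Theta2');
J_vec = sum(sum(-Y .* log(H) - (1 - Y) .* log(1 - H))) / m;

fprintf('loop J = %.6f, vectorized J = %.6f\n', J, J_vec);
```

With y(m) in place of y(i), every example would be scored against the label of the last training row, which inflates the cost whenever the labels differ.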

