How do I create a conditional PyTorch hook?

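I am learning about hooks and working with binarized neural networks. The problem is that my gradients are sometimes 0 in the backward pass. I am trying to replace those gradients with a fixed value.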

Say I have the following network:

import torch
import torch.nn as nn
import torch.optim as optim

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc1 = nn.Linear(1, 2)
        self.fc2 = nn.Linear(2, 3)
        self.fc3 = nn.Linear(3, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = torch.relu(x)        
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Model()

opt = optim.Adam(net.parameters())

And some features:

features = torch.rand((3,1))

I can train it normally using:

for i in range(10):
    opt.zero_grad()
    out = net(features)
    loss = torch.mean(torch.square(torch.tensor(5) - torch.sum(out)))
    loss.backward()
    opt.step()

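How can I attach a hook function that applies the following conditions during the backward pass (for each layer)?

If all the gradients in a single layer are 0, change them to 1.0.
If some of the gradients are 0 but at least one is nonzero, change the zero ones to 0.5.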

Answer from a uj5u.com user:

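You can attach a callback function to your nn.Module with nn.Module.register_full_backward_hook.

You will have to handle both cases: use torch.all to check whether all elements are equal to zero; otherwise (i.e. at least one element is nonzero), use torch.any to check whether at least one element is equal to zero.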
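The case split described above can be sketched on plain tensors, independent of any hook. The helper name `zero_case` and its string labels are made up for illustration and are not part of PyTorch:

```python
import torch

def zero_case(grad: torch.Tensor) -> str:
    """Classify a gradient tensor into the cases discussed above (hypothetical helper)."""
    is_zero = grad == 0
    if torch.all(is_zero):      # every entry is exactly zero
        return "all_zero"
    if torch.any(is_zero):      # at least one zero, but not all
        return "some_zero"
    return "no_zero"            # nothing to replace

case_all = zero_case(torch.zeros(3))
case_some = zero_case(torch.tensor([0.1, 0.0, 0.0]))
case_none = zero_case(torch.tensor([0.1, 0.2, 0.3]))
```

The order of the checks matters: torch.any is also true when every element is zero, so torch.all has to be tested first.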

def grad_mod(module, grad_inputs, grad_outputs):
    if module.weight.grad is None: # safety measure for last layer
        return None                # and layers w/ requires_grad=False

    flat = module.weight.grad.view(-1)  # flat view that shares memory with the gradient
    if torch.all(flat == 0):            # every gradient is zero
        flat.data.fill_(1.)
    elif torch.any(flat == 0):          # some, but not all, gradients are zero
        flat.data.scatter_(0, (flat == 0).nonzero()[:,0], value=.5)

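The instruction in the first clause fills every value with 1., while the instruction in the second clause replaces only the zero values with .5.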
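Both clauses rely on view(-1) returning a view that shares memory with the original tensor, so in-place edits on the flat view also change the gradient itself. A minimal sketch of that behaviour on a hand-made tensor follows; it uses masked_fill_ as a shorter stand-in for the scatter_ call, which is a substitution for illustration and not the code from the answer:

```python
import torch

# Hand-made stand-in for a weight gradient (values chosen for illustration).
grad = torch.tensor([[0.0947, 0.0], [0.0, 0.25]])

flat = grad.view(-1)               # 1-D view, shares storage with `grad`
flat.masked_fill_(flat == 0, 0.5)  # in place: write 0.5 wherever the value is 0

# `grad` was never assigned to, yet it now holds the replaced values.
print(grad)
```

Because the edit is in place, nothing needs to be returned or reassigned for the change to be visible through `grad`.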

Attach the hook to the nn.Module:

>>> net.fc3.register_full_backward_hook(grad_mod)

Here I use print statements before and after mutating flat to show the effect of the hook:

>>> net(torch.rand((3,1))).backward(torch.tensor([[0.],[1.],[2.]]))
tensor([0.0947, 0.0000, 0.0000]) # before
tensor([0.0947, 0.5000, 0.5000]) # after

>>> net(torch.rand((3,1))).backward(torch.tensor([[0.],[1.],[2.]]))
tensor([0., 0., 0.])             # before
tensor([1., 1., 1.])             # after

To apply this hook to multiple layers, you can wrap grad_mod and take advantage of the recursive behavior of nn.Module.apply:

>>> def apply_grad_mod(module):
...     if hasattr(module, 'weight'):
...         module.register_full_backward_hook(grad_mod)

The following will then apply the hook to the weights of all layers:

>>> net.apply(apply_grad_mod)
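register_full_backward_hook returns a handle with a remove() method, which is convenient if the hooks should only be active for part of training. The sketch below collects those handles; `attach_hooks`, `counting_hook`, and the small `demo` network are hypothetical names used only for this illustration, and the hook just records which modules it was called for:

```python
import torch
import torch.nn as nn

def attach_hooks(model: nn.Module, hook):
    """Register `hook` on every submodule that has a `weight`; return the handles."""
    handles = []
    for module in model.modules():
        if hasattr(module, "weight"):
            handles.append(module.register_full_backward_hook(hook))
    return handles

calls = []

def counting_hook(module, grad_input, grad_output):
    # Stand-in for grad_mod: only records the module type it fired for.
    calls.append(type(module).__name__)

demo = nn.Sequential(nn.Linear(1, 2), nn.ReLU(), nn.Linear(2, 1))
handles = attach_hooks(demo, counting_hook)

x = torch.rand(3, 1, requires_grad=True)
demo(x).sum().backward()
calls_with_hooks = len(calls)      # one call per Linear layer

for handle in handles:
    handle.remove()                # detach the hooks again
demo(x).sum().backward()
calls_after_removal = len(calls)   # unchanged: hooks no longer fire
```

Passing grad_mod instead of `counting_hook` should attach the same hooks as the net.apply call, with the added option of removing them later.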

Note: if you also want to affect the biases, you will have to extend this behavior!
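One possible way to cover the biases as well is to register a tensor hook on each parameter with torch.Tensor.register_hook instead of a module hook. This is a different technique from the one in the answer: the hook receives the gradient as its argument and returns the replacement, so it does not read module.weight.grad at all. The names `replace_zero_grads` and `attach_param_hooks` are assumptions made for this sketch, and the rule is applied per parameter tensor (weight and bias separately), not per layer:

```python
import torch
import torch.nn as nn

def replace_zero_grads(grad: torch.Tensor) -> torch.Tensor:
    """Return the gradient with zeros replaced according to the two rules."""
    is_zero = grad == 0
    if torch.all(is_zero):
        return torch.ones_like(grad)                  # all zero -> all 1.0
    if torch.any(is_zero):
        half = torch.full_like(grad, 0.5)
        return torch.where(is_zero, half, grad)       # only the zeros -> 0.5
    return grad                                       # nothing to change

def attach_param_hooks(model: nn.Module):
    """Hook every trainable parameter (weights and biases); return the handles."""
    return [p.register_hook(replace_zero_grads)
            for p in model.parameters() if p.requires_grad]

# Usage on a tiny layer: an all-zero input makes the weight gradient all zero.
layer = nn.Linear(2, 1)
handles = attach_param_hooks(layer)
layer(torch.zeros(1, 2)).sum().backward()
all_zero_weight_grad = layer.weight.grad.clone()      # expected to be all ones

layer.zero_grad()
layer(torch.tensor([[0.0, 2.0]])).sum().backward()
partial_weight_grad = layer.weight.grad.clone()       # zero entry becomes 0.5
```

The returned tensor is what gets accumulated into .grad, so the optimizer sees the modified values without any in-place edits.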


Tags: Python, machine learning, PyTorch, backpropagation
