VGG16 layer output doesn't make sense

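I have a VGG16 network without the last max-pooling, fully connected, and softmax layers. The network summary shows that the last layer's output size is (batchsize, 512, 14, 14). Feeding an image into the network gives me an output of size (batchsize, 512, 15, 15). How can I fix this?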

import torch
import torch.nn as nn
from torchsummary import summary

vgg16 = torch.hub.load('pytorch/vision:v0.10.0', 'vgg16', pretrained=True)
vgg16withoutLastFewLayers = nn.Sequential(*list(vgg16.children())[:-2][0][0:30]).cuda()

image = torch.zeros((1,3,244,244)).cuda()
output = vgg16withoutLastFewLayers(image)

summary(vgg16withoutLastFewLayers, (3,224,224))
print(output.shape)

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256, 56, 56]               0
           Conv2d-15          [-1, 256, 56, 56]         590,080
             ReLU-16          [-1, 256, 56, 56]               0
        MaxPool2d-17          [-1, 256, 28, 28]               0
           Conv2d-18          [-1, 512, 28, 28]       1,180,160
             ReLU-19          [-1, 512, 28, 28]               0
           Conv2d-20          [-1, 512, 28, 28]       2,359,808
             ReLU-21          [-1, 512, 28, 28]               0
           Conv2d-22          [-1, 512, 28, 28]       2,359,808
             ReLU-23          [-1, 512, 28, 28]               0
        MaxPool2d-24          [-1, 512, 14, 14]               0
           Conv2d-25          [-1, 512, 14, 14]       2,359,808
             ReLU-26          [-1, 512, 14, 14]               0
           Conv2d-27          [-1, 512, 14, 14]       2,359,808
             ReLU-28          [-1, 512, 14, 14]               0
           Conv2d-29          [-1, 512, 14, 14]       2,359,808
             ReLU-30          [-1, 512, 14, 14]               0
================================================================
torch.Size([1, 512, 15, 15])

A uj5u.com user replied:

The output shape should be [512, 14, 14], assuming the input image is [3, 224, 224]. Your input image has size [3, 244, 244]. For example,

# Use a 224x224 input, on the same device as the model
image = torch.zeros((1,3,224,224)).cuda()
output = vgg16withoutLastFewLayers(image)
print(output.shape)  # torch.Size([1, 512, 14, 14])

So when you increase the image size, the spatial size [W, H] of the output tensor increases as well.
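The arithmetic behind the two sizes can be sketched in plain Python. This is a minimal illustration, not part of the original answer: the helper name `truncated_vgg16_output_side` is hypothetical, and it assumes that the 3x3 convolutions (padding 1) keep the spatial size while each of the four remaining `MaxPool2d` layers (kernel 2, stride 2, default floor rounding) halves it.

```python
def truncated_vgg16_output_side(input_side: int, num_pools: int = 4) -> int:
    """Spatial side length after the first 30 VGG16 feature layers.

    Hypothetical helper: only the 2x2/stride-2 max-pool layers change the
    spatial size, and each one floors the halved value.
    """
    side = input_side
    for _ in range(num_pools):
        side = side // 2  # MaxPool2d(kernel_size=2, stride=2) rounds down
    return side


for side in (224, 244):
    print(side, "->", truncated_vgg16_output_side(side))
# 224 -> 14
# 244 -> 15
```

For 244 the chain is 244, 122, 61, 30, 15, which matches the `torch.Size([1, 512, 15, 15])` printed in the question.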

A uj5u.com user replied:

The two input shapes you use are not the same...

image = torch.zeros((1,3,244,244)).cuda()
output = vgg16withoutLastFewLayers(image)

summary(vgg16withoutLastFewLayers, (3,224,224))
print(output.shape)

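The difference: 244 vs. 224.

Because those VGG layers are only convolutional (and pooling) layers, the output size grows when you increase the input image size. If you applied a classification head directly on top of them (without global pooling or similar), this would cause problems, because such heads expect a fixed-size input. You are not doing that here, but keep it in mind.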
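To make the point about fixed-size heads concrete, here is a small sketch under stated assumptions. The names `pool` and `head`, the 7x7 pooling target, and the 10 output classes are all illustrative and not from the original post; zero tensors stand in for the truncated VGG16 output. It shows that an adaptive pooling layer maps both the 14x14 and the 15x15 feature maps to one fixed size, so a linear layer can follow.

```python
import torch
import torch.nn as nn

# Hypothetical head: pool to a fixed 7x7 grid, then classify into 10 classes.
pool = nn.AdaptiveAvgPool2d((7, 7))
head = nn.Linear(512 * 7 * 7, 10)

shapes = {}
for side in (14, 15):
    # Stand-in for the truncated VGG16 output at this spatial size
    features = torch.zeros(1, 512, side, side)
    pooled = pool(features)                  # always (1, 512, 7, 7)
    logits = head(torch.flatten(pooled, 1))  # always (1, 10)
    shapes[side] = (tuple(pooled.shape), tuple(logits.shape))
    print(side, shapes[side])
```

Without the pooling step, a linear layer sized for 512 * 14 * 14 inputs would reject the flattened 15x15 feature map, since the number of features would no longer match.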
