我有一個沒有最后一個最大池、完全連接和 softmax 層的 vgg16 網路。網路摘要顯示最后一層的輸出大小為(batchsize, 512, 14, 14). 將影像放入網路會給我一個輸出(batchsize, 512, 15, 15)。我該如何解決?
import torch
import torch.nn as nn
from torchsummary import summary
vgg16 = torch.hub.load('pytorch/vision:v0.10.0', 'vgg16', pretrained=True)
vgg16withoutLastFewLayers = nn.Sequential(*list(vgg16.children())[:-2][0][0:30]).cuda()
image = torch.zeros((1,3,244,244)).cuda()
output = vgg16withoutLastFewLayers(image)
summary(vgg16withoutLastFewLayers, (3,224,224))
print(output.shape)
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 64, 224, 224] 1,792
ReLU-2 [-1, 64, 224, 224] 0
Conv2d-3 [-1, 64, 224, 224] 36,928
ReLU-4 [-1, 64, 224, 224] 0
MaxPool2d-5 [-1, 64, 112, 112] 0
Conv2d-6 [-1, 128, 112, 112] 73,856
ReLU-7 [-1, 128, 112, 112] 0
Conv2d-8 [-1, 128, 112, 112] 147,584
ReLU-9 [-1, 128, 112, 112] 0
MaxPool2d-10 [-1, 128, 56, 56] 0
Conv2d-11 [-1, 256, 56, 56] 295,168
ReLU-12 [-1, 256, 56, 56] 0
Conv2d-13 [-1, 256, 56, 56] 590,080
ReLU-14 [-1, 256, 56, 56] 0
Conv2d-15 [-1, 256, 56, 56] 590,080
ReLU-16 [-1, 256, 56, 56] 0
MaxPool2d-17 [-1, 256, 28, 28] 0
Conv2d-18 [-1, 512, 28, 28] 1,180,160
ReLU-19 [-1, 512, 28, 28] 0
Conv2d-20 [-1, 512, 28, 28] 2,359,808
ReLU-21 [-1, 512, 28, 28] 0
Conv2d-22 [-1, 512, 28, 28] 2,359,808
ReLU-23 [-1, 512, 28, 28] 0
MaxPool2d-24 [-1, 512, 14, 14] 0
Conv2d-25 [-1, 512, 14, 14] 2,359,808
ReLU-26 [-1, 512, 14, 14] 0
Conv2d-27 [-1, 512, 14, 14] 2,359,808
ReLU-28 [-1, 512, 14, 14] 0
Conv2d-29 [-1, 512, 14, 14] 2,359,808
ReLU-30 [-1, 512, 14, 14] 0
================================================================
torch.Size([1, 512, 15, 15])
uj5u.com熱心網友回復:
[512, 14, 14]假設輸入影像是 ,輸出形狀應該是[3, 224, 224]。您的輸入影像大小為[3, 244, 244]. 例如,
image = torch.zeros((1,3,224,224))
# torch.Size([1, 512, 14, 14])
output = vgg16withoutLastFewLayers(image)
因此,通過增加影像大小,[W, H]輸出張量的空間大小也會增加。
uj5u.com熱心網友回復:
您輸入的形狀大小不一樣...
image = torch.zeros((1,3,244,244)).cuda()
output = vgg16withoutLastFewLayers(image)
summary(vgg16withoutLastFewLayers, (3,224,224))
print(output.shape)
差異:244 與 224。
因為那些 VGG 層只是卷積層,所以當你增加輸入影像的大小時,輸出的大小也會增加。如果在此之上直接應用分類頭(沒有全域池等),這將導致問題,因為它們具有固定大小的輸入。您沒有這樣做,但請記住這一點。
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/422915.html
標籤:
上一篇:如果您有多個神經網路,PyTorch如何知道訓練損失應傳播回哪個神經網路?
下一篇:OnSelect的對面?
