3D convolutional autoencoder does not return the correct output shape

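I am trying to use an autoencoder on spatiotemporal data. My data has the shape (batches, filters, timesteps, rows, columns). I am having trouble getting the autoencoder to produce the correct output shape.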

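Here is my model:

from tensorflow.keras.layers import Input, Conv3D, MaxPooling3D, UpSampling3D
from tensorflow.keras.models import Model

input_imag = Input(shape=(3, 81, 4, 4))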

x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(input_imag)
x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
encoded = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same', name='encoder')(x)

x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(encoded)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
decoded = Conv3D(3, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)

autoencoder = Model(input_imag, decoded)
autoencoder.compile(optimizer='adam', loss='mse')

autoencoder.summary()

Here is the summary:

Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 3, 81, 4, 4)]     0
_________________________________________________________________
conv3d (Conv3D)              (None, 16, 81, 4, 4)      2176
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 16, 27, 2, 2)      0
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 8, 27, 2, 2)       5768
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 8, 9, 1, 1)        0
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 4, 9, 1, 1)        1444
_________________________________________________________________
encoder (MaxPooling3D)       (None, 4, 3, 1, 1)        0
_________________________________________________________________
conv3d_3 (Conv3D)            (None, 4, 3, 1, 1)        724
_________________________________________________________________
up_sampling3d (UpSampling3D) (None, 4, 9, 2, 2)        0
_________________________________________________________________
conv3d_4 (Conv3D)            (None, 8, 9, 2, 2)        1448
_________________________________________________________________
up_sampling3d_1 (UpSampling3 (None, 8, 27, 4, 4)       0
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 16, 27, 4, 4)      5776
_________________________________________________________________
up_sampling3d_2 (UpSampling3 (None, 16, 81, 8, 8)      0
_________________________________________________________________
conv3d_6 (Conv3D)            (None, 3, 81, 8, 8)       2163
=================================================================
Total params: 19,499
Trainable params: 19,499
Non-trainable params: 0

What should I change so that the decoder output shape is [?, 3, 81, 4, 4] instead of [?, 3, 81, 8, 8]?

Answer:

It looks like you want the MaxPooling3D and UpSampling3D operations to be symmetric (at least in terms of output shape). Let's look at the input shape of the last MaxPooling3D layer:

conv3d_2 (Conv3D)            (None, 4, 9, 1, 1)        1444
_________________________________________________________________
encoder (MaxPooling3D)       (None, 4, 3, 1, 1)        0

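The input shape is (None, 4, 9, 1, 1). The last two dimensions are already 1, so they cannot be divided by the 2 in pool_size. So the MaxPooling3D layer, despite having pool_size=(3, 2, 2), effectively behaves as if pool_size=(3, 1, 1): with padding='same' the output size along each axis is ceil(input / stride), so an axis of size 1 stays at 1. At least, I think that is what happens behind the scenes.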

I am a bit surprised that there is no error or warning when the specified pool_size is larger than the input size.
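The shape arithmetic behind this can be checked without building the model. The following is a minimal sketch in plain Python, assuming that pooling with padding='same' and strides equal to pool_size yields ceil(input / pool) per axis and that UpSampling3D multiplies each axis by its size. The helper names pool_same and upsample are hypothetical and only used for illustration; the channel axis is omitted.

```python
import math

def pool_same(shape, pool_size):
    """Per-axis output size for pooling with padding='same', strides == pool_size."""
    return tuple(math.ceil(dim / p) for dim, p in zip(shape, pool_size))

def upsample(shape, size):
    """UpSampling3D repeats each axis `size` times, so sizes multiply."""
    return tuple(dim * s for dim, s in zip(shape, size))

# (timesteps, rows, columns) of the input; channels are left out.
shape = (81, 4, 4)
encoder_shapes = [shape]
for _ in range(3):
    shape = pool_same(shape, (3, 2, 2))
    encoder_shapes.append(shape)
print(encoder_shapes)  # [(81, 4, 4), (27, 2, 2), (9, 1, 1), (3, 1, 1)]

# Three upsampling steps of (3, 2, 2), as in the original decoder.
for _ in range(3):
    shape = upsample(shape, (3, 2, 2))
print(shape)  # (81, 8, 8)
```

The last pooling step leaves rows and columns at 1, while every upsampling step doubles them, which is where the extra factor of 2 in the decoder output comes from.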

To fix this, set the size of the first UpSampling3D layer to (3, 1, 1):

x = UpSampling3D((3, 1, 1), data_format='channels_first')(x)
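One way to see why only the first UpSampling3D layer changes is to derive the decoder factors from the encoder shapes in the summary. This is a rough sketch, not part of Keras: the function mirrored_upsampling_sizes is a hypothetical helper, and it assumes each pooled axis divides its input evenly.

```python
def mirrored_upsampling_sizes(encoder_shapes):
    """Upsampling factors that undo each pooling stage, listed in decoder order."""
    sizes = []
    for before, after in zip(encoder_shapes, encoder_shapes[1:]):
        # An integer factor only exists if `before` is a multiple of `after`.
        if any(b % a for b, a in zip(before, after)):
            raise ValueError(f"{before} is not an integer multiple of {after}")
        sizes.append(tuple(b // a for b, a in zip(before, after)))
    # The decoder undoes the last pooling stage first.
    return sizes[::-1]

# (timesteps, rows, columns) before and after each MaxPooling3D layer,
# taken from the model summary; channels are left out.
encoder_shapes = [(81, 4, 4), (27, 2, 2), (9, 1, 1), (3, 1, 1)]
print(mirrored_upsampling_sizes(encoder_shapes))
# [(3, 1, 1), (3, 2, 2), (3, 2, 2)]
```

The first entry corresponds to the UpSampling3D layer right after the encoder, which matches the (3, 1, 1) used here.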

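So, the complete solution:

from tensorflow.keras.layers import Input, Conv3D, MaxPooling3D, UpSampling3D
from tensorflow.keras.models import Model

input_imag = Input(shape=(3, 81, 4, 4))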

x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(input_imag)
x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same')(x)
x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
encoded = MaxPooling3D((3, 2, 2), data_format='channels_first', padding='same', name='encoder')(x)

x = Conv3D(4, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(encoded)
x = UpSampling3D((3, 1, 1), data_format='channels_first')(x)
x = Conv3D(8, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
x = Conv3D(16, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)
x = UpSampling3D((3, 2, 2), data_format='channels_first')(x)
decoded = Conv3D(3, (5, 3, 3), data_format='channels_first', activation='relu', padding='same')(x)

autoencoder = Model(input_imag, decoded)
autoencoder.compile(optimizer='adam', loss='mse')

autoencoder.summary()

Output:

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_3 (InputLayer)        [(None, 3, 81, 4, 4)]     0         
                                                                 
 conv3d_14 (Conv3D)          (None, 16, 81, 4, 4)      2176      
                                                                 
 max_pooling3d_4 (MaxPooling  (None, 16, 27, 2, 2)     0         
 3D)                                                             
                                                                 
 conv3d_15 (Conv3D)          (None, 8, 27, 2, 2)       5768      
                                                                 
 max_pooling3d_5 (MaxPooling  (None, 8, 9, 1, 1)       0         
 3D)                                                             
                                                                 
 conv3d_16 (Conv3D)          (None, 4, 9, 1, 1)        1444      
                                                                 
 encoder (MaxPooling3D)      (None, 4, 3, 1, 1)        0         
                                                                 
 conv3d_17 (Conv3D)          (None, 4, 3, 1, 1)        724       
                                                                 
 up_sampling3d_6 (UpSampling  (None, 4, 9, 1, 1)       0         
 3D)                                                             
                                                                 
 conv3d_18 (Conv3D)          (None, 8, 9, 1, 1)        1448      
                                                                 
 up_sampling3d_7 (UpSampling  (None, 8, 27, 2, 2)      0         
 3D)                                                             
                                                                 
 conv3d_19 (Conv3D)          (None, 16, 27, 2, 2)      5776      
                                                                 
 up_sampling3d_8 (UpSampling  (None, 16, 81, 4, 4)     0         
 3D)                                                             
                                                                 
 conv3d_20 (Conv3D)          (None, 3, 81, 4, 4)       2163      
                                                                 
=================================================================
Total params: 19,499
Trainable params: 19,499
Non-trainable params: 0
