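Is there a way to remove specific neurons from a model?
For example, I have a model with a Dense layer of 512 neurons. Is there a way to delete all the neurons whose indices are in list_indices? Of course, removing a neuron affects the next layer and even the previous one.
Example:
I have this general model, which is used in several papers: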
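import functools

import tensorflow as tf


# The original snippet omitted the enclosing function; the name
# create_model and the default for only_digits are assumed here.
def create_model(only_digits=True):
    data_format = 'channels_last'
    input_shape = [28, 28, 1]
    # 2x2 max pooling with 'same' padding
    max_pool = functools.partial(
        tf.keras.layers.MaxPooling2D,
        pool_size=(2, 2),
        padding='same',
        data_format=data_format)
    # 5x5 convolution with 'same' padding and ReLU
    conv2d = functools.partial(
        tf.keras.layers.Conv2D,
        kernel_size=5,
        padding='same',
        data_format=data_format,
        activation=tf.nn.relu)
    model = tf.keras.models.Sequential([
        conv2d(filters=32, input_shape=input_shape),
        max_pool(),
        conv2d(filters=64),
        max_pool(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dense(10 if only_digits else 62),
    ])
    return model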
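Suppose I want to remove 100 neurons from the layer tf.keras.layers.Dense(512, activation=tf.nn.relu), basically turning them off.
Of course, I would end up with a new model that has the layer tf.keras.layers.Dense(412, activation=tf.nn.relu) instead of tf.keras.layers.Dense(512, activation=tf.nn.relu), but this change should also propagate to the weights of the next layer, because the connections from the removed neurons to the next layer are deleted as well.
Any suggestions on how to do this? I could do it manually as follows:
If I understand correctly, the shapes of the model's weights are: [5, 5, 1, 32], [32], [5, 5, 32, 64], [64], [3136, 512], [512], [512, 62], [62]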
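As a quick sanity check of the shapes listed above, here is a small sketch that recomputes them in plain Python. It assumes 'same' padding (a convolution keeps the spatial size, each 2x2 pooling halves it, rounding up) and only_digits=False. The helper name expected_weight_shapes is made up for illustration and is not part of the original model code.

```python
import math


def expected_weight_shapes(input_shape=(28, 28, 1), kernel_size=5,
                           conv_filters=(32, 64), dense_units=512,
                           num_classes=62):
    """Hypothetical helper: kernel/bias shapes of the CNN in the question."""
    height, width, channels = input_shape
    shapes = []
    for filters in conv_filters:
        # Conv2D kernel: (kernel_h, kernel_w, in_channels, out_channels)
        shapes.append([kernel_size, kernel_size, channels, filters])
        shapes.append([filters])  # bias
        channels = filters
        # 'same' convolution keeps height/width; 2x2 'same' pooling halves them
        height, width = math.ceil(height / 2), math.ceil(width / 2)
    flat = height * width * channels  # size of the Flatten output
    shapes += [[flat, dense_units], [dense_units],
               [dense_units, num_classes], [num_classes]]
    return shapes


shapes = expected_weight_shapes()
print(shapes)
```

The Flatten output is 7 * 7 * 64 = 3136, which matches the [3136, 512] kernel of the first Dense layer.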
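So I could do something like this:
- Generate all the indices I need and save them in list_indices
- Access the weights of the layer tf.keras.layers.Dense(512, activation=tf.nn.relu) and create a tensor with all the weights selected by list_indices
- Assign the new weight tensor to the tf.keras.layers.Dense(412, activation=tf.nn.relu) layer of the submodel
The problem is that I don't know how to get the right weights for the next layer, that is, the ones that correspond to the weight indices I just selected and that I should assign to the next layer of the submodel. I hope I have explained myself clearly.
Thanks, Leila.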
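Answer from a uj5u.com user:
What you are describing is sometimes called selective dropout. You don't actually need to create a different model each time: you only need to multiply the outputs of the selected neurons by 0, so that the next layer's input does not take those activations into account.
Note that if you "turn off" a neuron in layer Ln, it does not completely "turn off" any neuron in layer Ln+1, assuming both are fully connected (dense) layers: each neuron in layer Ln+1 is connected to all the neurons in the previous layer. In other words, removing a neuron from a fully connected (dense) layer does not change the output dimension of the next layer.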
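To illustrate the point about dimensions, here is a minimal NumPy sketch (not Keras code) of two dense layers with made-up sizes. Zeroing some activations of Ln changes the values that reach Ln+1, but not the shape of its output. All names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                            # batch of 4 inputs, 8 features
w1, b1 = rng.normal(size=(8, 6)), rng.normal(size=6)   # layer Ln: 6 neurons
w2, b2 = rng.normal(size=(6, 3)), rng.normal(size=3)   # layer Ln+1: 3 neurons

mask = np.ones((1, 6))
mask[0, [1, 4]] = 0.0                                  # "turn off" neurons 1 and 4 of Ln

hidden = np.maximum(x @ w1 + b1, 0.0)                  # ReLU activations of Ln
masked_hidden = hidden * mask                          # selected activations become 0
out_full = hidden @ w2 + b2
out_masked = masked_hidden @ w2 + b2

print(out_full.shape, out_masked.shape)                # both are (4, 3)
```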
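You can implement this quite simply with the Keras Multiply layer. The downside is that you need to learn how to use the Keras functional API. There are other ways, but they are more complex (for example, a custom layer), and the functional API is useful and powerful in many respects, so reading up on it is highly recommended!
Your model would become something like this: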
import tensorflow as tf

data_format = 'channels_last'
input_shape = [28, 28, 1]
max_pool = ...  # same definition as in the question
conv2d = ...    # same definition as in the question


# convert a list of indexes to a weight (mask) tensor
def make_index_weights(indexes, units):
    # 0.0 for the neurons to drop, 1.0 for all the others
    weights = [float(i not in indexes) for i in range(units)]
    # convert the list to a float tensor
    weights = tf.convert_to_tensor(weights, dtype=tf.float32)
    # reshape to (1, units) so it broadcasts over the batch dimension
    return tf.reshape(weights, (1, units))


# layer builder utility
def selective_dropout(units, indexes, **kwargs):
    mask = make_index_weights(indexes, units)
    dense = tf.keras.layers.Dense(units, **kwargs)
    mul = tf.keras.layers.Multiply()
    # return the tensor builder
    return lambda inputs: mul([dense(inputs), mask])


# The original snippet omitted the enclosing function; the name is assumed.
# INDEXES is the list of neuron indexes to drop, defined by you.
def create_model(only_digits=True):
    input_layer = tf.keras.layers.Input(input_shape)
    conv_1 = conv2d(filters=32, input_shape=input_shape)(input_layer)
    maxp_1 = max_pool()(conv_1)
    conv_2 = conv2d(filters=64)(maxp_1)
    maxp_2 = max_pool()(conv_2)
    flat = tf.keras.layers.Flatten()(maxp_2)
    sel_drop_1 = selective_dropout(512, INDEXES, activation=tf.nn.relu)(flat)
    dense_2 = tf.keras.layers.Dense(10 if only_digits else 62)(sel_drop_1)
    output_layer = dense_2
    model = tf.keras.models.Model([input_layer], [output_layer])
    return model
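Now you only need to build your INDEXES list from the indexes of the neurons you want to remove.
In your case the tensor has shape 1x512, because the dense layer has 512 units (neurons), so the mask needs one weight per unit. The selective_dropout function lets you pass the list of indexes to drop and builds the required tensor automatically.
For example, if you want to remove neurons 1, 10 and 12, you just pass the list [1, 10, 12] to the function, and it generates a 1x512 tensor with 0.0 at those positions and 1.0 everywhere else.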
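As a small worked example of that mask, the sketch below builds it as a plain Python list, using the same expression as make_index_weights but without TensorFlow. The variable names are illustrative.

```python
units = 512
dropped = [1, 10, 12]  # indexes of the neurons to turn off

# 0.0 at the dropped positions, 1.0 everywhere else
mask = [float(i not in dropped) for i in range(units)]

zero_positions = [i for i, value in enumerate(mask) if value == 0.0]
print(len(mask), zero_positions, sum(mask))
```

Here 509 of the 512 entries stay at 1.0, so only the three listed neurons are silenced.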
Edit:
As you mentioned, you strictly need to reduce the number of parameters in the model.
Each dense layer is described by the relation y = xW + B, where W is the kernel (or weight matrix) and B is the bias vector. W is a matrix of INPUTxOUTPUT dimensions, where INPUT is the output size of the previous layer and OUTPUT is the number of neurons/units in the layer; B is just a vector of dimension 1xOUTPUT (but we are not interested in it here).
The problem now is that you are dropping N neurons in layer Ln, and this induces the drop of NxOUTPUT weights in layer Ln+1. Let's be practical with some numbers. In your case (supposing only_digits is true) you start with:
INPUTx512 -> 512x10 (5120 weights in the second kernel)
And after dropping 100 neurons (which means dropping 100*10=1000 weights):
INPUTx412 -> 412x10 (4120 weights in the second kernel)
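The arithmetic above can be checked with a few lines of Python. This sketch also counts the kernel of the pruned layer itself, which shrinks too; it takes 3136 as the Flatten output size from the shapes listed in the question. The helper name kernel_weights is made up for illustration.

```python
flat_inputs = 3136                    # Flatten output size, from the question
units_before, units_after = 512, 412  # Dense layer before/after dropping 100 neurons
num_classes = 10                      # only_digits=True


def kernel_weights(inputs, outputs):
    """Number of weights in an INPUTxOUTPUT kernel (bias excluded)."""
    return inputs * outputs


next_before = kernel_weights(units_before, num_classes)
next_after = kernel_weights(units_after, num_classes)
own_before = kernel_weights(flat_inputs, units_before)
own_after = kernel_weights(flat_inputs, units_after)

print(next_before, next_after, next_before - next_after)  # kernel of layer Ln+1
print(own_before, own_after, own_before - own_after)      # kernel of layer Ln
```

Most of the saving comes from the pruned layer's own kernel (3136 * 100 = 313600 weights), not from the 1000 weights removed in the next layer.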
Now each column of the W matrix describes a neuron of the layer (as a vector of weights whose dimension equals the previous layer's output dimension, in our case 512 or 412). Each row of the matrix instead corresponds to a single neuron in the previous layer.
W[0,0] indicates the relation between the first neuron of layer n and the first neuron of layer n+1.
W[0,0] -> 1st of n, 1st of n+1
W[0,1] -> 1st of n, 2nd of n+1
W[1,0] -> 2nd of n, 1st of n+1
And so on. So you can just remove from this matrix all the rows that correspond to the neuron indexes you removed: index 0 -> row 0.
You can access the W matrix as a tensor from the dense layer with dense.kernel.
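The row/column bookkeeping can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the answerer's code: prune_dense_pair is a hypothetical helper, the input size 20 is arbitrary, and the arrays are random stand-ins for real kernels. It removes the dropped neurons' columns and bias entries from layer Ln and the matching rows from layer Ln+1, then checks that the smaller network gives the same output as the masked one.

```python
import numpy as np


def prune_dense_pair(w1, b1, w2, drop):
    """Hypothetical helper: remove the neurons listed in `drop` from layer Ln.

    w1: (inputs, units) kernel of Ln, b1: (units,) bias of Ln,
    w2: (units, outputs) kernel of Ln+1.
    """
    keep = np.setdiff1d(np.arange(w1.shape[1]), drop)
    # Ln loses one column and one bias entry per dropped neuron;
    # Ln+1 loses the matching rows. The bias of Ln+1 is unchanged.
    return w1[:, keep], b1[keep], w2[keep, :]


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 20))
w1, b1 = rng.normal(size=(20, 512)), rng.normal(size=512)
w2, b2 = rng.normal(size=(512, 10)), rng.normal(size=10)
drop = np.arange(100)  # illustrative choice: drop the first 100 neurons

p_w1, p_b1, p_w2 = prune_dense_pair(w1, b1, w2, drop)

# Reference: keep all 512 neurons but multiply the dropped ones by 0.
mask = np.ones(512)
mask[drop] = 0.0
masked_out = (np.maximum(x @ w1 + b1, 0.0) * mask) @ w2 + b2
pruned_out = np.maximum(x @ p_w1 + p_b1, 0.0) @ p_w2 + b2

print(p_w1.shape, p_b1.shape, p_w2.shape)   # (20, 412) (412,) (412, 10)
print(np.allclose(masked_out, pruned_out))  # True
```

In Keras, the input arrays would typically come from layer.get_weights() on the two Dense layers (which returns [kernel, bias]), and the pruned arrays could be assigned with set_weights on the Dense(412) layer and the following layer of a newly built model.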
