Second derivative in TensorFlow automatic differentiation is None

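In the code below, I compute the second derivative y_xx_lin of a linear network (modelLinear), which uses linear activation functions throughout, and the second derivative y_xx_tanh of a network (modelTanh) that uses tanh activations.

My problem: y_xx_lin is None, but y_xx_tanh contains values. Based on this Stack Overflow question, I guess y_xx_lin is None because the second derivative of a linear function is zero for all input values, so in a sense it does not depend on the input. Is that right?

Even if that is the case, I would like TensorFlow to compute the derivative and return it rather than return None. Is this possible?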

# Second derivative of a linear network appears to be None

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import MeanSquaredError
import tensorflow.keras.backend as K
import numpy as np
import matplotlib.pyplot as plt

def build_network(activation='linear'):
    input_layer  = Input(shape=(1,))
    inner_layer  = Dense(6, activation=activation)(input_layer)
    inner_layer1 = Dense(6, activation=activation)(inner_layer)
    inner_layer2 = Dense(6, activation=activation)(inner_layer1)
    output_layer = Dense(1, activation='linear')(inner_layer2)
    model = Model(input_layer, output_layer)
    return model

def get_first_second_derivative(X_train,y_train,model):
    with tf.GradientTape(persistent=True) as tape_second:
        tape_second.watch(X_train)
        
        with tf.GradientTape(persistent=True) as tape_first:
            # Watch the input with respect to which we want the gradients
            tape_first.watch(X_train)
    
            # get the output of the NN
            output = model(X_train)
    
        y_x  = tape_first.gradient(output,X_train)

    y_xx = tape_second.gradient(y_x,X_train)
    
    return y_x,y_xx

modelLinear = build_network(activation='linear')
modelLinear.compile(optimizer=Adam(learning_rate=0.1),loss='mse')

modelTanh = build_network(activation='tanh')
modelTanh.compile(optimizer=Adam(learning_rate=0.1),loss='mse')

X_train = np.linspace(-1,1,10).reshape((-1,1))
y_train = X_train*X_train

X_train = tf.convert_to_tensor(X_train,dtype=tf.float64)
y_train = tf.convert_to_tensor(y_train,dtype=tf.float64)

y_x_lin,y_xx_lin   = get_first_second_derivative(X_train,y_train,modelLinear)
y_x_tanh,y_xx_tanh = get_first_second_derivative(X_train,y_train,modelTanh)

print('Type of y_xx_lin = ',type(y_xx_lin))

Answer:

It works if you set the activation to lambda x: x ** 1 instead of 'linear', like this:

...

id_func = lambda x: x ** 1

def build_network(activation=id_func):
    input_layer  = Input(shape=(1,))
    inner_layer  = Dense(6, activation=activation)(input_layer)
    inner_layer1 = Dense(6, activation=activation)(inner_layer)
    inner_layer2 = Dense(6, activation=activation)(inner_layer1)
    output_layer = Dense(1, activation=id_func)(inner_layer2)
    model = Model(input_layer, output_layer)
    return model

...

modelLinear = build_network(activation=id_func)

...

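The reason this works, and the reason your code fails, is given in the answer you already referenced. With this odd implementation of the identity function, TensorFlow's backpropagation works as expected.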
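To make the difference concrete, here is a minimal sketch that swaps the Keras model for a single matrix product. The toy values of x and w and the helper second_derivative are made up for illustration and are not part of the question. The idea, as far as I can tell, is that the gradient of x ** 1 is built from ops that still involve x, so the first derivative stays connected to the input on the outer tape, whereas a plain identity leaves nothing for the outer tape to follow.

```python
import tensorflow as tf

# Hypothetical toy data (not from the question): three inputs, one fixed weight.
x = tf.constant([[0.5], [1.0], [2.0]], dtype=tf.float64)
w = tf.constant([[3.0]], dtype=tf.float64)

def second_derivative(activation):
    """Return d2y/dx2 for y = activation(x @ w) using nested tapes."""
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape() as inner:
            inner.watch(x)
            y = activation(tf.matmul(x, w))
        # Taken inside the outer tape so the gradient ops themselves are recorded.
        y_x = inner.gradient(y, x)
    return outer.gradient(y_x, x)

# Plain identity: the first derivative is built only from constants.
plain_identity = second_derivative(lambda z: z)
# Power-of-one identity: the first derivative is built from ops that take z as input.
pow_identity = second_derivative(lambda z: z ** 1)

print("identity :", plain_identity)
print("x ** 1   :", pow_identity)
```

With the plain identity the result should be None, and with x ** 1 it should be a tensor of zeros with the same shape as x. The inputs are chosen so that no pre-activation is exactly zero, because the gradient of a power evaluates x to a negative exponent along the way.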

Tested with TensorFlow 2.9.2.
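A possible alternative that leaves the activation untouched is the unconnected_gradients argument of tf.GradientTape.gradient, which asks for zeros instead of None when the target does not depend on the source. The sketch below uses a hypothetical one-weight linear map rather than the question's Keras model, so treat it as an illustration of the argument, not a drop-in fix.

```python
import tensorflow as tf

# Hypothetical toy setup (not the question's model): y = x @ w is linear in x.
x = tf.constant([[0.5], [1.0], [2.0]], dtype=tf.float64)
w = tf.constant([[3.0]], dtype=tf.float64)

with tf.GradientTape() as outer:
    outer.watch(x)
    with tf.GradientTape() as inner:
        inner.watch(x)
        y = tf.matmul(x, w)
    y_x = inner.gradient(y, x)

# Request zeros rather than None when y_x is not connected to x.
y_xx = outer.gradient(
    y_x, x, unconnected_gradients=tf.UnconnectedGradients.ZERO
)

print("y_x  =", y_x.numpy().ravel())
print("y_xx =", y_xx.numpy().ravel())
```

In get_first_second_derivative from the question, the same argument could presumably be passed to the tape_second.gradient call. Note that it also hides real mistakes, since a source that was never watched would silently yield zeros as well.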

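Answer:

If you want to compute the deviation of the input as a series (I see your question and your intent, but you can do this with a model layer, as in the code example below; adjust it as needed):

Example: << the result reflects how fast the input grows >>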

import tensorflow as tf

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Class / Definition
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
class MyLSTMLayer( tf.keras.layers.LSTM ):
    def __init__(self, units, return_sequences, return_state):
        super(MyLSTMLayer, self).__init__( units, return_sequences=True, return_state=False )
        self.num_units = units

    def build(self, input_shape):
        self.kernel = self.add_weight("kernel",
        shape=[int(input_shape[-1]),
        self.num_units])

    def call(self, inputs):
        derivative_number = tf.constant([ 2.0 ])
        
        ZeroPadding1D_front = tf.keras.layers.ZeroPadding1D(padding=( 1, 0 ))
        ZeroPadding1D_back = tf.keras.layers.ZeroPadding1D(padding=( 0, 1 ))

        reshape = tf.reshape( inputs, shape=(1, 1024, 1), name="Reshape" )
        subtract = tf.math.subtract( ZeroPadding1D_front( reshape ), ZeroPadding1D_back( reshape ), name="Subtract" )
        divide = tf.math.divide_no_nan( subtract, derivative_number, name="Divide" )

        # X = [ 1, 2, 3, 4, 5 ], Y = 2
        # front-padded X = [ 0, 1, 2, 3, 4, 5 ], back-padded X = [ 1, 2, 3, 4, 5, 0 ]
        # ( front - back ) / Y = [ -0.5, -0.5, -0.5, -0.5, -0.5, 2.5 ]

        return divide
        
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Variables
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
start = 3
limit = 3075
delta = 3
sample = tf.range( start, limit, delta )
sample = tf.cast( sample, dtype=tf.float32 )
sample = tf.constant( sample, shape=( 1, 1, 1024 ), dtype=tf.float32 )
layer = MyLSTMLayer( 1024, True, False )

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1, 1024)),
    layer,
])

model.summary()

print( "Sample: " )
print( sample )
print( "Predict: " )
print( model.predict(sample) )

Output:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 my_lstm_layer (MyLSTMLayer)  (1, 1025, 1)             1048576

=================================================================
Total params: 1,048,576
Trainable params: 1,048,576
Non-trainable params: 0
_________________________________________________________________
Sample:
tf.Tensor([[[3.000e+00 6.000e+00 9.000e+00 ... 3.066e+03 3.069e+03 3.072e+03]]], shape=(1, 1, 1024), dtype=float32)
Predict:
1/1 [==============================] - 0s 69ms/step
[[[-1.500e+00]
  [-1.500e+00]
  [-1.500e+00]
  ...
  [-1.500e+00]
  [-1.500e+00]
  [ 1.536e+03]]]
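Stripped of the Keras machinery, the layer's call() appears to compute a shifted difference of the input divided by two. A minimal pure-Python sketch of the same arithmetic follows; the helper name padded_difference is made up for illustration.

```python
def padded_difference(values, divisor=2.0):
    """(front-padded - back-padded) / divisor, mirroring the layer's call()."""
    front = [0.0] + list(values)  # like ZeroPadding1D(padding=(1, 0))
    back = list(values) + [0.0]   # like ZeroPadding1D(padding=(0, 1))
    return [(f - b) / divisor for f, b in zip(front, back)]

# Same sample as above: 3, 6, ..., 3072 (1024 values).
sample = [float(v) for v in range(3, 3075, 3)]
result = padded_difference(sample)

print(len(result), result[0], result[1], result[-1])
# prints: 1025 -1.5 -1.5 1536.0
```

This reproduces the printed prediction: every entry is -1.5 except the last, which is 3072 / 2 = 1536 because the final value is subtracted from the zero padding. It is a finite difference over the input series, not a derivative obtained through automatic differentiation.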
