How does TensorFlow handle the training data passed to a neural network?

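I have a problem with code that I modified from https://keras.io/examples/generation/wgan_gp/. My data are not images but a (1001, 2) array of sequential data: the first column is time and the second column is a velocity measurement. I get this error: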

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_14704/3651127346.py in <module>
     21 # Training the WGAN-GP model
     22 tic = time.perf_counter()
---> 23 WGAN.fit(dataset, batch_size=batch_Size, epochs=n_epochs, callbacks=[cbk])
     24 toc = time.perf_counter()
     25 time_elapsed(toc-tic)

~\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

~\Anaconda3\lib\site-packages\tensorflow\python\framework\func_graph.py in autograph_handler(*args, **kwargs)
   1145           except Exception as e:  # pylint:disable=broad-except
   1146             if hasattr(e, "ag_error_metadata"):
-> 1147               raise e.ag_error_metadata.to_exception(e)
   1148             else:
   1149               raise

ValueError: in user code:

    File "C:\Users\sissonn\Anaconda3\lib\site-packages\keras\engine\training.py", line 1021, in train_function  *
        return step_function(self, iterator)
    File "C:\Users\sissonn\Anaconda3\lib\site-packages\keras\engine\training.py", line 1010, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\sissonn\Anaconda3\lib\site-packages\keras\engine\training.py", line 1000, in run_step  **
        outputs = model.train_step(data)
    File "C:\Users\sissonn\AppData\Local\Temp/ipykernel_14704/3074469771.py", line 141, in train_step
        gp = self.gradient_penalty(batch_size, x_real, x_fake)
    File "C:\Users\sissonn\AppData\Local\Temp/ipykernel_14704/3074469771.py", line 106, in gradient_penalty
        alpha = tf.random.uniform(batch_size,1,1)

    ValueError: Shape must be rank 1 but is rank 0 for '{{node random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, dtype=DT_FLOAT, seed=0, seed2=0](strided_slice)' with input shapes: [].

Here is my code:

import time
from tqdm.notebook import tqdm
import os

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input

import numpy as np
import matplotlib.pyplot as plt

def define_generator(latent_dim):
    # This function creates the generator model using the functional API.
    
    # Layers...
    # Input Layer
    inputs = Input(shape=latent_dim, name='INPUT_LAYER')
    # 1st hidden layer
    x = Dense(50, activation='relu', name='HIDDEN_LAYER_1')(inputs)
    # 2nd hidden layer
    x = Dense(150, activation='relu', name='HIDDEN_LAYER_2')(x)
    # 3rd hidden layer
    x = Dense(300, activation='relu', name='HIDDEN_LAYER_3')(x)
    # 4th hidden layer
    x = Dense(150, activation='relu', name='HIDDEN_LAYER_4')(x)
    # 5th hidden layer
    x = Dense(50, activation='relu', name='HIDDEN_LAYER_5')(x)
    # Output layer
    outputs = Dense(2, activation='linear', name='OUTPUT_LAYER')(x)
    # Instantiating the generator model
    model = Model(inputs=inputs, outputs=outputs, name='GENERATOR')
    
    return model


def generator_loss(fake_logits):
    # This function calculates and returns the WGAN-GP generator loss.
    
    # Expected value of critic output from fake data
    expectation_fake = tf.reduce_mean(fake_logits)
    # Loss to minimize
    loss = -expectation_fake
    
    return loss


def define_critic():
    # This function creates the critic model using the functional API.
    
    # Layers...
    # Input Layer
    inputs = Input(shape=2, name='INPUT_LAYER')
    # 1st hidden layer
    x = Dense(50, activation='relu', name='HIDDEN_LAYER_1')(inputs)
    # 2nd hidden layer
    x = Dense(150, activation='relu', name='HIDDEN_LAYER_2')(x)
    # 3rd hidden layer
    x = Dense(300, activation='relu', name='HIDDEN_LAYER_3')(x)
    # 4th hidden layer
    x = Dense(150, activation='relu', name='HIDDEN_LAYER_4')(x)
    # 5th hidden layer
    x = Dense(50, activation='relu', name='HIDDEN_LAYER_5')(x)
    # Output layer
    outputs = Dense(1, activation='linear', name='OUTPUT_LAYER')(x)
    # Instantiating the critic model
    model = Model(inputs=inputs, outputs=outputs, name='CRITIC')
    
    return model


def critic_loss(real_logits, fake_logits):
    # This function calculates and returns the WGAN-GP critic loss.
    
    # Expected value of critic output from real images
    expectation_real = tf.reduce_mean(real_logits)
    # Expected value of critic output from fake images
    expectation_fake = tf.reduce_mean(fake_logits)
    # Loss to minimize
    loss = expectation_fake - expectation_real
    
    return loss


class define_wgan(keras.Model):
    # This class creates the WGAN-GP object.
    # Attributes:
    #     critic = the critic model.
    #     generator = the generator model.
    #     latent_dim = defines generator input dimension.
    #     critic_steps = defines how many times the discriminator gets trained for each training cycle.
    #     gp_weight = weight applied to the gradient penalty term in the critic loss.
    # Methods:
    #     compile() = defines the optimizer and loss function of both the critic and generator.
    #     gradient_penalty() = calculates and returns the gradient penalty term in the WGAN-GP loss function.
    #     train_step() = performs the WGAN-GP training by updating the critic and generator weights
    #         and returns the loss for both. Called by fit().
     
    def __init__(self, gen, critic, latent_dim, n_critic_train, gp_weight):
        super().__init__()
        self.critic = critic
        self.generator = gen
        self.latent_dim = latent_dim
        self.critic_steps = n_critic_train
        self.gp_weight = gp_weight
    
    def compile(self, generator_loss, critic_loss):
        super().compile()
        self.generator_optimizer = keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5, beta_2=0.9)
        self.critic_optimizer = keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5, beta_2=0.9)
        self.generator_loss_function = generator_loss
        self.critic_loss_function = critic_loss
        
    def gradient_penalty(self, batch_size, x_real, x_fake):
        
        # Random uniform samples of points between distribution.
        # "alpha" must be a tensor so that "x_interp" will also be a tensor.
        alpha = tf.random.uniform(batch_size,1,1)
        # Data interpolated between real and fake distributions
        x_interp = alpha*x_real + (1-alpha)*x_fake
        # Calculating critic output gradient wrt interpolated data
        with tf.GradientTape() as gp_tape:
            gp_tape.watch(x_interp)
            critic_output = self.critic(x_interp, training=True)
        grad = gp_tape.gradient(critic_output, x_interp)
        # Calculating norm of gradient for each sample
        grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grad), axis=1))
        # calculating gradient penalty
        gp = tf.reduce_mean((grad_norm - 1.0)**2)
        
        return gp

    def train_step(self, x_real):
        # Critic training
        # Getting batch size for creating latent vectors
        print(x_real)
        batch_size = tf.shape(x_real)[0]
        print(batch_size)
        # Critic training loop
        for i in range(self.critic_steps):
            # Generating latent vectors
            latent = tf.random.normal(shape=(batch_size, self.latent_dim))
            with tf.GradientTape() as tape:
                # Obtaining fake data from generator
                x_fake = self.generator(latent, training=True)
                # Critic output from fake data
                fake_logits = self.critic(x_fake, training=True)
                # Critic output from real data
                real_logits = self.critic(x_real, training=True)
                # Calculating critic loss
                c_loss = self.critic_loss_function(real_logits, fake_logits)
                # Calculating gradient penalty
                gp = self.gradient_penalty(batch_size, x_real, x_fake)
                # Adjusting critic loss with gradient penalty
                c_loss = c_loss + self.gp_weight*gp
            # Calculating gradient of critic loss wrt critic weights
            critic_grad = tape.gradient(c_loss, self.critic.trainable_variables)
            # Updating critic weights
            self.critic_optimizer.apply_gradients(zip(critic_grad, self.critic.trainable_variables))
        # Generator training
        # Generating latent vectors
        latent = tf.random.normal(shape=(batch_size, self.latent_dim))
        with tf.GradientTape() as tape:
            # Obtaining fake data from generator
            x_fake = self.generator(latent, training=True)
            # Critic output from fake data
            fake_logits = self.critic(x_fake, training=True)
            # Calculating generator loss
            g_loss = self.generator_loss_function(fake_logits)
        # Calculating gradient of generator loss wrt generator weights
        generator_grad = tape.gradient(g_loss, self.generator.trainable_variables)
        # Updating generator weights
        self.generator_optimizer.apply_gradients(zip(generator_grad, self.generator.trainable_variables))
        
        # Keras expects train_step() to return a dict of named metrics
        return {'g_loss': g_loss, 'c_loss': c_loss}

class GAN_monitor(keras.callbacks.Callback):
    def __init__(self, n_samples, latent_dim):
        self.n_samples = n_samples
        self.latent_dim = latent_dim

    def on_epoch_end(self, epoch, logs=None):
        latent = tf.random.normal(shape=(self.n_samples, self.latent_dim))
        generated_data = self.model.generator(latent)
        plt.plot(generated_data)
        plt.savefig('Epoch _' + str(epoch) + '.png', dpi=300)

data = np.genfromtxt('Flight_1.dat', dtype='float', encoding=None, delimiter=',')[0:1001,0]
time_span = np.linspace(0,20,1001)
dataset = np.concatenate((time_span[:,np.newaxis], data[:,np.newaxis]), axis=1)
dataset.shape

# Training Parameters
latent_dim = 100
n_epochs = 10
n_critic_train = 5
gp_weight = 10
batch_Size = 100

# Instantiating the generator and discriminator models
gen = define_generator(latent_dim)
critic = define_critic()

# Instantiating the WGAN-GP object
WGAN = define_wgan(gen, critic, latent_dim, n_critic_train, gp_weight)

# Compiling the WGAN-GP model
WGAN.compile(generator_loss, critic_loss)

# Instantiating custom Keras callback
cbk = GAN_monitor(n_samples=1, latent_dim=latent_dim)

# Training the WGAN-GP model
tic = time.perf_counter()
WGAN.fit(dataset, batch_size=batch_Size, epochs=n_epochs, callbacks=[cbk])
toc = time.perf_counter()
time_elapsed(toc-tic)

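The problem is the shape I am passing to tf.random.uniform() to create alpha. I do not fully understand why the shape argument in the Keras example is (batch_size, 1, 1, 1), so I do not know how to specify the shape for my own case. I also do not understand this line in the Keras example: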

batch_size = tf.shape(real_images)[0]

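In that example, "real_images" is a (60000, 28, 28, 1) array that is passed to the fit() method, which in turn passes it to the train_step() method. (It is passed in as "train_images", but they are the same variable.) If I add a line before this tf.shape() call that prints "real_images", this is the result: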

Tensor("IteratorGetNext:0", shape=(None, 28, 28, 1), dtype=float32)

Why is the 60000 gone? I then added a line after the tf.shape() call that prints "batch_size", and this is the result:

Tensor("strided_slice:0", shape=(), dtype=int32)

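I searched for "tf strided_slice", but all I could find was the function tf.strided_slice(). So what exactly is the value of "batch_size", and why is the printed output of variables so ambiguous when they are tensors? In fact, when I enter: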

tf.shape(train_images)[0]

in another cell of the Jupyter notebook, I get a completely different output:

<tf.Tensor: shape=(), dtype=int32, numpy=60000>

I really need to understand this Keras example in order to adapt the code to my data successfully. Any help is appreciated.

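By the way: I am using only one set of data for now, but once I get the GAN running I will feed it multiple (1001, 2) datasets. Also, if you want to test the code yourself, replacing the "dataset" variable with any (1001, 2) NumPy array is sufficient. Thank you.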

A uj5u.com user replied:

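"Why is the 60000 now None?": When a TensorFlow model is defined, the first dimension (the batch size) is None. What TensorFlow does internally, and how it uses graphs for computation, can be quite complicated. For now you only need to know that the batch size does not have to be specified when the model is defined, hence the None. This is essential because it lets the model be defined once and then trained on, and applied to, datasets with any number of examples. For instance, during training you might feed the model batches of 256 images at a time, but when using the trained model for inference you will most likely want the input to be a single image. The actual value of the first dimension of the input therefore only matters once the computation starts.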
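The difference between the two printed outputs in the question can be reproduced outside Keras. The following is only a minimal sketch: the function name `step`, the zero-filled stand-in array, and the explicit `input_signature` are assumptions chosen for illustration, and Keras builds its own training function internally rather than this one. Inside a traced function the batch is a symbolic placeholder with an unknown first dimension, and `tf.shape(...)[0]` is a graph node (hence the `strided_slice` name) whose value is filled in only when the graph runs.

```python
import numpy as np
import tensorflow as tf

# Stand-in for the (1001, 2) training array; the values are irrelevant here.
data = np.zeros((1001, 2), dtype="float32")
dataset = tf.data.Dataset.from_tensor_slices(data).batch(100)

# The signature leaves the batch dimension open (None), so one graph
# serves batches of any size.
@tf.function(input_signature=[tf.TensorSpec(shape=(None, 2), dtype=tf.float32)])
def step(batch):
    # Python prints run once, while the graph is being traced.
    print("static shape while tracing:", batch.shape)
    print("symbolic batch size:", tf.shape(batch)[0])
    # tf.shape() is evaluated each time the graph actually runs.
    return tf.shape(batch)[0]

# Outside the traced function the results are eager tensors with real values.
sizes = [int(step(batch)) for batch in dataset]
print(sizes)
```

Calling `tf.shape(train_images)[0]` directly in a notebook cell runs eagerly on a concrete array, which is why it reports an actual number instead of a symbolic tensor.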

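"I do not fully understand why the shape argument in the Keras example is (batch_size, 1, 1, 1)": The reason for this shape is that you want a different random value alpha for each image. There are batch_size images, so the first dimension is batch_size, but alpha is just a single value per image in tensor form, so every other dimension only needs size 1. It has four dimensions in total so that it can be combined with the inputs, which are 4-D image tensors; for color images with 3 RGB channels, for example, these have a shape like (batch_size, img_h, img_w, 3).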
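To make the broadcasting concrete, here is a small sketch with made-up inputs: the batch size of 4 and the all-ones and all-zeros "images" are assumptions picked so the result is easy to inspect. Because the real images are all ones and the fake images all zeros, each interpolated image ends up filled with its own alpha value.

```python
import tensorflow as tf

batch_size = 4  # illustrative value only
real = tf.ones((batch_size, 28, 28, 1))
fake = tf.zeros((batch_size, 28, 28, 1))

# One random number per image; the size-1 dimensions broadcast
# over height, width and channels.
alpha = tf.random.uniform([batch_size, 1, 1, 1])
interp = alpha * real + (1.0 - alpha) * fake

# Every pixel of image i equals alpha[i], so min and max agree per image.
per_image_min = tf.reduce_min(interp, axis=[1, 2, 3])
per_image_max = tf.reduce_max(interp, axis=[1, 2, 3])
print(interp.shape)
print(per_image_min.numpy(), per_image_max.numpy())
```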

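As for your error, Shape must be rank 1 but is rank 0: it means that the function you are using, tf.random.uniform, expects a rank-1 tensor, i.e. something with one dimension, but is being passed a rank-0 tensor, i.e. a scalar value. From your code, you are probably passing it just the value batch_size rather than a tensor. This might work: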

alpha = tf.random.uniform([batch_size, 1, 1, 1])

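The first argument of this function is its shape, so the [] is important. Note that this four-dimensional shape matches the Keras image example; for your (batch_size, 2) data, a shape of [batch_size, 1] lets alpha broadcast across the two columns instead. Check the documentation for this function to make sure you are using it correctly: https://www.tensorflow.org/api_docs/python/tf/random/uniform.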
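For two-column data like that in the question, a gradient penalty could look roughly like the sketch below. It is written under assumptions and is not the Keras example itself: the standalone function signature, the one-layer toy critic, the random inputs, and the small constant added inside the square root are all illustrative choices. In the original call `tf.random.uniform(batch_size,1,1)`, the scalar batch_size was taken as the shape and the two 1s as minval and maxval, which is what triggers the rank error.

```python
import tensorflow as tf

def gradient_penalty(critic, x_real, x_fake):
    """WGAN-GP penalty for 2-D inputs of shape (batch_size, n_features)."""
    batch_size = tf.shape(x_real)[0]
    # The shape is passed as a list; one alpha per row, broadcast over columns.
    alpha = tf.random.uniform([batch_size, 1], 0.0, 1.0)
    x_interp = alpha * x_real + (1.0 - alpha) * x_fake
    with tf.GradientTape() as gp_tape:
        gp_tape.watch(x_interp)
        critic_output = critic(x_interp, training=True)
    # Gradient of the critic output with respect to the interpolated rows.
    grad = gp_tape.gradient(critic_output, x_interp)
    # Per-row L2 norm; the epsilon (an assumption) avoids sqrt(0) gradients.
    grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grad), axis=1) + 1e-12)
    return tf.reduce_mean((grad_norm - 1.0) ** 2)

# Toy critic and random data, for illustration only.
critic = tf.keras.Sequential([tf.keras.layers.Dense(1)])
x_real = tf.random.normal((8, 2))
x_fake = tf.random.normal((8, 2))
gp = gradient_penalty(critic, x_real, x_fake)
print(float(gp))
```

Because the batch size is read with `tf.shape()` inside the function, the same code works when the final batch of an epoch is smaller than the others.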
