TensorFlow fit ValueError: Shape mismatch: the shape of labels (received (16640,)) should equal the shape of logits except for the last dimension

I created a tf dataset from tokenized text that was converted to sequences and then to a NumPy array

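from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(bible_text)  # builds the word index
sequences = tokenizer.texts_to_sequences(bible_text)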

##-->[[5, 1, 914, 32, 1352, 1, 214, 2, 1, 111],
## [2, 1, 111, 31, 252, 2091, 2, 1874, 2, 547, 31, 38, 1, 196, 3, 1, 899, 2, 1, 298, 3, 32, 878, 38, 1, 196, 3, 1, 266],
## [2, 32, 33, 79, 54, 16, 369, 2, 54, 31, 369], [2, 32, 215, 1, 369, 6, 17, 31, 156, 2, 32, 955, 1, 369, 34, 1, 547], ...]

sequences=pad_sequences(sequences, padding='post')

##-->[[   5    1  914   32 1352    1  214    2    1  111    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0]
##...]

word_index=tokenizer.word_index 

##for k,v in sorted(word_index.items(), key=operator.itemgetter(1))[:10]:
##   print (k,v)

##--> the 1
##and 2
##of 3
##to 4
##in 5
##that 6
##shall 7
##he 8
##lord 9
##his 10
##
##[...]

vocab_size = len(tokenizer.word_index) + 1  # +1 because index 0 is reserved for padding

Building the input and target sequences

input_sequences, target_sequences = sequences[:,:-1], sequences[:,1:]
seq_length=input_sequences.shape[1] ##-->89
num_verses=input_sequences.shape[0]

input_sequences=np.array(input_sequences)
target_sequences=np.array(target_sequences)
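
To make the slicing above concrete, here is a small sketch on a made-up padded array (the values are illustrative, not taken from the corpus). Each target row is the input row shifted one position to the left, so the model is asked to predict token t+1 from the tokens up to t.

```python
import numpy as np

# Toy padded sequences (hypothetical values; 0 is the padding id).
toy_sequences = np.array([
    [5, 1, 914, 32, 0],
    [2, 1, 111, 0, 0],
])

# Same slicing as above: drop the last column for inputs, the first for targets.
toy_inputs, toy_targets = toy_sequences[:, :-1], toy_sequences[:, 1:]

print(toy_inputs)   # tokens 0 .. n-2 of each row
print(toy_targets)  # tokens 1 .. n-1 of each row
print(toy_inputs.shape, toy_targets.shape)
```

Inputs and targets have the same shape, which is why the model has to emit one prediction per time step rather than one per sequence.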

And the dataset

dataset= tf.data.Dataset.from_tensor_slices((input_sequences, target_sequences))

Nothing in this dataset setup looks particularly wrong to me. Here I define the model

EPOCHS=2
BATCH_SIZE=256
VAL_FRAC=0.2  
LSTM_UNITS=1024
DENSE_UNITS=vocab_size
EMBEDDING_DIM=256
BUFFER_SIZE=10000

len_val=int(num_verses*VAL_FRAC)

#build validation dataset
validation_dataset = dataset.take(len_val)
validation_dataset = (
    validation_dataset
    .shuffle(BUFFER_SIZE)
    .padded_batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE))

#build training dataset
train_dataset = dataset.skip(len_val)
train_dataset = (
    train_dataset
    .shuffle(BUFFER_SIZE)
    .padded_batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE))

#build the model: 2 stacked LSTM
print('Build model...')
model = tf.keras.Sequential()
model.add(Embedding(vocab_size, EMBEDDING_DIM))
model.add(LSTM(LSTM_UNITS, return_sequences=True, input_shape=(seq_length, vocab_size)))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(DENSE_UNITS))
model.add(Activation('softmax'))

loss=tf.losses.SparseCategoricalCrossentropy(from_logits=False)

model.compile(optimizer='adam',
              loss=loss,
              metrics=[
                  tf.keras.metrics.SparseCategoricalAccuracy()]
              )

model.summary()

I get the following error, which is raised by the fit method

ValueError: Shape mismatch: The shape of labels (received (16640,)) should equal the shape of logits except for the last dimension (received (256, 3067)).

Any ideas what might be wrong?

Edit

If I change the loss to categorical_crossentropy, I get

   /usr/local/lib/python3.6/dist-packages/keras/backend.py:4839 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1161 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (256, 65) and (256, 3067) are incompatible

Edit

I used the model suggested by AloneTogether, which fixed the fitting step. However, I ran into a problem when predicting on new data

preds = model.predict(x, verbose=0)[0][0]

because the predictions do not sum to exactly 1

>>> preds
array([1.6435336e-04, 1.4827750e-04, 1.4495676e-04, ..., 8.9204557e-05,
       8.9799374e-05, 8.7148059e-05], dtype=float32)
>>> sum(preds)
1.0000000457002898

This seems to be why I can't sample from this "distribution"

def sample(a, temperature=1.0):
    #helper function to sample an index from a probability array
    a = np.log(a) / temperature
    a = np.exp(a) / np.sum(np.exp(a))
    return np.argmax(np.random.multinomial(1, a, 1))

Any clue why this happens, and is there a workaround?

Answer from a uj5u.com user:

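Your preprocessing steps look fine. Assuming you want to generate a sequence as your output (your targets are sequences), try adjusting your model as follows: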

model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(vocab_size, EMBEDDING_DIM))
model.add(tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.LSTM(512, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(DENSE_UNITS, activation='softmax')))

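Note that your last LSTM layer now returns sequences again. The TimeDistributed layer simply applies a fully connected layer with a softmax activation to each time step i, computing a probability for every word in the vocabulary. The number of units in that fully connected layer equals the vocabulary size, so every word has a chance of being predicted.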
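
The shape arithmetic behind the original error can be checked without TensorFlow. The sketch below is a NumPy stand-in, not the Keras implementation: the sizes 256, 65 and 3067 are read off the error messages in the question, while the toy sizes, `softmax`, `W` and `b` are hypothetical names introduced only for illustration.

```python
import numpy as np

# Sizes read off the error messages: batch 256, sequence length 65, vocab 3067.
print(256 * 65)  # 16640, the number of labels in one flattened (256, 65) batch

# Toy sizes (hypothetical) so the arrays stay small.
batch, seq_len, hidden, vocab = 4, 6, 8, 10
rng = np.random.default_rng(0)

def softmax(x):
    """Softmax over the last axis."""
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

labels = rng.integers(0, vocab, size=(batch, seq_len))     # like target_sequences
lstm_out = rng.standard_normal((batch, seq_len, hidden))   # stand-in for LSTM output
W = rng.standard_normal((hidden, vocab))                   # dense weights
b = np.zeros(vocab)                                        # dense bias

# return_sequences=False: only the last step reaches the dense layer.
last_step = softmax(lstm_out[:, -1, :] @ W + b)

# return_sequences=True with a per-step dense layer: same W and b at every step.
per_step = softmax(lstm_out @ W + b)

print(last_step.shape)  # (4, 10): one distribution per sequence
print(per_step.shape)   # (4, 6, 10): one distribution per label
```

With one distribution per sequence there are fewer predictions than labels, which is the mismatch the sparse loss reports. With one distribution per time step, the leading dimensions of the predictions match the label array.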

To sample from the distribution given some input, you can do the following:

temperature = 1.0
sample = input_sequences[0] # "You are unsure whether or not to trust him but very thankful that you wore a turtle neck"
sample = tf.expand_dims(sample, axis=0)
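# tf.random.categorical expects logits (unnormalized log-probabilities), so take the log of the softmax output first
predictions = tf.math.log(model.predict(sample)) / temperature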
index_word=tokenizer.index_word 

predictions = tf.squeeze(predictions, axis=0)
sampled_indices = tf.random.categorical(predictions, num_samples=1)
word_list = list(np.vectorize(index_word.get)(sampled_indices))

print(sampled_indices)
print(word_list)

'''
tf.Tensor(
[[ 7]
 [45]
 [52]
 [41]
 [29]
 [21]
 [21]
 [35]
 [27]
 [ 6]
 [38]
 [44]
 [25]
 [39]
 [13]
 [19]
 [26]], shape=(17, 1), dtype=int64)
[array(['about'], dtype='<U7'), array(['thorns'], dtype='<U7'), array(['would'], dtype='<U7'), array(['is'], dtype='<U7'), array(['but'], dtype='<U7'), array(['by'], dtype='<U7'), array(['by'], dtype='<U7'), array(['all'], dtype='<U7'), array(['to'], dtype='<U7'), array(['she'], dtype='<U7'), array(['wander'], dtype='<U7'), array(['have'], dtype='<U7'), array(['whether'], dtype='<U7'), array(['lost'], dtype='<U7'), array(['are'], dtype='<U7'), array(['your'], dtype='<U7'), array(['or'], dtype='<U7')]
'''

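Of course, the model I trained spits out gibberish, since it was trained on 10 samples for 2 epochs, but hopefully you get the idea. I use the sampler function (tf.random.categorical) to sample from the multinomial distribution produced by the temperature-weighted softmax at each time step. For example, suppose w is the probability distribution over the vocabulary v at time step 1. The sampler takes w and draws an integer that identifies a word, with high-probability words being drawn more often. I hope that makes sense.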
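
On the `sample` helper from the question: a likely cause of the failure (an assumption, since no traceback is shown) is that `np.random.multinomial` validates its probabilities in float64 and rejects a float32 vector whose sum lands slightly above 1. A minimal sketch of a workaround is to cast to float64 and renormalize before drawing. `sample_index` and `fake_preds` are hypothetical names, not part of any library.

```python
import numpy as np

def sample_index(probs, temperature=1.0, rng=None):
    """Sample an index from a probability vector after temperature scaling."""
    rng = np.random.default_rng() if rng is None else rng
    # Work in float64 so float32 rounding error does not break the sum check.
    p = np.asarray(probs, dtype=np.float64)
    # Temperature scaling in log space; the small constant guards against log(0).
    logits = np.log(p + 1e-12) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    p /= p.sum()            # renormalize in float64
    return int(rng.choice(len(p), p=p))

# Stand-in for one time step of model.predict output (float32, sums to about 1).
fake_preds = np.full(3067, 1.0 / 3067, dtype=np.float32)
idx = sample_index(fake_preds, temperature=0.8, rng=np.random.default_rng(0))
print(idx)
```

Lower temperatures sharpen the distribution toward the most likely word, and higher temperatures flatten it toward uniform.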

