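from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Build the vocabulary from the training texts only
tokenizer = Tokenizer()
tokenizer.fit_on_texts(X_train)
encoded_docs = tokenizer.texts_to_sequences(X_train)
padded_sequence = pad_sequences(encoded_docs, maxlen=60)
# Encode the test texts with the same fitted tokenizer
test_tweets = tokenizer.texts_to_sequences(X_test)
test_padded_sequence = pad_sequences(test_tweets, maxlen=60)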
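Even though I did not pass the oov_token argument, I did not get any error. I expected test_tweets = tokenizer.texts_to_sequences(X_test) to fail.
How does tensorflow handle out-of-vocabulary words at test time when you do not provide oov_token?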
Answer:
By default, when oov_token is None, OOV words are silently ignored (dropped from the output sequence):
import tensorflow as tf
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(['hello world'])
print(tokenizer.word_index)
sequences = tokenizer.texts_to_sequences(['hello friends'])
print(sequences)
Output:
{'hello': 1, 'world': 2}
[[1]]
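For contrast, here is a minimal sketch of the same toy example with oov_token set. It assumes a TensorFlow version in which tf.keras.preprocessing.text.Tokenizer is still available; the placeholder string "<OOV>" and the variable name word_index are arbitrary choices made for this illustration, not something from the original question.

```python
import tensorflow as tf

# Same toy corpus as in the answer, but with an explicit OOV placeholder.
# "<OOV>" is an arbitrary string chosen for this sketch.
tokenizer = tf.keras.preprocessing.text.Tokenizer(oov_token="<OOV>")
tokenizer.fit_on_texts(["hello world"])

# The OOV token is given index 1, so the real words start at index 2.
word_index = tokenizer.word_index
print(word_index)

# "friends" was never seen during fitting, so it maps to the OOV index
# instead of being dropped.
sequences = tokenizer.texts_to_sequences(["hello friends"])
print(sequences)
```

This should print {'<OOV>': 1, 'hello': 2, 'world': 3} and [[2, 1]]: the unseen word keeps its position in the sequence as index 1 rather than disappearing, which is usually what you want if sequence length or word order matters downstream.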
