y should be a 1d array, got an array of shape () instead

I have trained and saved a model. I am trying to train it further on new data, but it raises an error. The relevant part of the code:

from tensorflow.keras.preprocessing.text import Tokenizer
# The maximum number of words to be used (most frequent).
MAX_NB_WORDS = 50000
# Max number of words in each complaint.
MAX_SEQUENCE_LENGTH = 250
# This is fixed.
EMBEDDING_DIM = 100
tokenizer = Tokenizer(num_words=MAX_NB_WORDS, filters='!"#$%&()*+,-./:;<=>?@[]^_`{|}~', lower=True)
tokenizer.fit_on_texts(master_df['Observation'].values)
word_index = tokenizer.word_index

from sklearn.feature_extraction.text import CountVectorizer
cv=CountVectorizer(max_df=1.0,min_df=1, stop_words=stop_words, max_features=10000, ngram_range=(1,3))
X=cv.fit_transform(X)

with open("./sgd.pickle", 'rb') as f:
    sgd = pickle.load(f)

def output_sample(sentence):
    test=preprocess_text(sentence)
    test=test.lower()
    #print(test)
    test=[test]
    tokenizer.fit_on_sequences(test)
    new_words= tokenizer.word_index
    #print(word_index)
    test1=cv.transform(test)
    #print(test1)
    output=sgd.predict(test1)
    return output[0]

def retrain(X,y):
    X=preprocess_text(X)
    X=X.lower()
    X=[X]
    tokenizer.fit_on_texts(X)
    new_words=tokenizer.word_index
    X=cv.fit_transform(X)
    sgd.fit(X,y)
    with open('sgd.pickle', 'wb') as f:
        pickle.dump(sgd, f)
    print("Model trained on new data")

sentence=input("\n\nEnter your observation:\n\n")
output=output_sample(sentence)
print("\n\nThe risk prediction is",preprocess_text(output),"\n\n")

print("Is the above prediction correct?\n")
corr=input("Press 'y' for yes or 'n' for no.\n")

if corr=='y':
    newy=np.array(output)
    retrain(sentence,newy)

elif corr=='n':

    print("What is the correct risk?\n1. Low\n2. Medium\n")
    r=input("Enter the appropriate number: ")

    if r=='1':
        newy=np.array('Low')
        retrain(sentence,newy)
    elif r=='2':
        newy=np.array('Medium')
        retrain(sentence,newy)
    else:
        print("Incorrect input. Please restart the application.")

else:
    print("Incorrect input. Please restart the application.")

When the program runs, the error occurs at sgd.fit(X,y). The error is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11300/3528077041.py in <module>
      5     newy=[output]
      6     print(newy)
----> 7     retrain(sentence,newy)
      8 
      9 elif corr=='n':

~\AppData\Local\Temp/ipykernel_11300/2433836763.py in retrain(X, y)
      7     X=cv.fit_transform(X)
      8     #y = y.reshape((-1, 1))
----> 9     sgd.fit(X,y)
     10     with open('sgd.pickle', 'wb') as f:
     11         pickle.dump(sgd, f)

~\AppData\Local\Programs\Python\Python38\lib\site-packages\sklearn\pipeline.py in fit(self, X, y, **fit_params)
    344             if self._final_estimator != 'passthrough':
    345                 fit_params_last_step = fit_params_steps[self.steps[-1][0]]
--> 346                 self._final_estimator.fit(Xt, y, **fit_params_last_step)
    347 
    348         return self

~\AppData\Local\Programs\Python\Python38\lib\site-packages\sklearn\linear_model\_stochastic_gradient.py in fit(self, X, y, coef_init, intercept_init, sample_weight)
    727             Returns an instance of self.
    728         """
--> 729         return self._fit(X, y, alpha=self.alpha, C=1.0,
    730                          loss=self.loss, learning_rate=self.learning_rate,
    731                          coef_init=coef_init, intercept_init=intercept_init,

~\AppData\Local\Programs\Python\Python38\lib\site-packages\sklearn\linear_model\_stochastic_gradient.py in _fit(self, X, y, alpha, C, loss, learning_rate, coef_init, intercept_init, sample_weight)
    567         self.t_ = 1.0
    568 
--> 569         self._partial_fit(X, y, alpha, C, loss, learning_rate, self.max_iter,
    570                           classes, sample_weight, coef_init, intercept_init)
    571 

~\AppData\Local\Programs\Python\Python38\lib\site-packages\sklearn\linear_model\_stochastic_gradient.py in _partial_fit(self, X, y, alpha, C, loss, learning_rate, max_iter, classes, sample_weight, coef_init, intercept_init)
    529                              max_iter=max_iter)
    530         else:
--> 531             raise ValueError(
    532                 "The number of classes has to be greater than one;"
    533                 " got %d class" % n_classes)

ValueError: The number of classes has to be greater than one; got 1 class

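A sample of the data:

     Observation                                                          Risk
0    A separate road should be provided for light vehicles.               Low
2    All benches lack adequate berms.                                     Low
3    The lighting arrangement is not sufficient.                          Low
4    The lighting arrangement is not sufficient.                          Low
5    The contractor's equipment records are not available.                Low
77   A first-aid room has not been established.                           Medium
98   Heavy dust was observed over a considerable stretch of the haul road.  Medium
79   The first-aid station is maintained in the rest area.                Medium
171  No explosives van is currently available.                            Medium
79   The first-aid station is maintained in the rest area.                Medium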

Ideally it should accept the input, but I don't know why this error occurs.

Reply from a uj5u.com user:

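I cleaned up the code and made some changes to the retrain function. It now adds the new string and its label to the training set and fits the classifier again. The rest of your code is logically unchanged!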
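To see why this helps, here is a minimal sketch under some assumptions: the toy `texts` and `labels` lists are hypothetical stand-ins for `master_df['Observation']` and `master_df['Risk']`, and `fit_with_new_sample` is an illustrative helper, not a function from the original code. Fitting `SGDClassifier` on the new sentence alone means it only ever sees one class, which appears to be what triggers the `ValueError` in the traceback. Appending the sentence to the full training set keeps both classes present.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier

# Hypothetical toy data standing in for master_df['Observation'] / master_df['Risk'].
texts = [
    "lighting arrangement is not sufficient",
    "benches lack adequate berms",
    "first aid room not established",
    "no explosives van available",
]
labels = ["Low", "Low", "Medium", "Medium"]


def fit_with_new_sample(texts, labels, new_text, new_label):
    """Append one labelled sample to the training set and refit from scratch."""
    all_texts = list(texts) + [new_text]
    all_labels = list(labels) + [str(new_label)]
    vectorizer = CountVectorizer()
    features = vectorizer.fit_transform(all_texts)
    classifier = SGDClassifier(random_state=0)
    classifier.fit(features, all_labels)
    return vectorizer, classifier


# Fitting on the new sample alone fails: the classifier sees a single class.
single_error = None
try:
    lone_vectorizer = CountVectorizer()
    lone_features = lone_vectorizer.fit_transform(["heavy dust on haul road"])
    SGDClassifier().fit(lone_features, ["Medium"])
except ValueError as exc:
    single_error = str(exc)

# Fitting on the old data plus the new sample works.
vectorizer, classifier = fit_with_new_sample(
    texts, labels, "heavy dust on haul road", "Medium"
)
print(single_error)
print(classifier.classes_)
```

The cost of this approach is that every correction refits both the vectorizer and the classifier on the whole dataset, which is fine for a small table but grows with the data.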

Utility functions:

def output_sample(sentence):
    test=preprocess_text(sentence)
    test=test.lower()
    test=[test]
    tokenizer.fit_on_sequences(test)
    new_words= tokenizer.word_index
    test1=cv.transform(test)
    output=sgd.predict(test1)
    return output[0]

def preprocess_text(string):
    # Do whatever you want here, but return a string afterwards ;)
    return string

def retrain(X,y):
    X=preprocess_text(X)
    X=X.lower()
    X=[X]
    # Refit the vectorizer on the original observations plus the new sentence.
    X = cv.fit_transform(master_df['Observation'].to_list() + X)
    new_words=tokenizer.word_index
    # Refit on the original labels plus the new one; str(y) unwraps the 0-d array passed by the caller.
    sgd.fit(X,master_df['Risk'].to_list() + [str(y)])
    with open('sgd.pickle','wb') as f:
        pickle.dump(sgd, f)
    print("Model trained on new data")
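A note on the labels passed to `retrain`: the calling code builds them with `np.array('Low')` or `np.array(output)`, which gives a 0-d array with shape `()`. That is the shape the "y should be a 1d array" message in the title complains about. A small sketch of the difference, using only NumPy (the variable names are illustrative):

```python
import numpy as np

# A string wrapped with np.array becomes a 0-d array: it has no axes at all.
label = np.array("Low")
print(label.shape, label.ndim)

# np.atleast_1d turns it into a 1-d array with a single element,
# which is the shape scikit-learn expects for y.
label_1d = np.atleast_1d(label)
print(label_1d.shape)

# .item() recovers the plain Python string, handy for appending to a list of labels.
plain = label.item()
print(plain)
```

Either form works as long as the final `y` handed to `fit` has one entry per row of `X`.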

Actual flow:

import numpy as np
import pandas as pd
import pickle
import nltk
from sklearn.feature_extraction.text import CountVectorizer
stopwords = nltk.corpus.stopwords.words('english')
cv=CountVectorizer(max_df=1.0,min_df=1, stop_words=stopwords, max_features=10000, ngram_range=(1,3))
master_df = pd.read_csv('classification.tsv')
X=cv.fit_transform(master_df['Observation'])
from sklearn.linear_model import SGDClassifier

# Load the saved model if there is one, otherwise start from a fresh classifier.
try:
    with open("./sgd.pickle", 'rb') as f:
        sgd = pickle.load(f)
except FileNotFoundError:
    sgd = SGDClassifier()

sgd.fit(X, master_df['Risk'].to_list())


sentence=input("\n\nEnter your observation:\n\n")
output=output_sample(sentence)
print("\n\nThe risk prediction is",preprocess_text(output),"\n\n")

print("Is the above prediction correct?\n")
corr=input("Press 'y' for yes or 'n' for no.\n")

if corr=='y':
    newy=np.array(output)
    retrain(sentence, newy)

elif corr=='n':

    print("What is the correct risk?\n1. Low\n2. Medium\n")
    r=input("Enter the appropriate number: ")

    if r=='1':
        newy=np.array('Low')
        retrain(sentence,newy)
    elif r=='2':
        newy=np.array('Medium')
        retrain(sentence,newy)
    else:
        print("Incorrect input. Please restart the application.")

else:
    print("Incorrect input. Please restart the application.")
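If refitting on the whole dataset after every correction ever becomes too slow, scikit-learn's `partial_fit` is one possible alternative. To be clear, this is a different technique from the code above, not what it does. The sketch assumes a `HashingVectorizer` in place of `CountVectorizer` so the feature space never changes, and the toy `texts`, `labels` and the `update` helper are hypothetical names used only for illustration.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Hypothetical toy data standing in for the observations and risk labels.
texts = [
    "lighting arrangement is not sufficient",
    "benches lack adequate berms",
    "first aid room not established",
    "no explosives van available",
]
labels = ["Low", "Low", "Medium", "Medium"]

# HashingVectorizer is stateless, so the feature space stays fixed between updates.
vectorizer = HashingVectorizer(n_features=2**12, alternate_sign=False)
classifier = SGDClassifier(random_state=0)

# The first partial_fit call must be told every class that can ever appear.
classifier.partial_fit(vectorizer.transform(texts), labels, classes=["Low", "Medium"])


def update(classifier, sentence, label):
    """Update the existing model with one new labelled observation."""
    features = vectorizer.transform([sentence.lower()])
    classifier.partial_fit(features, [str(label)])
    return classifier


# A single sample of a single class is fine here, because the classes are already known.
update(classifier, "Heavy dust on the haul road", "Medium")
prediction = classifier.predict(vectorizer.transform(["heavy dust on the haul road"]))[0]
print(prediction)
```

The trade-off is that hashed features cannot be mapped back to words, and a single update only nudges the weights, so the model may not immediately agree with the corrected label.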

