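I built a BERT model (bert-base-multilingual-cased) from Huggingface and want to evaluate its precision, recall, and F1 score in addition to accuracy, since accuracy is not always the best evaluation metric.
This is the example notebook that I modified for my use case.
Creating the train/test data: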
from transformers import BertTokenizer, TFBertModel, TFBertForSequenceClassification
TEST_SPLIT = 0.1
BATCH_SIZE = 2
train_size = int(len(x) * (1-TEST_SPLIT))
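# reshuffle_each_iteration=False keeps the take/skip split identical across epochs;
# with the default, each pass reshuffles and test examples can end up in training.
tfdataset = tfdataset.shuffle(len(x), reshuffle_each_iteration=False)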
tfdataset_train = tfdataset.take(train_size)
tfdataset_test = tfdataset.skip(train_size)
tfdataset_train = tfdataset_train.batch(BATCH_SIZE)
tfdataset_test = tfdataset_test.batch(BATCH_SIZE)
Building the model:
from tensorflow.keras import optimizers, losses
MODEL_NAME = 'bert-base-multilingual-cased'
N_EPOCHS = 2
model = TFBertForSequenceClassification.from_pretrained(MODEL_NAME)
optimizer = optimizers.Adam(learning_rate=3e-5)
loss = losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
model.fit(tfdataset_train, batch_size=BATCH_SIZE, epochs=N_EPOCHS)
Sample output:
All model checkpoint layers were used when initializing TFBertForSequenceClassification.
Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-multilingual-cased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Epoch 1/2
415/415 [==============================] - 741s 2s/step - loss: 0.6652 - accuracy: 0.6321
Epoch 2/2
415/415 [==============================] - 717s 2s/step - loss: 0.6619 - accuracy: 0.6429
<keras.callbacks.History at 0x7fc970d72750>
Evaluation:
benchmarks = model.evaluate(tfdataset_test, return_dict=True, batch_size=BATCH_SIZE)
print(benchmarks)
Sample output:
93/93 [==============================] - 42s 404ms/step - loss: 0.6536 - accuracy: 0.6108
{'loss': 0.6535539627075195, 'accuracy': 0.6108108162879944}
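This gives me the accuracy score, but I would like a classification report that contains all of the metrics mentioned above.
Does anyone know how to do this with "tfdatasets" like these?
Thanks in advance!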
Answer from a uj5u.com user:
The simplest approach is to use tensorflow-addons alongside the metrics that ship with the main/base tf package.
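# pip install tensorflow-addons
import tensorflow as tf
import tensorflow_addons as tfa
....
# nb_classes is the number of target classes.
# Note: Precision, Recall and F1Score apply a threshold to y_pred, so they
# expect probabilities and labels of matching shape (e.g. one-hot); they are
# unlikely to give meaningful values on raw logits with sparse integer labels.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=[tf.keras.metrics.Accuracy(),
                       tf.keras.metrics.Precision(),
                       tf.keras.metrics.Recall(),
                       tfa.metrics.F1Score(num_classes=nb_classes,
                                           average='macro',
                                           threshold=0.5)])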
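Since the question asks for a full report rather than per-epoch metrics, another option is to compute it after training from the test labels and predictions. The sketch below is a minimal illustration, not a tested recipe for this model: `collect_labels_and_predictions` and `per_class_report` are hypothetical helper names, and the toy batches stand in for the real dataset. With the setup from the question, `batches` would presumably be `tfdataset_test` and `predict_fn` something like `lambda features: model(features, training=False).logits`; both are assumptions about how the dataset and the model output are structured. Labels and predictions are gathered in the same pass so that they stay aligned even if the dataset reshuffles between iterations.

```python
import numpy as np


def collect_labels_and_predictions(batches, predict_fn):
    """Iterate once over (features, labels) batches; return (y_true, y_pred)."""
    y_true, y_pred = [], []
    for features, labels in batches:
        logits = np.asarray(predict_fn(features))   # shape: (batch, num_classes)
        y_pred.append(np.argmax(logits, axis=-1))   # predicted class per example
        y_true.append(np.asarray(labels).reshape(-1))
    return np.concatenate(y_true), np.concatenate(y_pred)


def per_class_report(y_true, y_pred, num_classes):
    """Precision, recall, F1 and support for each class index."""
    report = {}
    for c in range(num_classes):
        tp = int(np.sum((y_pred == c) & (y_true == c)))
        predicted = int(np.sum(y_pred == c))   # examples predicted as class c
        actual = int(np.sum(y_true == c))      # examples that really are class c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": actual}
    return report


# Toy stand-in for a batched dataset: the "features" are already logits,
# so the identity function plays the role of the model.
toy_batches = [
    (np.array([[2.0, 1.0], [0.0, 3.0]]), np.array([0, 1])),
    (np.array([[1.0, 0.0], [4.0, 1.0]]), np.array([1, 0])),
]
toy_true, toy_pred = collect_labels_and_predictions(toy_batches, lambda f: f)
for cls, row in per_class_report(toy_true, toy_pred, num_classes=2).items():
    print(cls, row)
```

If scikit-learn is installed, passing the same two arrays to `sklearn.metrics.classification_report(y_true, y_pred)` should produce a formatted table of the same quantities.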
Answer from a uj5u.com user:
This worked for me (found here):
from keras import backend as K

def recall_m(y_true, y_pred):
    # true positives / all actual positives
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    # true positives / all predicted positives
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    # harmonic mean of precision and recall
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision + recall + K.epsilon()))

# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc', f1_m, precision_m, recall_m])
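One caveat with function-style metrics like these: Keras evaluates them per batch and reports the running mean, so the value shown for an epoch is an average of batch-level scores rather than the score over the whole dataset. The sketch below mirrors the three functions in NumPy to show the difference on made-up data. The names `recall_np`, `precision_np`, `f1_np` and the toy arrays are illustrative assumptions, `EPS` matches the default `K.epsilon()` of 1e-7, and the inputs are binary labels with probabilities in [0, 1], as the `binary_crossentropy` setup above implies.

```python
import numpy as np

EPS = 1e-7  # default value of K.epsilon()


def recall_np(y_true, y_pred):
    tp = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    possible = np.sum(np.round(np.clip(y_true, 0, 1)))
    return tp / (possible + EPS)


def precision_np(y_true, y_pred):
    tp = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    predicted = np.sum(np.round(np.clip(y_pred, 0, 1)))
    return tp / (predicted + EPS)


def f1_np(y_true, y_pred):
    p = precision_np(y_true, y_pred)
    r = recall_np(y_true, y_pred)
    return 2 * (p * r) / (p + r + EPS)


# Made-up labels and predicted probabilities, split into batches of 2.
y_true = np.array([1.0, 1.0, 0.0, 0.0])
y_pred = np.array([0.9, 0.8, 0.7, 0.2])

# F1 over the whole set in one go.
global_f1 = float(f1_np(y_true, y_pred))

# F1 per batch, then averaged, which approximates what Keras would report.
batch_f1 = [f1_np(y_true[i:i + 2], y_pred[i:i + 2]) for i in (0, 2)]
batch_avg_f1 = float(np.mean(batch_f1))

print("global F1:", round(global_f1, 4))
print("batch-averaged F1:", round(batch_avg_f1, 4))
```

With a small batch size such as the `BATCH_SIZE = 2` used in the question, the two numbers can diverge noticeably, so computing the metrics once over all test predictions is likely the more reliable figure.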
