我使用 Deeplearning4j 對設備名稱進行分類。我用 495 個類別標記了大約 50,000 個專案,并使用這些資料來訓練神經網路。
也就是說,作為輸入,我提供了一組由 0 和 1 組成的向量 (50,000),以及每個向量的預期類別(0 到 494)。
我使用 IrisClassifier 示例作為代碼的基礎。
我將訓練好的模型保存到一個檔案中,現在我可以用它來預測設備的類別。
例如,我嘗試使用我用于訓練的相同資料(50,000 個專案)進行預測,并將預測與我對這些資料的標記進行比較。
結果證明效果非常好,神經網路的誤差為~1%。
之后,我嘗試使用這 50,000 條記錄中的前 100 個向量進行預測,并洗掉其余的 49900 個向量。
并且對于這 100 個向量,與 50,000 個組合中相同的 100 個向量的預測相比,預測是不同的。
也就是說,我們提供給訓練模型的資料越少,預測誤差就越大。
即使對于完全相同的向量。
為什么會發生這種情況?
我的代碼。
訓練:
//First: get the dataset using the record reader. CSVRecordReader handles loading/parsing
int numLinesToSkip = 0;
char delimiter = ',';
RecordReader recordReader = new CSVRecordReader(numLinesToSkip,delimiter);
recordReader.initialize(new FileSplit(new File(args[0])));
//Second: the RecordReaderDataSetIterator handles conversion to DataSet objects, ready for use in neural network
int labelIndex = 3331;
int numClasses = 495;
int batchSize = 4000;
// DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader,batchSize,labelIndex,numClasses);
DataSetIterator iterator = new RecordReaderDataSetIterator.Builder(recordReader, batchSize).classification(labelIndex, numClasses).build();
List<DataSet> trainingData = new ArrayList<>();
List<DataSet> testData = new ArrayList<>();
while (iterator.hasNext()) {
DataSet allData = iterator.next();
allData.shuffle();
SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.8); //Use 80% of data for training
trainingData.add(testAndTrain.getTrain());
testData.add(testAndTrain.getTest());
}
DataSet allTrainingData = DataSet.merge(trainingData);
DataSet allTestData = DataSet.merge(testData);
//We need to normalize our data. We'll use NormalizeStandardize (which gives us mean 0, unit variance):
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(allTrainingData); //Collect the statistics (mean/stdev) from the training data. This does not modify the input data
normalizer.transform(allTrainingData); //Apply normalization to the training data
normalizer.transform(allTestData); //Apply normalization to the test data. This is using statistics calculated from the *training* set
long seed = 6;
int firstHiddenLayerSize = labelIndex/6;
int secondHiddenLayerSize = firstHiddenLayerSize/4;
//log.info("Build model....");
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
.seed(seed)
.activation(Activation.TANH)
.weightInit(WeightInit.XAVIER)
.updater(new Sgd(0.1))
.l2(1e-4)
.list()
.layer(new DenseLayer.Builder().nIn(labelIndex).nOut(firstHiddenLayerSize)
.build())
.layer(new DenseLayer.Builder().nIn(firstHiddenLayerSize).nOut(secondHiddenLayerSize)
.build())
.layer( new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
.activation(Activation.SOFTMAX) //Override the global TANH activation with softmax for this layer
.nIn(secondHiddenLayerSize).nOut(numClasses).build())
.build();
//run the model
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
//record score once every 100 iterations
model.setListeners(new ScoreIterationListener(100));
for(int i=0; i<5000; i ) {
model.fit(allTrainingData);
}
//evaluate the model on the test set
Evaluation eval = new Evaluation(numClasses);
INDArray output = model.output(allTestData.getFeatures());
eval.eval(allTestData.getLabels(), output);
log.info(eval.stats());
// Save the Model
File locationToSave = new File(args[1]);
model.save(locationToSave, false);
預言:
// Open the network file
File locationToLoad = new File(args[0]);
MultiLayerNetwork model = MultiLayerNetwork.load(locationToLoad, false);
model.init();
// First: get the dataset using the record reader. CSVRecordReader handles loading/parsing
int numLinesToSkip = 0;
char delimiter = ',';
// Data to predict
CSVRecordReader recordReader = new CSVRecordReader(numLinesToSkip, delimiter); //skip no lines at the top - i.e. no header
recordReader.initialize(new FileSplit(new File(args[1])));
//Second: the RecordReaderDataSetIterator handles conversion to DataSet objects, ready for use in neural network
int batchSize = 4000;
DataSetIterator iterator = new RecordReaderDataSetIterator.Builder(recordReader, batchSize).build();
List<DataSet> dataSetList = new ArrayList<>();
while (iterator.hasNext()) {
DataSet allData = iterator.next();
dataSetList.add(allData);
}
DataSet dataSet = DataSet.merge(dataSetList);
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(dataSet);
normalizer.transform(dataSet);
// Now use it to classify some data
INDArray output = model.output(dataSet.getFeatures());
// Save result
BufferedWriter writer = new BufferedWriter(new FileWriter(args[2], true));
for (int i=0; i<output.rows(); i ) {
writer
.append(output.getRow(i).argMax().toString())
.append(" ")
.append(String.valueOf(i))
.append(" ")
.append(output.getRow(i).toString())
.append('\n');
}
writer.close();
uj5u.com熱心網友回復:
確保在模型旁邊按如下方式保存規范化器:
import org.nd4j.linalg.dataset.api.preprocessor.serializer.NormalizerSerializer;
NormalizerSerializer SUT = NormalizerSerializer.getDefault();
SUT.write(normalizer,new File("outputFile.bin"));
NormalizeStandardize restored = SUT.restore(new File("outputFile.bin");
uj5u.com熱心網友回復:
您需要使用相同的歸一化資料進行訓練和預測。否則在轉換資料時會使用錯誤的統計資料。
您目前的做法會導致資料看起來與訓練資料非常不同,這就是為什么您會得到如此不同的結果。
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/377467.html
下一篇:BigQuery線性回歸引數
