PyTorch + PyQt5 in practice: CIFAR-10 image classification with ResNet-18, displayed through a PyQt5 GUI
Environment:
1. PyTorch 1.6.0
2. Python 3.7.9
3. Windows 10
4. PyCharm
5. PyQt5 (with the matching Qt Designer and tool packages)
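The CIFAR-10 dataset
For a beginner, downloading CIFAR-10 from the official site is slow, and the files are not in a common image format. Here is a copy already converted to image files; anyone who needs it can fetch it from Baidu Cloud:
Link: https://pan.baidu.com/s/1l7wvWLCscPcGoKzRjggjRA
Extraction code: ht88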
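The ResNet-18 network:
ResNet stands for Residual Network. It was proposed by Kaiming He and his co-authors, and their paper "Deep Residual Learning for Image Recognition" won the CVPR best paper award in 2016. In 2015 the deep residual network swept the major image recognition competitions, taking first place in several of them by a wide margin. While preserving accuracy, it pushed network depth to 152 layers, and later variants went beyond 1000 layers. This article uses the 18-layer version.
The network structure is as follows: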

Residual learning: a building block
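In the notation of the ResNet paper, the building block can be written as below. Here x and y are the input and output of the block, F is the stacked convolutional path with weights W_i, and W_s is a projection that is only needed when x and F(x) have different shapes:

```latex
y = \mathcal{F}(x, \{W_i\}) + x
\qquad \text{or, when the dimensions change,} \qquad
y = \mathcal{F}(x, \{W_i\}) + W_s x
```

In resnet.py, self.left plays the role of F, and self.shortcut is either the identity or the projection W_s.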

Building the ResNet-18 model in PyTorch
1. Create resnet.py
The code is as follows:
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    def __init__(self, inchannel, outchannel, stride=1):
        super(ResidualBlock, self).__init__()
        # Main path: two 3x3 convolutions, each followed by batch normalization
        self.left = nn.Sequential(
            nn.Conv2d(inchannel, outchannel, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(outchannel),
            nn.ReLU(inplace=True),
            nn.Conv2d(outchannel, outchannel, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(outchannel)
        )
        # Shortcut path: identity by default, 1x1 convolution when the shape changes
        self.shortcut = nn.Sequential()
        if stride != 1 or inchannel != outchannel:
            self.shortcut = nn.Sequential(
                nn.Conv2d(inchannel, outchannel, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(outchannel)
            )

    def forward(self, x):
        out = self.left(x)
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class ResNet(nn.Module):
    def __init__(self, ResidualBlock, num_classes=10):
        super(ResNet, self).__init__()
        self.inchannel = 64
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(),
        )
        # Four stages of two residual blocks each
        self.layer1 = self.make_layer(ResidualBlock, 64, 2, stride=1)
        self.layer2 = self.make_layer(ResidualBlock, 128, 2, stride=2)
        self.layer3 = self.make_layer(ResidualBlock, 256, 2, stride=2)
        self.layer4 = self.make_layer(ResidualBlock, 512, 2, stride=2)
        self.fc = nn.Linear(512, num_classes)

    def make_layer(self, block, channels, num_blocks, stride):
        # Only the first block of a stage uses the given stride,
        # e.g. stride=2, num_blocks=2 gives [2, 1]
        strides = [stride] + [1] * (num_blocks - 1)
        layers = []
        for stride in strides:
            layers.append(block(self.inchannel, channels, stride))
            self.inchannel = channels
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv1(x)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = F.avg_pool2d(out, 4)
        out = out.view(out.size(0), -1)  # flatten to (batch, 512)
        out = self.fc(out)
        return out


def ResNet18():
    return ResNet(ResidualBlock)
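To see why the final layer is nn.Linear(512, num_classes), it can help to trace the spatial size of the feature map through the network. The following is a minimal sketch in plain Python, assuming 32x32 inputs (the crop size used for CIFAR-10 in this article); conv_out, trace and flat_features are illustrative names, not part of resnet.py.

```python
def conv_out(size, kernel, stride, padding):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32  # assumed input resolution (CIFAR-10 images)
trace = [("input", size)]

# conv1: 3x3 kernel, stride 1, padding 1 keeps the size
size = conv_out(size, kernel=3, stride=1, padding=1)
trace.append(("conv1", size))

# In each stage only the first 3x3 convolution applies the stage stride;
# every other convolution in the stage keeps the size unchanged.
for name, stride in [("layer1", 1), ("layer2", 2), ("layer3", 2), ("layer4", 2)]:
    size = conv_out(size, kernel=3, stride=stride, padding=1)
    trace.append((name, size))

# F.avg_pool2d(out, 4): kernel 4, stride defaults to the kernel size
size = conv_out(size, kernel=4, stride=4, padding=0)
trace.append(("avg_pool2d", size))

flat_features = 512 * size * size  # 512 channels after layer4

for name, s in trace:
    print("%-10s %2d x %2d" % (name, s, s))
print("features fed to fc:", flat_features)
```

Under this assumption the map shrinks 32, 32, 32, 16, 8, 4 and finally 1, so flattening leaves exactly 512 values per image. With any other input resolution the flattened size would no longer match the fc layer.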
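At first I did not understand the code below. Once I understood the model structure it made sense: the shortcut stays an identity mapping unless the block changes the spatial size (stride != 1) or the number of channels, in which case a 1x1 convolution reshapes x so that it can be added to the output of the main path. It is worth reading closely:
self.shortcut = nn.Sequential()
if stride != 1 or inchannel != outchannel:
    self.shortcut = nn.Sequential(
        nn.Conv2d(inchannel, outchannel, kernel_size=1, stride=stride, bias=False),
        nn.BatchNorm2d(outchannel)
    )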
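The condition can be checked for every block of this ResNet-18 without building the model. The sketch below replays the bookkeeping of make_layer in plain Python; block_configs is a hypothetical helper written for this illustration, with the stage widths and strides copied from resnet.py.

```python
def block_configs(stage_channels=(64, 128, 256, 512), stage_strides=(1, 2, 2, 2),
                  num_blocks=2, in_channels=64):
    """Return (in_channels, out_channels, stride, needs_projection) for each block."""
    configs = []
    for channels, first_stride in zip(stage_channels, stage_strides):
        # Same rule as make_layer: only the first block of a stage is strided
        strides = [first_stride] + [1] * (num_blocks - 1)
        for stride in strides:
            needs_projection = stride != 1 or in_channels != channels
            configs.append((in_channels, channels, stride, needs_projection))
            in_channels = channels
    return configs


configs = block_configs()
for cfg in configs:
    print(cfg)
print("blocks with a 1x1 projection:", sum(1 for cfg in configs if cfg[3]))
```

Only the first block of layer2, layer3 and layer4 halves the resolution and doubles the channels, so only those three blocks get the 1x1 convolution; the other five keep the plain identity shortcut.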
2. Create train.py
The code is as follows:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import argparse
import os
from resnet import ResNet18

# Use the GPU when one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Command-line arguments, so that options can be passed Linux-style on the command line
parser = argparse.ArgumentParser(description='PyTorch CIFAR10 Training')
parser.add_argument('--outf', default='./model/', help='folder to output images and model checkpoints')  # output folder
parser.add_argument('--net', default='./model/Resnet18.pth', help="path to net (to continue training)")  # model path for resuming training
args = parser.parse_args()

# Hyperparameters
EPOCH = 200  # number of passes over the dataset
pre_epoch = 0  # number of passes already completed
BATCH_SIZE = 128  # batch size
LR = 0.001  # learning rate

# Dataset and preprocessing
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),  # pad 4 pixels of zeros on every side, then randomly crop back to 32x32
    transforms.RandomHorizontalFlip(),  # flip horizontally with probability 0.5
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),  # per-channel (R, G, B) mean and standard deviation
])
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
trainset = torchvision.datasets.ImageFolder(root='E:\\CLFAR-10+pyqt5\\data\\train', transform=transform_train)  # training set
trainloader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)  # shuffled mini-batches
testset = torchvision.datasets.ImageFolder(root='E:\\CLFAR-10+pyqt5\\data\\test', transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=True, num_workers=2)

# CIFAR-10 labels
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Model definition: ResNet
net = ResNet18().to(device)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()  # cross-entropy loss, the usual choice for multi-class classification
optimizer = optim.SGD(net.parameters(), lr=LR, momentum=0.9, weight_decay=5e-4)  # mini-batch SGD with momentum and L2 regularization (weight decay)

# Training
if __name__ == "__main__":
    os.makedirs(args.outf, exist_ok=True)  # make sure the checkpoint folder exists
    best_acc = 85  # initial threshold: best_acc.txt is only written once test accuracy exceeds 85%
    print("Start Training, Resnet-18!")
    with open("acc.txt", "w") as f:
        with open("log.txt", "w") as f2:
            for epoch in range(pre_epoch, EPOCH):
                print('\nEpoch: %d' % (epoch + 1))
                net.train()
                sum_loss = 0.0
                correct = 0.0
                total = 0.0
                for i, data in enumerate(trainloader, 0):
                    # Prepare the data
                    length = len(trainloader)
                    inputs, labels = data
                    inputs, labels = inputs.to(device), labels.to(device)
                    optimizer.zero_grad()

                    # forward + backward
                    outputs = net(inputs)
                    loss = criterion(outputs, labels)
                    loss.backward()
                    optimizer.step()

                    # Print the running loss and accuracy after every batch
                    sum_loss += loss.item()
                    _, predicted = torch.max(outputs.data, 1)
                    total += labels.size(0)
                    correct += predicted.eq(labels.data).cpu().sum()
                    print('[epoch:%d, iter:%d] Loss: %.03f | Acc: %.3f%% '
                          % (epoch + 1, (i + 1 + epoch * length), sum_loss / (i + 1), 100. * correct / total))
                    f2.write('%03d %05d |Loss: %.03f | Acc: %.3f%% '
                             % (epoch + 1, (i + 1 + epoch * length), sum_loss / (i + 1), 100. * correct / total))
                    f2.write('\n')
                    f2.flush()

                # Measure the test accuracy after every epoch
                print("Waiting Test!")
                net.eval()
                with torch.no_grad():
                    correct = 0
                    total = 0
                    for data in testloader:
                        images, labels = data
                        images, labels = images.to(device), labels.to(device)
                        outputs = net(images)
                        # Take the class with the highest score (its index in outputs.data)
                        _, predicted = torch.max(outputs.data, 1)
                        total += labels.size(0)
                        correct += (predicted == labels).sum().item()
                    # result = torch.floor_divide(correct, total)
                    # print('Test accuracy: %.3f%%' % (100 * result))
                    acc = 100 * correct / total
                    print('Test accuracy: %.3f%%' % (acc))
                    # Save this epoch's weights and append its test result to acc.txt
                    print('Saving model......')
                    torch.save(net.state_dict(), '%s/net_%03d.pth' % (args.outf, epoch + 1))
                    f.write("EPOCH=%03d,Accuracy= %.3f%%" % (epoch + 1, acc))
                    f.write('\n')
                    f.flush()
                    # Track the best test accuracy and write it to best_acc.txt
                    if acc > best_acc:
                        with open("best_acc.txt", "w") as f3:
                            f3.write("EPOCH=%d,best_acc= %.3f%%" % (epoch + 1, acc))
                        best_acc = acc
            print("Training Finished, TotalEPOCH=%d" % EPOCH)
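The training progress is logged to log.txt, and each epoch's test accuracy is written to acc.txt. The if statement at the end records the highest test accuracy and the epoch it belongs to in best_acc.txt (written only once the accuracy exceeds the initial best_acc of 85). The weights of every epoch are saved in the model folder.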
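If best_acc.txt was never written, or you simply want to inspect the whole run, the best epoch can also be recovered from acc.txt. The following is a minimal sketch that assumes the line format produced by train.py ("EPOCH=%03d,Accuracy= %.3f%%"); best_epoch is a hypothetical helper, and the sample lines are made up for illustration, not real training results.

```python
import re

# Line format written by train.py: "EPOCH=%03d,Accuracy= %.3f%%"
ACC_LINE = re.compile(r"EPOCH=(\d+),Accuracy=\s*([\d.]+)%")


def best_epoch(lines):
    """Return (epoch, accuracy) with the highest accuracy, or None if nothing parses."""
    results = []
    for line in lines:
        match = ACC_LINE.search(line)
        if match:
            results.append((int(match.group(1)), float(match.group(2))))
    if not results:
        return None
    return max(results, key=lambda item: item[1])


# Made-up sample lines in the same format (not real results)
sample = [
    "EPOCH=001,Accuracy= 45.120%",
    "EPOCH=002,Accuracy= 61.300%",
    "EPOCH=003,Accuracy= 58.750%",
]
print(best_epoch(sample))
```

For a real run, pass the open file instead of the sample list, for example `with open("acc.txt") as f: print(best_epoch(f))`.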

3. Create predict.py
To connect the model to PyQt5, write a prediction script that the GUI file can call.
The code is as follows:
import torch
import torchvision.transforms as transforms
from resnet import ResNet18
from PIL import Image


def predict_(img):
    # img is expected to be a 32x32 RGB PIL image, like the training data
    data_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
    ])
    # img = Image.open('E:\\CLFAR-10+pyqt5\\4.jpg')
    img = data_transform(img)
    img = torch.unsqueeze(img, dim=0)  # add the batch dimension
    model = ResNet18()
    model_weight_pth = 'E:\\CLFAR-10+pyqt5\\model\\net_200.pth'
    model.load_state_dict(torch.load(model_weight_pth))
    model.eval()
    classes = {'0': 'plane', '1': 'car', '2': 'bird', '3': 'cat', '4': 'deer', '5': 'dog', '6': 'frog', '7': 'horse', '8': 'ship', '9': 'truck'}
    with torch.no_grad():
        output = torch.squeeze(model(img))  # raw scores (logits) for the 10 classes
        print(output)
        predict = torch.softmax(output, dim=0)  # convert the scores to probabilities
        predict_cla = torch.argmax(predict).numpy()
    # Return the predicted class name and its softmax probability
    return classes[str(predict_cla)], predict[predict_cla].item()
After training has finished, check best_acc.txt for the epoch with the best test accuracy, and load the weights of that epoch in the prediction script:
model_weight_pth = 'E:\\CLFAR-10+pyqt5\\model\\net_200.pth'
model.load_state_dict(torch.load(model_weight_pth))
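Rather than editing the file name by hand, the checkpoint path can be derived from the content of best_acc.txt. This is a minimal sketch under the assumption that the file holds one line in the format written by train.py ("EPOCH=%d,best_acc= %.3f%%") and that checkpoints are named net_%03d.pth; best_checkpoint_path, the sample text and the folder name are hypothetical.

```python
import os
import re


def best_checkpoint_path(best_acc_text, model_dir):
    """Build the path of the checkpoint saved for the epoch named in best_acc.txt."""
    match = re.search(r"EPOCH=(\d+),best_acc=\s*([\d.]+)%", best_acc_text)
    if match is None:
        raise ValueError("unrecognised best_acc.txt content: %r" % best_acc_text)
    epoch = int(match.group(1))
    # train.py saves one file per epoch as 'net_%03d.pth' with a 1-based epoch number
    return os.path.join(model_dir, "net_%03d.pth" % epoch)


# Hypothetical file content and folder, for illustration only
path = best_checkpoint_path("EPOCH=157,best_acc= 94.210%", "model")
print(path)
```

The returned path can then be passed to torch.load in place of the hard-coded net_200.pth.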
Next, test the prediction code and print the output:
img = Image.open('E:\\CLFAR-10+pyqt5\\data\\test\\bird\\25.jpg')
net = predict_(img)
print(net)
Result:
tensor([ 1.2775, -3.7718,  6.0837, -0.4484, -4.9533,  3.0170, -4.3821,  3.7511,
         1.8174, -2.6302])
('bird', 0.8564958572387695)
tensor([19.7340, -4.3800, -3.0140, -3.5426, -2.8213, -2.6680, -3.8995, -4.8666,
         4.2137,  0.3724])
Process finished with exit code 0
The prediction is correct!
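The probability in the result can be reproduced by hand from the first tensor printed above, which is a useful check that the second return value is simply the softmax of the largest score. A minimal sketch in plain Python (the softmax helper is written here for illustration; predict.py itself uses torch.softmax):

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Scores copied from the first tensor printed above
logits = [1.2775, -3.7718, 6.0837, -0.4484, -4.9533,
          3.0170, -4.3821, 3.7511, 1.8174, -2.6302]
probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, probs[best])
```

The largest score sits at index 2, the bird class, and its probability comes out close to the 0.8565 reported above; tiny differences are expected because the printed scores are rounded to four decimals.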
4. Create GUI.py
This file builds the interface. The code is as follows:
import sys

from PyQt5.QtGui import QPixmap
from PyQt5.QtWidgets import (QWidget, QGridLayout, QApplication, QPushButton,
                             QLabel, QFileDialog)
from predict import predict_
from PIL import Image


class Ui_example(QWidget):
    def __init__(self):
        super().__init__()
        self.fname = ''  # path of the currently selected image
        self.layout = QGridLayout(self)
        self.label_image = QLabel(self)
        self.label_predict_result = QLabel('Result', self)
        self.label_predict_result_display = QLabel(self)
        self.label_predict_acc = QLabel('Confidence', self)  # softmax probability of the predicted class
        self.label_predict_acc_display = QLabel(self)
        self.button_search_image = QPushButton('Select image', self)
        self.button_run = QPushButton('Run', self)
        self.setLayout(self.layout)
        self.initUi()

    def initUi(self):
        # addWidget(widget, row, column, rowSpan, columnSpan)
        self.layout.addWidget(self.label_image, 1, 1, 3, 2)
        self.layout.addWidget(self.button_search_image, 1, 3, 1, 2)
        self.layout.addWidget(self.button_run, 3, 3, 1, 2)
        self.layout.addWidget(self.label_predict_result, 4, 3, 1, 1)
        self.layout.addWidget(self.label_predict_result_display, 4, 4, 1, 1)
        self.layout.addWidget(self.label_predict_acc, 5, 3, 1, 1)
        self.layout.addWidget(self.label_predict_acc_display, 5, 4, 1, 1)
        self.button_search_image.clicked.connect(self.openimage)
        self.button_run.clicked.connect(self.run)
        self.setGeometry(300, 300, 300, 300)
        self.setWindowTitle('CIFAR-10 ten-class classification')
        self.show()

    def openimage(self):
        imgName, imgType = QFileDialog.getOpenFileName(self, "Select image", "", "*.jpg;;*.png;;All Files(*)")
        if not imgName:  # the dialog was cancelled
            return
        jpg = QPixmap(imgName).scaled(self.label_image.width(), self.label_image.height())
        self.label_image.setPixmap(jpg)
        self.fname = imgName

    def run(self):
        if not self.fname:  # no image has been selected yet
            return
        img = Image.open(self.fname)
        a, b = predict_(img)
        self.label_predict_result_display.setText(a)
        self.label_predict_acc_display.setText(str(b))


if __name__ == '__main__':
    app = QApplication(sys.argv)
    ex = Ui_example()
    sys.exit(app.exec_())
Demo of the result


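Problems encountered
These are some problems I ran into while learning, which I solved with the help of Baidu and blog posts.
1) Why does nn.Sequential(*layers) need an asterisk?
Answer: An asterisk in front of an actual argument unpacks the iterable into separate positional arguments. nn.Sequential takes its modules as individual arguments, so the list of layers has to be unpacked.
2) What is the difference between net.train() and net.eval()?
Answer: When training and testing with PyTorch, always put the instantiated model into the right mode. In eval() mode the framework fixes BN and Dropout: BN stops computing statistics from the current batch and uses the values learned during training, and Dropout is switched off. Otherwise, once the test batch size is small, the per-batch BN statistics become unreliable and the outputs degrade badly (in an image generation model, for example, the generated images show severe colour distortion). The reason is that during training BN computes the mean and variance of every batch and keeps an exponential moving average of them to approximate the mean and variance of the whole dataset. At test time these statistics are fixed, so BN is just a linear mapping. In PyTorch, net.train() switches the moving average updates on, so they keep being refreshed; net.eval() switches them off.
3) What does correct += predicted.eq(labels.data).cpu().sum() mean?
Answer: It does the same job as correct += (predicted == labels).sum().item(): both count the correct predictions in the batch. The difference is the type of the result. .item() returns a plain Python number, whereas without it the sum stays a tensor, so correct becomes a tensor as well. The additions still work, but the later accuracy computation then runs on tensors, and the behaviour of integer tensor division has changed between PyTorch releases, so keeping .item() is the safer habit.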
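The first two answers can be illustrated without PyTorch. The sketch below is a toy model only: describe stands in for a function that, like nn.Sequential, takes its items as separate positional arguments, and RunningMean imitates just the running mean of a batch normalization layer, using the update rule running = (1 - momentum) * running + momentum * batch_mean with PyTorch's default momentum of 0.1. Both names are hypothetical.

```python
def describe(*modules):
    """Accept items as separate positional arguments, as nn.Sequential does."""
    return len(modules)


layers = ["block1", "block2"]  # stand-ins for real nn.Module objects
print(describe(layers))        # 1: the whole list arrives as a single argument
print(describe(*layers))       # 2: the list is unpacked into two arguments


class RunningMean:
    """Toy version of the running mean kept by a batch normalization layer."""

    def __init__(self, momentum=0.1):
        self.momentum = momentum
        self.value = 0.0       # running estimate, starts at zero
        self.training = True   # what net.train() / net.eval() toggle

    def __call__(self, batch):
        batch_mean = sum(batch) / len(batch)
        if self.training:
            # train mode: use the batch mean and update the running estimate
            self.value = (1 - self.momentum) * self.value + self.momentum * batch_mean
            return batch_mean
        # eval mode: use the stored estimate and leave it untouched
        return self.value


bn = RunningMean()
bn([1.0, 3.0])           # train mode: batch mean 2.0 moves the estimate to 0.2
bn.training = False      # equivalent of calling net.eval()
used = bn([100.0])       # eval mode: the batch is ignored for the statistics
print(used, bn.value)
```

In eval mode the single-sample batch with value 100.0 has no influence on the statistic that is used, which is exactly why a small test batch is harmless after net.eval() and harmful without it.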
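If this helped your learning, please give it a like so that I know sharing it was worthwhile. You are also welcome to get in touch and discuss.
Summary and references
This article is more or less a summary of my first two months of learning PyTorch; next I will probably move on to TensorFlow.
Reference articles:
1. PyTorch in practice 2: CIFAR-10 image classification with ResNet-18 (95.170% test-set accuracy)
2. Cat vs. dog classification with PyQt5 + PyTorch (from building the dataset, to building and training the network, to the GUI demo)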
