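Little Bear PaddlePaddle Workbook 03: Rock-Paper-Scissors

Introduction

Little Bear PaddlePaddle Workbook 03: Rock-Paper-Scissors. This project was developed and tested on Ubuntu 20.04.
For the latest project code, see the home page: Little Bear PaddlePaddle Workbook
Baidu PaddlePaddle AI Studio page: Little Bear PaddlePaddle Workbook 03: Rock-Paper-Scissors
For installing CUDA on Ubuntu, see: Installing Baidu PaddlePaddle and CUDA on Ubuntu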

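File description

File	Description
train.py	Training program
test.py	Test program
test-gtk.py	Test program with a GTK interface
report.py	Report program
get-data.sh	Fetches the data into the dataset directory
make-images-labels.py	Generates the text files of image paths and labels
check-data.sh	Checks that the data exists under the dataset directory
mod/VGG.py	VGG network model
mod/dataset.py	ImageClass image classification dataset parser
mod/utils.py	Miscellaneous utilities
mod/config.py	Configuration
mod/report.py	Results report
dataset	Dataset directory
params	Directory where model parameters are saved
log	Directory where VisualDL logs are saved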

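Dataset

The dataset comes from the Baidu PaddlePaddle public dataset: Rock-Paper-Scissors

Getting the data

If you are running on a local computer, download the data, put the files in the dataset directory, then run the script below from the project directory.
If you are running in Baidu AI Studio, check that the data directory contains the data, then run the script below from the project directory.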

bash get-data.sh

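Generating the image path and label text files

After getting the data, run the script below from the project directory to generate the text files of image paths and labels:

Training set: train-images-labels.txt
Test set: test-images-labels.txt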

python3 make-images-labels.py ./dataset rps-cv-images/rock 0 rps-cv-images/scissors 1 rps-cv-images/paper 2
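The command above passes a dataset directory followed by pairs of class directory and label. What such a script might do can be sketched as follows. This is not the project's make-images-labels.py: the function names, the 20% test ratio, the fixed seed, and the file names in the demo are all assumptions made for illustration. The only detail taken from the project is the line format, a relative image path and an integer label separated by a single space, which is what the dataset parser later in this article expects.

```python
import os
import random
import tempfile


def build_lines(dataset_dir, class_dirs):
    """Return "relative/path label" lines for every file in each class directory.

    class_dirs is a list of (relative_dir, label) pairs, mirroring the
    command-line arguments of the command above.
    """
    lines = []
    for rel_dir, label in class_dirs:
        abs_dir = os.path.join(dataset_dir, rel_dir)
        for name in sorted(os.listdir(abs_dir)):
            if os.path.isfile(os.path.join(abs_dir, name)):
                # Paths are stored relative to the dataset directory.
                lines.append("{} {}".format(os.path.join(rel_dir, name), label))
    return lines


def split_lines(lines, test_ratio=0.2, seed=0):
    """Shuffle and split into (train, test). The ratio is an assumption."""
    shuffled = lines[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Demo on a temporary directory with empty placeholder files.
classes = [("rps-cv-images/rock", 0),
           ("rps-cv-images/scissors", 1),
           ("rps-cv-images/paper", 2)]
with tempfile.TemporaryDirectory() as root:
    for rel_dir, _ in classes:
        os.makedirs(os.path.join(root, rel_dir))
        for i in range(5):
            open(os.path.join(root, rel_dir, "{}.png".format(i)), "w").close()
    lines = build_lines(root, classes)

train, test = split_lines(lines)
print(len(train), len(test))
```

In a real script the two lists would then be written to train-images-labels.txt and test-images-labels.txt, one line per image.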

Class labels

Rock 0
Scissors 1
Paper 2

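Checking the data

After getting the data, run the script below from the project directory to check that the data exists under the dataset directory: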

bash check-data.sh

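Network model

The network is a VGG model, based on the Baidu PaddlePaddle tutorial and online sources.
VGG network model reference: Baidu PaddlePaddle tutorial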

import paddle
import paddle.nn as nn
import paddle.nn.functional as F


# VGG network model
class VGG(nn.Layer):
    """
    VGG network model

    Input image size is 224 x 224
    """

    def __init__(self, num_classes=10, fc1_in_features=25088):
        """
        VGG network model

        Args:
            num_classes (int, optional): number of classes, default 10
            fc1_in_features (int, optional): number of input features of the first fully connected layer, default 25088,
                computed from the max_pool5 output: 512*7*7 = 25088

        Raises:
            Exception: num_classes must be greater than or equal to 2
        """
        super(VGG, self).__init__()
        if num_classes < 2:
            raise Exception(
                "num_classes must be greater than or equal to 2: {}".format(num_classes))
        self.num_classes = num_classes
        self.fc1_in_features = fc1_in_features

        # Block 1
        self.conv1_1 = nn.Conv2D(
            in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv1_2 = nn.Conv2D(
            in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.max_pool1 = nn.MaxPool2D(kernel_size=2, stride=2)

        # Block 2
        self.conv2_1 = nn.Conv2D(
            in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1)
        self.conv2_2 = nn.Conv2D(
            in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1)
        self.max_pool2 = nn.MaxPool2D(kernel_size=2, stride=2)

        # Block 3
        self.conv3_1 = nn.Conv2D(
            in_channels=128, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.conv3_2 = nn.Conv2D(
            in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.conv3_3 = nn.Conv2D(
            in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.max_pool3 = nn.MaxPool2D(kernel_size=2, stride=2)

        # Block 4
        self.conv4_1 = nn.Conv2D(
            in_channels=256, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv4_2 = nn.Conv2D(
            in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv4_3 = nn.Conv2D(
            in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.max_pool4 = nn.MaxPool2D(kernel_size=2, stride=2)

        # Block 5
        self.conv5_1 = nn.Conv2D(
            in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv5_2 = nn.Conv2D(
            in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv5_3 = nn.Conv2D(
            in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.max_pool5 = nn.MaxPool2D(kernel_size=2, stride=2)

        # Fully connected layers; in_features 25088 = max_pool5 output 512*7*7
        self.fc1 = nn.Linear(in_features=fc1_in_features, out_features=4096)
        self.drop_ratio1 = 0.5
        self.drop1 = nn.Dropout(self.drop_ratio1)
        self.fc2 = nn.Linear(in_features=4096, out_features=4096)
        self.drop_ratio2 = 0.5
        self.drop2 = nn.Dropout(self.drop_ratio2)
        self.fc3 = nn.Linear(in_features=4096, out_features=num_classes)

    def forward(self, x):
        # Block 1
        x = self.conv1_1(x)
        x = F.relu(x)
        x = self.conv1_2(x)
        x = F.relu(x)
        x = self.max_pool1(x)

        # Block 2
        x = self.conv2_1(x)
        x = F.relu(x)
        x = self.conv2_2(x)
        x = F.relu(x)
        x = self.max_pool2(x)

        # Block 3
        x = self.conv3_1(x)
        x = F.relu(x)
        x = self.conv3_2(x)
        x = F.relu(x)
        x = self.conv3_3(x)
        x = F.relu(x)
        x = self.max_pool3(x)

        # Block 4
        x = self.conv4_1(x)
        x = F.relu(x)
        x = self.conv4_2(x)
        x = F.relu(x)
        x = self.conv4_3(x)
        x = F.relu(x)
        x = self.max_pool4(x)

        # Block 5
        x = self.conv5_1(x)
        x = F.relu(x)
        x = self.conv5_2(x)
        x = F.relu(x)
        x = self.conv5_3(x)
        x = F.relu(x)
        x = self.max_pool5(x)

        # Fully connected layers
        # flatten collapses the consecutive dimensions from start_axis to stop_axis
        x = paddle.flatten(x, start_axis=1, stop_axis=-1)
        x = self.fc1(x)
        x = F.relu(x)
        # Apply dropout after the fully connected layer to reduce overfitting
        x = self.drop1(x)
        x = self.fc2(x)
        x = F.relu(x)
        # Apply dropout after the fully connected layer to reduce overfitting
        x = self.drop2(x)
        x = self.fc3(x)

        return x
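As a quick check of the fc1_in_features default, the spatial size after each block can be worked out without Paddle. The sketch below assumes the usual output-size formula for convolution and pooling (no dilation, floor rounding), which matches the kernel, stride, and padding values used in the model above. The helper names are hypothetical and are not part of mod/VGG.py.

```python
def conv2d_out(size, kernel_size=3, stride=1, padding=1):
    """Output width/height of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool2d_out(size, kernel_size=2, stride=2):
    """Output width/height of an unpadded max pool along one spatial axis."""
    return (size - kernel_size) // stride + 1


def vgg_fc1_in_features(image_size=224,
                        convs_per_block=(2, 2, 3, 3, 3),
                        last_channels=512):
    """Flattened feature count after the fifth pooling layer."""
    size = image_size
    for n_convs in convs_per_block:
        for _ in range(n_convs):
            size = conv2d_out(size)  # 3x3, stride 1, padding 1 keeps the size
        size = pool2d_out(size)      # 2x2, stride 2 halves the size
    return last_channels * size * size


print(vgg_fc1_in_features(224))  # 512 * 7 * 7 = 25088
print(vgg_fc1_in_features(96))   # 512 * 3 * 3 = 4608
```

For a 224 x 224 input the size goes 224, 112, 56, 28, 14, 7, which gives the default of 25088. For any other input size, the fc1_in_features argument has to be changed to match.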

Dataset parsing

Dataset parsing reads the text file of image paths and labels, then loads each image and its label from the image path.

import paddle
import os
import random
import numpy as np
from PIL import Image
import paddle.vision as ppvs


class ImageClass(paddle.io.Dataset):
    """
    ImageClass: image classification dataset parser, a subclass of paddle.io.Dataset
    """

    def __init__(self,
                 dataset_path: str,
                 images_labels_txt_path: str,
                 transform=None,
                 shuffle=True
                 ):
        """
        Constructor; defines the dataset

        Args:
            dataset_path (str): dataset path
            images_labels_txt_path (str): path of the image and label text file
            transform (Compose, optional): composed data transforms, default None
            shuffle (bool, optional): shuffle the data randomly, default True
        """

        super(ImageClass, self).__init__()
        self.dataset_path = dataset_path
        self.images_labels_txt_path = images_labels_txt_path
        self._check_path(dataset_path, "Invalid dataset path")
        self._check_path(images_labels_txt_path, "Invalid image and label text file path")
        self.transform = transform
        self.image_paths, self.labels = self.parse_dataset(
            dataset_path, images_labels_txt_path, shuffle)
    def __getitem__(self, idx):
        """
        Get a single image and its label

        Args:
            idx (Any): index

        Returns:
            image (float32): image
            label (int): label
        """
        image_path, label = self.image_paths[idx], self.labels[idx]
        return self.get_item(image_path, label, self.transform)

    @staticmethod
    def get_item(image_path: str, label: int, transform=None):
        """
        Get a single image and its label

        Args:
            image_path (str): image path
            label (int): label
            transform (Compose, optional): composed data transforms, default None

        Returns:
            image (float32): image
            label (int): label
        """
        ppvs.set_image_backend("pil")
        image = Image.open(image_path)
        if transform is not None:
            image = transform(image)
        # Convert the image from HWC to CHW
        image = np.transpose(image, (2, 0, 1))
        return image.astype("float32"), label

    def __len__(self):
        """
        Number of samples

        Returns:
            int: number of samples
        """
        return len(self.labels)

    def _check_path(self, path: str, msg: str):
        """
        Check that a path exists

        Args:
            path (str): path
            msg (str): exception message

        Raises:
            Exception: raised when the path does not exist
        """
        if not os.path.exists(path):
            raise Exception("{}: {}".format(msg, path))

    @staticmethod
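    def parse_dataset(dataset_path: str, images_labels_txt_path: str, shuffle: bool):
        """
        Parse the dataset

        Args:
            dataset_path (str): dataset path
            images_labels_txt_path (str): path of the image and label text file
            shuffle (bool): shuffle the data randomly

        Returns:
            image_paths: list of image paths
            labels: list of class labels
        """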
        lines = []
        image_paths = []
        labels = []
        with open(images_labels_txt_path, "r") as f:
            lines = f.readlines()
        # Shuffle the data randomly
        if (shuffle):
            random.shuffle(lines)
        for i in lines:
            data = i.split(" ")
            image_paths.append(os.path.join(dataset_path, data[0]))
            labels.append(int(data[1]))
        return image_paths, labels
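The parsing step in parse_dataset relies on each line holding a relative image path and an integer label separated by one space. The standalone sketch below illustrates that format without Paddle. It is a slightly more defensive variant rather than the project's code: parse_lines is a hypothetical name, the sample file names are made up, and stripping whitespace, skipping blank lines, and splitting on the last space are additions not present in the original.

```python
import os


def parse_lines(lines, dataset_path):
    """Turn "relative/path label" lines into (image_paths, labels)."""
    image_paths, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space so the label is always the final field.
        rel_path, label = line.rsplit(" ", 1)
        image_paths.append(os.path.join(dataset_path, rel_path))
        labels.append(int(label))
    return image_paths, labels


# Made-up sample lines in the same format as train-images-labels.txt.
sample = [
    "rps-cv-images/rock/a.png 0\n",
    "rps-cv-images/scissors/b.png 1\n",
    "\n",
    "rps-cv-images/paper/c.png 2\n",
]
paths, labels = parse_lines(sample, "./dataset")
for path, label in zip(paths, labels):
    print(path, label)
```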

Configuration module

See and edit mod/config.py; it contains detailed comments.

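Start training

Run train.py. Add -h to see the command-line arguments.

python3 train.py

  --cpu             use the CPU for computation; CUDA is used by default
  --learning-rate   learning rate, default 0.001
  --epochs          number of training epochs, default 2
  --batch-size      batch size, default 2
  --num-workers     number of worker threads, default 2
  --no-save         do not save the model parameters; they are saved by default
  --load-dir        load model parameters from a subfolder of the params directory; nothing is loaded by default
  --log             write VisualDL logs; off by default
  --summary         print network model information; off by default. When set, only the information is printed and training does not start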
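For readers who want to see how the options listed above fit together, here is a minimal argparse sketch. It only mirrors the flag names and defaults from the list; it is not the actual parser in train.py, and the build_parser name, the help strings, and the empty-string default for --load-dir are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the train.py options listed above (sketch)."""
    parser = argparse.ArgumentParser(description="training options (sketch)")
    parser.add_argument("--cpu", action="store_true",
                        help="use the CPU instead of CUDA")
    parser.add_argument("--learning-rate", type=float, default=0.001)
    parser.add_argument("--epochs", type=int, default=2)
    parser.add_argument("--batch-size", type=int, default=2)
    parser.add_argument("--num-workers", type=int, default=2)
    parser.add_argument("--no-save", action="store_true",
                        help="do not save the model parameters")
    # Assumption: an empty string stands for "do not load anything".
    parser.add_argument("--load-dir", type=str, default="")
    parser.add_argument("--log", action="store_true",
                        help="write VisualDL logs")
    parser.add_argument("--summary", action="store_true",
                        help="print the model summary and exit")
    return parser


# argparse turns "--batch-size" into the attribute name "batch_size".
defaults = build_parser().parse_args([])
args = build_parser().parse_args(["--epochs", "10", "--batch-size", "16", "--log"])
print(args.epochs, args.batch_size, args.log, defaults.learning_rate)
```

With the real script, the equivalent invocation would be python3 train.py --epochs 10 --batch-size 16 --log.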

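Testing the model

Run test.py. Add -h to see the command-line arguments.

python3 test.py

  --cpu           use the CPU for computation; CUDA is used by default
  --batch-size    batch size, default 2
  --num-workers   number of worker threads, default 2
  --load-dir      load model parameters from a subfolder of the params directory; defaults to the best directory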

Testing the model with the GTK interface

Run test-gtk.py. This program depends on the GTK library and can only run on a local computer.

python3 test-gtk.py

Installing the GTK library

python3 -m pip install pygobject

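User guide

1. Click the Select Model button.
2. In the file dialog that opens, choose a model: the model.pdparams file in a subdirectory of the params directory.
3. Click the Random Test button to see the test image, the predicted result, and the actual result.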

Viewing the results report

Run report.py to display the report.json of every subdirectory under params.
The model parameters with the lowest loss are then saved to the best subdirectory.

python3 report.py

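report.json description

Key	Description
id	String ID generated from the time
loss	loss value of this training run
acc	acc value of this training run
epochs	epochs value of this training run
batch_size	batch_size value of this training run
learning_rate	learning_rate value of this training run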
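The selection step described for report.py, picking the run with the lowest loss among the report.json files under params, can be sketched as follows. This is an illustration under stated assumptions, not the project's report.py: load_reports and pick_best are hypothetical names, the best subdirectory is skipped on the assumption that it only holds a copy, and the numbers in the demo are made up.

```python
import json
import os
import tempfile


def load_reports(params_dir):
    """Collect report.json from each subdirectory of params_dir."""
    reports = []
    for name in sorted(os.listdir(params_dir)):
        path = os.path.join(params_dir, name, "report.json")
        # Assumption: "best" holds a copy, so it is left out of the comparison.
        if name == "best" or not os.path.isfile(path):
            continue
        with open(path, "r") as f:
            report = json.load(f)
        report["dir"] = name  # remember which subdirectory it came from
        reports.append(report)
    return reports


def pick_best(reports):
    """Return the report with the smallest loss."""
    return min(reports, key=lambda r: r["loss"])


# Demo with made-up numbers, written to a temporary directory.
with tempfile.TemporaryDirectory() as params_dir:
    for run_id, loss, acc in [("run-a", 0.42, 0.85), ("run-b", 0.17, 0.94)]:
        os.makedirs(os.path.join(params_dir, run_id))
        with open(os.path.join(params_dir, run_id, "report.json"), "w") as f:
            json.dump({"id": run_id, "loss": loss, "acc": acc, "epochs": 2,
                       "batch_size": 2, "learning_rate": 0.001}, f)
    best = pick_best(load_reports(params_dir))

print(best["dir"], best["loss"])
```

A full version would then copy the files of the chosen subdirectory into params/best.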

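VisualDL visualization tool

For installation and usage, see: VisualDL
Add the --log argument when training.
If you trained in AI Studio, download the log directory, unzip it, and put it in the log directory of the local project.
Run the command below from the project directory,
then open the URL it prints in a browser.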

visualdl --logdir ./log
