Quick Start: Deploying a Deep Learning Model with C++

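Starting from a trained shufflenet_v2 model, this article explains how to use the C++ interface of MegEngine Lite to deploy the model and run it on CPU (Linux x86 / Android Arm). It is organized into the following sections:

Exporting the trained model
Writing the inference code
Building MegEngine Lite
Compiling the inference code
Running the inference executable and verifying the result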

See also:

MegEngine Lite can also be used through its Python interface, which is convenient but has some limitations.

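Exporting the trained model

Refer to "Obtaining a model for MegEngine Lite inference".

Writing the inference code

First create a main.cpp. In this file we call the MegEngine Lite interface directly to run the shufflenet_v2.mge model. The input data, input_tensor, is randomly generated, so the computed values themselves are not meaningful.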

#include <cstdlib>
#include <ctime>
#include <iostream>
#include <memory>
#include <string>
#include "lite/network.h"
using namespace lite;

int main(int argc, char** argv) {
    //! print the usage only when the arguments are wrong, and exit with an error code
    if (argc != 2) {
        std::cerr << "Usage: ./demo_deploy model_path" << std::endl;
        return 1;
    }

    std::string model_path = argv[1];

    //! create and load the network
    std::shared_ptr<lite::Network> network = std::make_shared<Network>();

    //! load the model
    network->load_model(model_path);

    //! get the input tensor of the network with name "data"
    std::shared_ptr<Tensor> input_tensor = network->get_io_tensor("data");

    //! fill the rand data to input tensor
    srand(static_cast<unsigned>(time(NULL)));
    size_t length =
            input_tensor->get_tensor_total_size_in_byte() / sizeof(float);
    float* in_data_ptr = static_cast<float*>(input_tensor->get_memory_ptr());
    for (size_t i = 0; i < length; i++) {
        in_data_ptr[i] =
                static_cast<float>(rand()) / (static_cast<float>(RAND_MAX));
    }

    //! forward
    network->forward();
    network->wait();

    //! get the inference output tensor of index 0
    std::shared_ptr<Tensor> output_tensor = network->get_output_tensor(0);
    float* predict_ptr = static_cast<float*>(output_tensor->get_memory_ptr());
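    //! derive the element count from the tensor instead of hard-coding 1000
    size_t out_length =
            output_tensor->get_tensor_total_size_in_byte() / sizeof(float);
    float sum = 0.0f, max = predict_ptr[0];
    for (size_t i = 0; i < out_length; i++) {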
        sum += predict_ptr[i];
        if (predict_ptr[i] > max) {
            max = predict_ptr[i];
        }
    }
    std::cout << "The output SUM is " << sum << ", Max is " << max << std::endl;
}

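The code above carries out the following steps:

Create a Network with the default configuration;
Load the model: MegEngine Lite reads and parses the model file and builds the computation graph;
Get the model's input Tensor by its name and fill it with random numbers as input data;
Run inference;
Get the model's output Tensor and process the output data.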
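The input-filling step can also be written with the C++ `<random>` facilities instead of `rand()`. The following is only a minimal sketch: `fill_uniform` and `make_fake_input` are hypothetical helpers that are not part of MegEngine Lite, and the 1 x 3 x 224 x 224 buffer is an assumed stand-in for the memory behind the input tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not a MegEngine Lite API): fills `length` floats
// starting at `data` with uniform random values between 0 and 1.
void fill_uniform(float* data, std::size_t length, unsigned seed) {
    assert(data != nullptr || length == 0);
    std::mt19937 engine(seed);  // fixed seed gives reproducible input
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    for (std::size_t i = 0; i < length; ++i) {
        data[i] = dist(engine);
    }
}

// Stand-in for the input tensor memory. The 1 x 3 x 224 x 224 shape is an
// assumption made for illustration only.
std::vector<float> make_fake_input(unsigned seed) {
    std::vector<float> buffer(1 * 3 * 224 * 224);
    fill_uniform(buffer.data(), buffer.size(), seed);
    return buffer;
}
```

In main.cpp the same helper could be pointed at the pointer returned by get_memory_ptr(), with the length computed from get_tensor_total_size_in_byte() as in the original loop. A fixed seed makes runs repeatable, which helps when comparing outputs across platforms.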

This completes the C++ code for running inference with a shufflenet_v2 model.
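The demo reduces the output to a sum and a maximum. For a classifier, a common next step is to pick the highest-scoring class indices. The sketch below shows one way to do that on a plain float buffer; `top_k_indices` is a hypothetical helper, not a MegEngine Lite function, and it assumes the output is a flat array of per-class scores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the indices of the k largest scores,
// ordered from highest to lowest score.
std::vector<std::size_t> top_k_indices(const float* scores, std::size_t length,
                                       std::size_t k) {
    assert(scores != nullptr || length == 0);
    k = std::min(k, length);
    std::vector<std::size_t> order(length);
    std::iota(order.begin(), order.end(), std::size_t{0});
    // Only the first k positions need to be sorted.
    std::partial_sort(
            order.begin(),
            order.begin() + static_cast<std::ptrdiff_t>(k), order.end(),
            [scores](std::size_t a, std::size_t b) {
                return scores[a] > scores[b];
            });
    order.resize(k);
    return order;
}
```

With the demo's variables, a call such as top_k_indices(predict_ptr, 1000, 5) would give the five most likely class indices, assuming the model has 1000 output classes as the loop in main.cpp implies.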

Before main.cpp can actually run, it still has to be compiled and linked against the MegEngine Lite library files.

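Building MegEngine Lite

Note

The purpose of this step is to obtain the MegEngine Lite link libraries and related components that the code above links against at compile time. The build process is the same as the one described in "Building MegEngine from source".
The following demonstrates dynamic linking on Linux x86 and static linking on Android Arm.

First clone the whole MegEngine project and enter its root directory: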

git clone --depth=1 git@github.com:MegEngine/MegEngine.git
cd MegEngine

Prepare the environment and run the build:

Linux x86

Prepare the submodules the build depends on:

./third_party/prepare.sh

Install the Intel Math Kernel Library (MKL):

./third_party/install-mkl.sh

Build MegEngine Lite natively on the host:

scripts/cmake-build/host_build.sh

Android Arm

Prepare the submodules the build depends on:

./third_party/prepare.sh

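Download the NDK from the official Android website, extract it to a path of your choice, and set the NDK_ROOT environment variable to that path: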

export NDK_ROOT=/path/to/ndk

Cross-compile MegEngine Lite:

scripts/cmake-build/cross_build_android_arm_inference.sh

After the build finishes, the MegEngine Lite libraries and headers are located at the following path (referred to below as /path/to/megenginelite-lib):

Linux x86: build_dir/host/MGE_WITH_CUDA_OFF/MGE_INFERENCE_ONLY_ON/Release/install/lite/
Android Arm: build_dir/android/arm64-v8a/Release/install/lite/
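Before compiling against the install directory, it can help to confirm that it has the layout the compile commands in the next section rely on: an include/ directory and a lib/<arch>/ directory (x86_64 or aarch64). The following C++17 sketch does that check; `lite_install_looks_complete` is a hypothetical helper, and the layout is inferred only from those commands, not from an official specification.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical pre-flight check (not part of MegEngine Lite): returns true
// when install_dir contains include/ and lib/<arch>/ directories.
bool lite_install_looks_complete(const fs::path& install_dir,
                                 const std::string& arch) {
    assert(!arch.empty());
    std::error_code ec;  // use the non-throwing overloads
    const bool has_headers = fs::is_directory(install_dir / "include", ec);
    const bool has_libs = fs::is_directory(install_dir / "lib" / arch, ec);
    return has_headers && has_libs;
}
```

For example, lite_install_looks_complete("/path/to/megenginelite-lib", "x86_64") should return true after a successful host build if the layout assumption holds. A false result usually points at a wrong path or an incomplete build.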

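Compiling the inference code

With the MegEngine Lite library files from the previous step, the inference code can be linked either dynamically or statically. The following uses Linux x86 and Android Arm to show the two linking methods and the steps for compiling the inference code:

Linux x86: compiling with dynamic linking

Choose a compiler that suits your environment (clang++ is used here; g++ also works) and link dynamically against liblite_shared.so:

export LITE_INSTALL_DIR=/path/to/megenginelite-lib # install path of the libraries built in the previous step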
export LD_LIBRARY_PATH=$LITE_INSTALL_DIR/lib/x86_64/:$LD_LIBRARY_PATH

clang++ -o demo_deploy \
  -I$LITE_INSTALL_DIR/include main.cpp \
  -llite_shared -L$LITE_INSTALL_DIR/lib/x86_64

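After compilation you get the executable demo_deploy.

Android Arm: compiling with static linking

Building for Android Arm is a cross-compilation: the executable that runs on Android Arm is built on a Linux host.

This example links against the MegEngine Lite static library. Make sure the NDK environment is set up first.

export LITE_INSTALL_DIR=/path/to/megenginelite-lib # install path of the libraries built in the previous step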
export PATH=${NDK_ROOT}/toolchains/llvm/prebuilt/linux-x86_64/bin/:$PATH
export CXX=aarch64-linux-android21-clang++

${CXX} -llog -lz -s \
  -I${LITE_INSTALL_DIR}/include main.cpp \
  ${LITE_INSTALL_DIR}/lib/aarch64/liblite_static_all_in_one.a \
  -o demo_deploy

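After compilation, copy demo_deploy and the model file shufflenet_v2.mge to the Android Arm device.

Running the inference executable and verifying the result

Finally, run the compiled executable to see the inference result: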

./demo_deploy shufflenet_v2.mge

This completes a simple demo deployment on x86 and Arm.

In this example, the final result shows that the output sums to 1 after softmax, which matches what a softmax output should look like.
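The sum-to-one property can be checked in code as well as by eye. The sketch below implements a numerically stable softmax and a tolerance-based check on plain float buffers. Both functions are hypothetical helpers written for illustration, and the 1e-3 tolerance is an assumed value chosen to absorb float rounding.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax: subtracting the maximum before exponentiating
// avoids overflow without changing the result.
std::vector<float> softmax(const std::vector<float>& logits) {
    assert(!logits.empty());
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float total = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        total += probs[i];
    }
    for (float& p : probs) {
        p /= total;
    }
    return probs;
}

// Returns true when all values are non-negative and sum to 1 within
// `tolerance` (an assumed threshold, not taken from the article).
bool looks_like_softmax(const float* values, std::size_t length,
                        float tolerance = 1e-3f) {
    double sum = 0.0;
    for (std::size_t i = 0; i < length; ++i) {
        if (values[i] < 0.0f) {
            return false;
        }
        sum += values[i];
    }
    return std::fabs(sum - 1.0) <= tolerance;
}
```

In the demo, passing predict_ptr and the output length to looks_like_softmax would turn the visual check of the printed sum into an explicit pass or fail.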

Appendix

GitHub: MegEngine 曠視天元 (stars welcome)

Gitee: MegEngine/MegEngine

MegEngine website: MegEngine - Deep Learning, Simple Development

Join the MegEngine technical discussion QQ group: 1029741705
