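Some say artificial intelligence will be the next industrial revolution after the internet. There is no denying that AI is already all around us, in everything from cars to watches, and almost every app tries to attach itself to AI in some way to prove it is keeping up with the times.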

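The spread of AI brings client-side development both a challenge and an opportunity. The challenge is that client-side technology is gradually being marginalized relative to newer fields such as AI. The opportunity is that mobile devices will remain, for a long time, an important carrier connecting AI with users. As hardware keeps improving, mobile devices can even perform some machine learning and inference on their own, so users get the benefits of AI more quickly.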

The Significance of Edge Computing (Edge AI)

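A "model" is the result of applying a machine learning algorithm to a set of training data, and the process of using a model to make predictions on input data is called "inference". Many tasks that are inefficient or very hard to accomplish with hand-written code can be done better with model inference. For example, a model can be trained to categorize photos or to recognize specific objects within a photo.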
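To make the two terms concrete, here is a minimal sketch in plain Python. The toy data and the function names train and infer are assumptions made up for illustration; real models are far larger, but the split between "learn parameters once" and "apply them to new input" is the same.

```python
# Hypothetical toy data: (input, label) pairs that follow y = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def train(data):
    """Fit y = w * x + b by ordinary least squares and return the model (w, b)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var_x = sum((x - mean_x) ** 2 for x, _ in data)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def infer(model, x):
    """Inference: apply the learned parameters to a new input."""
    w, b = model
    return w * x + b


model = train(training_data)     # the "model" is just the learned parameters
prediction = infer(model, 4.0)   # predict for an input not seen during training
print(model, prediction)
```

Only the pair (w, b) needs to be shipped to a device in order to run infer there, which is the idea behind delivering models to the client.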

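For a long time, model inference mostly ran on the server, with the client serving only to display results. As mobile hardware has grown more powerful, many deep learning models can now be downloaded to the phone in binary form and used for on-device AI processing. This is where the concept of "edge computing" comes from. Edge computing has the following benefits: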

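Data stays local, which addresses cloud storage and privacy concerns;
Computation stays local, which relieves overload on cloud computing;
Low communication cost, which improves interaction and user experience;
Decentralized computation, which helps avoid failures and allows deep personalization.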

The Current State of Edge Computing Technology

Strictly speaking, edge AI is one direction within edge computing. Edge AI capabilities are mainly implemented with machine learning (ML) libraries for mobile devices.

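Mobile ML libraries mainly handle model delivery and running the inference engine on the device. Because mobile devices have relatively limited computing power, model files should not be too large and need some trimming and compression. The mainstream ML libraries today include TensorFlow Lite, PyTorch Mobile, MediaPipe, and Firebase ML Kit. This article gives a brief introduction to each of them to help broaden your technical horizons.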
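As a rough illustration of why model size matters on a phone, the sketch below estimates the size of the weights alone from the parameter count and the number of bytes per parameter. The helper model_size_mb is hypothetical, and the figure of about 11.7 million parameters for ResNet-18 is used here only as an example input; real model files also carry metadata and may be compressed further.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def model_size_mb(num_params, dtype):
    """Approximate size of the raw weights in megabytes (10^6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


# Example input (assumption): roughly the parameter count of ResNet-18.
RESNET18_PARAMS = 11_689_512

sizes = {dtype: model_size_mb(RESNET18_PARAMS, dtype) for dtype in BYTES_PER_PARAM}
for dtype, size in sizes.items():
    print(f"{dtype}: {size:.1f} MB")
```

Going from 32-bit to 8-bit parameters cuts the weight storage to a quarter, which is why compression comes up again at the end of this article.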

TensorFlow Lite

https://www.tensorflow.org/lite/guide

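TensorFlow Lite is a lightweight solution for running TensorFlow on mobile and embedded devices. It can be used on Android, iOS, and other embedded systems. The TensorFlow Lite Converter converts a model into the compressed .tflite format. The TFLite Converter is available as a Python API and as a CLI tool; the Python API is recommended.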

The code below uses the TFLite Converter to convert TensorFlow and tf.keras model files into TFLite models:

import tensorflow as tf

# Convert a SavedModel (export_dir is the path to the SavedModel directory)
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

# Convert a Keras model (filepath points to the saved Keras model file)
keras_model = tf.keras.models.load_model(filepath)
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
keras_tflite_model = converter.convert()
with open('keras_model.tflite', 'wb') as f:
    f.write(keras_tflite_model)

The converted model file can be loaded and used on a phone through TFLite. Taking Android as an example, the code is as follows:

class TFLiteActivity : AppCompatActivity() {
    /* Load the model */
    private fun initializeTFLite(device: String = "NNAPI", numThreads: Int = 4) {
        // Pick a hardware delegate; null means plain CPU execution.
        val delegate: Delegate? = when (device) {
            "NNAPI" -> NnApiDelegate()
            "GPU" -> GpuDelegate()
            else -> null
        }
        delegate?.let { tfliteOptions.addDelegate(it) }

        tfliteOptions.setNumThreads(numThreads)
        tfliteModel = FileUtil.loadMappedFile(this, tflite_model_path)
        tfliteInterpreter = Interpreter(tfliteModel, tfliteOptions)
        inputImageBuffer = TensorImage(tfliteInterpreter.getInputTensor(0).dataType())
        outputProbabilityBuffer = TensorBuffer.createFixedSize(
            tfliteInterpreter.getOutputTensor(0).shape(),
            tfliteInterpreter.getOutputTensor(0).dataType())

        probabilityProcessor = TensorProcessor
            .Builder()
            .add(NormalizeOp(0.0f, 1.0f))
            .build()
    }

    /* Run inference */
    @WorkerThread
    override fun analyzeImage(image: ImageProxy, rotationDegrees: Int): Map<String, Float> {
        val bitmap = Utils.imageToBitmap(image)
        val cropSize = Math.min(bitmap.width, bitmap.height)
        inputImageBuffer.load(bitmap)
        val inputImage = ImageProcessor
            .Builder()
            .add(ResizeWithCropOrPadOp(cropSize, cropSize))
            .add(ResizeOp(224, 224, ResizeMethod.NEAREST_NEIGHBOR))
            .add(NormalizeOp(127.5f, 127.5f))
            .build()
            .process(inputImageBuffer)

        tfliteInterpreter.run(inputImage!!.buffer, outputProbabilityBuffer.buffer.rewind())
        val labeledProbability: Map<String, Float> = TensorLabel(
            labelsList, probabilityProcessor.process(outputProbabilityBuffer)
        ).mapWithFloatValue
        return labeledProbability
    }
}

The TFLite API is very easy to use, manageable even without much client-side development experience.
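The NormalizeOp(127.5f, 127.5f) step in the Android code above computes (value - mean) / stddev for each pixel, which maps the usual 0 to 255 pixel range onto -1 to 1. A minimal plain-Python sketch of the same arithmetic follows; the function name normalize is an assumption for illustration and is not part of the TFLite API.

```python
def normalize(pixels, mean=127.5, std=127.5):
    """Apply (p - mean) / std to every pixel value."""
    return [(p - mean) / std for p in pixels]


# The darkest, middle, and brightest 8-bit pixel values.
normalized = normalize([0, 127.5, 255])
print(normalized)  # [-1.0, 0.0, 1.0]
```

The mean and standard deviation have to match whatever preprocessing the model was trained with, so treat 127.5 as the value used in this article's example rather than a universal constant.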

PyTorch Mobile

https://pytorch.org/mobile/home/

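Facebook released PyTorch Mobile in late 2019 as PyTorch's solution for mobile devices. PyTorch Mobile runs models that have been JIT-compiled to TorchScript and saved as .pt files. In 2020, support for Android's NNAPI and iOS's Metal API was announced at PyTorch Developer Day.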

In PyTorch, torch.jit.trace handles the model conversion:

import torch
import torchvision

model = torchvision.models.resnet18(pretrained=True)
model.eval()
example = torch.rand(1, 3, 224, 224)
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("model.pt")

The converted model can be loaded on Android and iOS. Taking Android as an example, the code is as follows:

class PyTorchActivity : AppCompatActivity() {
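    // Held as properties so that analyzeImage() can reuse them.
    private lateinit var pytorchModule: Module
    private lateinit var mInputTensorBuffer: FloatBuffer
    private lateinit var mInputTensor: Tensor

    /* Load the model */
    private fun initializePyTorch() {
        pytorchModule = Module.load(Utils.assetFilePath(
            this,
            pytorch_mobile_model_path))
        mInputTensorBuffer = Tensor.allocateFloatBuffer(3 * 224 * 224)
        mInputTensor = Tensor.fromBlob(
            mInputTensorBuffer,
            longArrayOf(1, 3, 224L, 224L)
        )
    }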

    /* Run inference */
    @WorkerThread
    override fun analyzeImage(image: ImageProxy, rotationDegrees: Int): Map<String, Float> {
        TensorImageUtils.imageYUV420CenterCropToFloatBuffer(
            image.image,
            rotationDegrees,
            224,
            224,
            TensorImageUtils.TORCHVISION_NORM_MEAN_RGB,
            TensorImageUtils.TORCHVISION_NORM_STD_RGB,
            mInputTensorBuffer,
            0
        )
        val outputModule = pytorchModule.forward(IValue.from(mInputTensor)).toTensor()
        val scores = outputModule.dataAsFloatArray
        val labeledProbability: MutableMap<String, Float> = mutableMapOf()
        for (i in 0 until labelsList.size - 1) {
            labeledProbability[labelsList[i + 1]] = scores[i]
        }
        return labeledProbability
    }
}
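The Android code above returns the raw output values of the model keyed by label. Assuming the model emits unnormalized scores (logits), as torchvision's ResNet-18 does, a common post-processing step that the original code does not show is to turn them into probabilities with softmax and keep only the top few. The sketch below does this in plain Python; the labels and scores are made-up illustration data.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(labels, scores, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical labels and model outputs, for illustration only.
labels = ["cat", "dog", "rabbit", "bird"]
scores = [2.0, 1.0, 0.1, -1.0]

best = top_k(labels, scores, k=2)
for label, prob in best:
    print(f"{label}: {prob:.3f}")
```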

MediaPipe

https://google.github.io/mediapipe/

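Unlike TensorFlow Lite and PyTorch Mobile, MediaPipe is not derived from an existing deep learning library. MediaPipe is an ML pipeline framework focused on computer vision and multimedia processing. It was officially open-sourced at the CVPR conference in June 2019, as version v0.5.0. Since then, Google has released a series of ML pipeline examples. MediaPipe provides capabilities such as face detection, object detection, and motion capture for Android, iOS, and other platforms.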

A MediaPipe graph can be built with Bazel into an .aar for use on Android or an .ipa for iOS:

# MediaPipe graph that performs face detection with TensorFlow Lite on GPU.

# GPU buffer. (GpuBuffer)
input_stream: "input_video"

# Output image with rendered results. (GpuBuffer)
output_stream: "output_video"
# Detected faces. (std::vector<Detection>)
output_stream: "face_detections"

# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered, and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped, limiting the number of in-flight images in most part of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively, which leads to increased latency and memory usage, unwanted in
# real-time mobile applications. It also eliminates unnecessarily computation,
# e.g., the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing previous inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:output_video"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}

# Subgraph that detects faces.
node {
  calculator: "FaceDetectionFrontGpu"
  input_stream: "IMAGE:throttled_input_video"
  output_stream: "DETECTIONS:face_detections"
}

# Converts the detections to drawing primitives for annotation overlay.
node {
  calculator: "DetectionsToRenderDataCalculator"
  input_stream: "DETECTIONS:face_detections"
  output_stream: "RENDER_DATA:render_data"
  node_options: {
    [type.googleapis.com/mediapipe.DetectionsToRenderDataCalculatorOptions] {
      thickness: 4.0
      color { r: 255 g: 0 b: 0 }
    }
  }
}

# Draws annotations and overlays them on top of the input images.
node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_stream: "render_data"
  output_stream: "IMAGE_GPU:output_video"
}

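The graph above is a face detection pipeline that goes through image input, face detection, and drawing of the detected features. This also illustrates what is distinctive about MediaPipe: it is not just ML inference, but a way to compose and orchestrate a whole series of steps including input, preprocessing, inference, postprocessing, and output.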
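The idea of chaining stages can be sketched in a few lines of plain Python. This is not the MediaPipe API: make_pipeline and the three stand-in stages are hypothetical, and the "inference" stage is a trivial average rather than a real model. It only shows how the output of each node becomes the input of the next, as in the graph above.

```python
from typing import Callable, List

Stage = Callable[[object], object]


def make_pipeline(stages: List[Stage]) -> Stage:
    """Compose stages so that each one receives the previous stage's output."""
    def run(packet):
        for stage in stages:
            packet = stage(packet)
        return packet
    return run


# Hypothetical stand-in stages, for illustration only.
def preprocess(frame):
    """Scale 8-bit pixel values into the range 0..1."""
    return [p / 255.0 for p in frame]


def fake_inference(pixels):
    """Stand-in for a model: the mean brightness plays the role of a score."""
    return sum(pixels) / len(pixels)


def postprocess(score):
    """Turn the raw score into a result the UI can consume."""
    return {"face_detected": score > 0.5, "score": score}


pipeline = make_pipeline([preprocess, fake_inference, postprocess])
result = pipeline([255, 255, 0, 255])
print(result)
```

A real MediaPipe graph adds what this sketch leaves out, such as named streams, flow control like the FlowLimiterCalculator, and GPU buffers.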

Firebase ML Kit

https://firebase.google.com/docs/ml

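Firebase is a mobile development platform backed by Google and built on Google Mobile Services. It provides mobile developers with features such as APM and event tracking. ML Kit is the Firebase machine learning library for mobile, offering model distribution, inference, training, and log collection for Android and iOS. It currently supports only models in TensorFlow Lite format.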

For example, the following code uses ML Kit to recognize objects in an image:

private class ObjectDetection : ImageAnalysis.Analyzer {
    val options = FirebaseVisionObjectDetectorOptions.Builder()
        .setDetectorMode(FirebaseVisionObjectDetectorOptions.STREAM_MODE)
        .enableClassification()
        .build()
    val objectDetector = FirebaseVision.getInstance().getOnDeviceObjectDetector(options)

    private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
        0 -> FirebaseVisionImageMetadata.ROTATION_0
        90 -> FirebaseVisionImageMetadata.ROTATION_90
        180 -> FirebaseVisionImageMetadata.ROTATION_180
        270 -> FirebaseVisionImageMetadata.ROTATION_270
        else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
    }

    override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
        val mediaImage = imageProxy?.image
        val imageRotation = degreesToFirebaseRotation(degrees)
        if (mediaImage != null) {
            val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
            objectDetector.processImage(image)
                    .addOnSuccessListener { detectedObjects ->
                        for (obj in detectedObjects) {
                            val id = obj.trackingId
                            val bounds = obj.boundingBox
                            val category = obj.classificationCategory
                            val confidence = obj.classificationConfidence
                            // Do Something
                        }
                    }
                    .addOnFailureListener { e ->
                        // Do Something
                    }
        }
    }
}

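All of the ML libraries introduced above require shipping an inference engine to the client. In fact, Android and iOS also come with their own built-in inference engines.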

iOS (Core ML)

https://developer.apple.com/cn/documentation/coreml/
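iOS provides Core ML, which lets you integrate a variety of machine learning models into an app and run inference with them. Core ML is not tied to one training framework: models from TensorFlow, ONNX, PyTorch, XGBoost, scikit-learn and others are converted with coremltools into Core ML's own format and then loaded locally. The iPhone ships with a dedicated neural network processor that runs model inference at low power. If you want to do machine learning on iOS, Core ML is a good choice.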

Android (NNAPI)

https://developer.android.com/ndk/guides/neuralnetworks

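Android provides NNAPI (Android Neural Networks API) for model inference. NNAPI is a native library dedicated to machine learning workloads, available from Android 8.1 (API level 27) onward. Based on the phone's current hardware capabilities and load, NNAPI dispatches the work to a suitable device (GPU, DSP, or a dedicated processor); it can of course also fall back to running everything on the CPU.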

Web

Tensorflow.js

https://www.tensorflow.org/js

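Browsers can also do machine learning and inference. The main language for AI computation in the web browser is JavaScript, and the ML library used is TensorFlow.js. Because both training and inference run in the browser, performance is at a disadvantage compared with Android and iOS, but with help from WebGL and WASM, computation can still be done fairly efficiently and meet the needs of the web.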

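tf.keras models and SavedModels can be converted into a JSON-based format that TensorFlow.js can process, which contains the network structure and weights. The conversion is done with tensorflowjs_converter: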

# Convert a SavedModel
tensorflowjs_converter \
    --input_format=tf_saved_model \
    --output_node_names='MobilenetV1/Predictions/Reshape_1' \
    --saved_model_tags=serve \
    /mobilenet/saved_model \
    /mobilenet/web_model

# Convert a Keras model
tensorflowjs_converter \
    --input_format keras \
    path/to/my_model.h5 \
    path/to/tfjs_target_dir

TensorFlow.js can be embedded directly in HTML, as follows:

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.0.1"> </script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/mobilenet@1.0.0"> </script>

<img id="img" src="cat.jpg"></img>

<script>
  const img = document.getElementById('img');
  // Load the model.
  mobilenet.load().then(model => {
    // Classify the image.
    model.classify(img).then(predictions => {
      console.log('Predictions: ');
      console.log(predictions);
    });
  });
</script>

It can of course also be used from JavaScript:

import * as tf from "@tensorflow/tfjs";

import { IMAGENET_CLASSES } from "./imagenet_classes";

const MOBILENET_MODEL_PATH =
  "https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_0.25_224/model.json";

const IMAGE_SIZE = 224;
const TOPK_PREDICTIONS = 10;

let mobilenet;
const mobilenetDemo = async () => {
  status("Loading model...");

  mobilenet = await tf.loadLayersModel(MOBILENET_MODEL_PATH);
  mobilenet.predict(tf.zeros([1, IMAGE_SIZE, IMAGE_SIZE, 3])).dispose();

  status("");

  const catElement = document.getElementById("cat");
  if (catElement.complete && catElement.naturalHeight !== 0) {
    predict(catElement);
    catElement.style.display = "";
  } else {
    catElement.onload = () => {
      predict(catElement);
      catElement.style.display = "";
    };
  }
};

async function predict(imgElement) {
  status("Predicting...");
  const logits = tf.tidy(() => {
    const img = tf.browser.fromPixels(imgElement).toFloat();

    const offset = tf.scalar(127.5);

    const normalized = img.sub(offset).div(offset);

    const batched = normalized.reshape([1, IMAGE_SIZE, IMAGE_SIZE, 3]);

    return mobilenet.predict(batched);
  });
}

ml5.js

https://learn.ml5js.org/#/

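ml5.js is a wrapper around TensorFlow.js. Its API is simpler and easier to understand, which makes it suitable for machine learning beginners. ml5.js provides classification and transformation APIs that are frequently used with media such as images, language, and sound, and the API style follows JavaScript conventions more closely.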

Below is a code example of inference with ml5.js:

let classifier;

let img;

function preload() {
  classifier = ml5.imageClassifier("MobileNet");
  img = loadImage("images/bird.png");
}

function setup() {
  createCanvas(400, 400);
  classifier.classify(img, gotResult);
  image(img, 0, 0);
}

function gotResult(error, results) {
  if (error) {
    console.error(error);
  } else {
    console.log(results);
    createDiv(`Label: ${results[0].label}`);
    createDiv(`Confidence: ${nf(results[0].confidence, 0, 2)}`);
  }
}

Model Compression

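As mentioned earlier, compressed models are better suited to on-device inference. To close, here are several common model compression methods: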

Parameter quantization
Network pruning
Knowledge distillation

Quantization

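Quantization means using fewer bits to represent a parameter. For example, a model may be trained with 32-bit floating-point numbers and then converted to 16-bit, 8-bit, or even 1-bit (boolean) values before inference. This shrinks the model and reduces the amount of computation. Quantization does degrade precision, which can introduce errors in the results; it is a trade-off made for performance.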
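As a minimal sketch of the idea, the code below maps floating-point weights to 8-bit integers with a single scale factor (symmetric quantization, one of several possible schemes) and then maps them back to show the rounding error. The function names and the sample weights are assumptions for illustration; production toolchains use more elaborate calibration.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]


# Hypothetical float32 weights, for illustration only.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(q)          # small integers that fit in one byte each
print(max_error)  # the precision lost by rounding
```

Each weight now needs one byte instead of four, and the error per weight is bounded by half the scale factor, which is the precision trade-off described above.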

Pruning

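Some redundant weights and neurons in a neural network can be pruned, because those weights are small or the neurons output zero most of the time. Removing them makes the model lighter. In addition, sharing weights among multiple nodes can also reduce model size. As with quantization, removing weights may degrade accuracy.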
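One simple way to decide which weights count as "small" is magnitude pruning: zero out a chosen fraction of the weights with the smallest absolute values. The sketch below shows this on a flat list; the function name, the sparsity value, and the sample weights are assumptions for illustration, and real frameworks usually prune gradually during training.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:num_to_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


# Hypothetical layer weights, for illustration only.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]

pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The zeros can then be stored in a sparse format or skipped at inference time, which is where the size and speed savings come from.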

Distillation

Distillation improves the accuracy of a compressed model by changing how it is trained.

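Distillation first trains a large, high-accuracy model, and then uses that large model's inference results as a reference when training the lightweight model, in order to improve its accuracy. Because the large model's output is a probability distribution over labels, the lightweight model learns from those probabilities how similar the labels are for each data sample. For example, if the large model scores a picture of a cat as 60% cat, 30% dog, and 10% rabbit, then the lightweight model is trained on that picture with the ratio cat : dog : rabbit = 6 : 3 : 1 as its target.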
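A minimal sketch of such a training objective, using the 60% / 30% / 10% example from the text: the lightweight (student) model is penalized both for disagreeing with the large (teacher) model's distribution and for missing the true label. The weighting factor alpha, the function names, and the student outputs are assumptions for illustration; practical setups often also soften both distributions with a temperature, which is omitted here.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)


def distillation_loss(student_logits, teacher_probs, hard_label, alpha=0.5):
    """Weighted sum of the soft (teacher) loss and the hard (true label) loss."""
    student_probs = softmax(student_logits)
    one_hot = [1.0 if i == hard_label else 0.0
               for i in range(len(student_logits))]
    soft_loss = cross_entropy(teacher_probs, student_probs)
    hard_loss = cross_entropy(one_hot, student_probs)
    return alpha * soft_loss + (1 - alpha) * hard_loss


# Teacher output for the cat picture: cat, dog, rabbit.
teacher_probs = [0.6, 0.3, 0.1]

# Two hypothetical students: one reproduces the teacher, one is far off.
matching_student = [math.log(p) for p in teacher_probs]
poor_student = [0.0, 0.0, 5.0]

loss_matching = distillation_loss(matching_student, teacher_probs, hard_label=0)
loss_poor = distillation_loss(poor_student, teacher_probs, hard_label=0)
print(loss_matching, loss_poor)
```

The student that reproduces the 6 : 3 : 1 ratio gets the lower loss, so minimizing this objective pushes the lightweight model toward the teacher's view of which labels resemble each other.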

Summary

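A piece of common knowledge from studying computer organization is that CPU capacity is in surplus, while the performance bottleneck of the hardware lies in the speed of memory and the bus. If we compare the cloud to the CPU, then the network connecting cloud and device is the memory and the bus. On 4G networks, the heavy communication overhead between cloud and device makes real-time computation poor. Even in the 5G era, where transfer speeds improve greatly, the sheer number of connections will put the "CPU" under high load. People therefore urgently need an AI processing approach suited to cloud-device collaboration.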

Edge Computing: Client + Artificial Intelligence