Tianchi Warm-up Competition: Fabric Defect Object Detection

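1. Detection code

The code comes from the official Datawhale baseline: https://github.com/datawhalechina/team-learning-cv/tree/master/DefectDetection

The data-processing code is already written, which I really appreciated ^_^

The baseline uses yolov5. I only have a single 1080Ti, so I started with yolov5s, training for 50 epochs with the image size set to 512x512.

This part mainly follows https://blog.csdn.net/qq_26751117/article/details/113853150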

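Data processing: this mainly converts the data format provided by the organisers into the format YOLO needs. Run convertTrainLabel.py first, then process_data_yolo.py; the output is stored in the process_data folder. Note that process_data_yolo needs editing: change every val field to train and run it a second time, so that the two runs produce the validation and training data files respectively.
Pretrained weights: I tried training without loading pretrained weights and the results were not very good, probably because the dataset is small to begin with, so transfer learning is still needed. I therefore found a way to download yolov5s.pt and loaded it. Because the model is small, a fairly large batch size is possible; I used 16. As shown in the article linked above, the weight-loading code needs a small modification.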
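
The format conversion in the data-processing step can be sketched as follows. This is only an illustration, not the code in convertTrainLabel.py or process_data_yolo.py: it assumes each annotation gives a pixel box as [x_min, y_min, x_max, y_max] plus a class index, and the function name to_yolo_line is hypothetical. A YOLO label file holds one line per box: the class index followed by the box centre and size, each normalised by the image width or height.

```python
def to_yolo_line(class_id, bbox, img_w, img_h):
    """Convert a pixel box [x_min, y_min, x_max, y_max] into one YOLO label line.

    Assumption: the source annotation uses corner coordinates in pixels.
    """
    x_min, y_min, x_max, y_max = bbox
    # Box centre, normalised to [0, 1] by the image size.
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    # Box width and height, normalised the same way.
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 200x200 px box in a 1000x800 image, class index 3.
    print(to_yolo_line(3, [100, 200, 300, 400], 1000, 800))
    # prints: 3 0.200000 0.375000 0.200000 0.250000
```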

Running: after fixing a few places that raised errors, the yolov5s model ran.

2. Docker submission

My Dockerfile:

# Base Images
FROM registry.cn-shanghai.aliyuncs.com/tcc-public/pytorch:1.4-cuda10.1-py3

ADD . /workspace

WORKDIR /workspace

RUN pip install -r requirements.txt

CMD ["sh", "run.sh"]

Start the build:

(torch16) pdluser@pdluser-System-Product-Name:~/project/tianchi_demo$ sudo docker build -t registry.cn-shenzhen.aliyuncs.com/nine_percent/tianchi_submit:1.0 .
[sudo] password for pdluser:
Sending build context to Docker daemon  6.778GB
Step 1/5 : FROM registry.cn-shanghai.aliyuncs.com/tcc-public/pytorch:1.4-cuda10.1-py3
 ---> 76c152fbfd03
Step 2/5 : ADD . /workspace
 ---> 10ca596f6d20
Step 3/5 : WORKDIR /workspace
 ---> Running in 37a88d04d2a9
Removing intermediate container 37a88d04d2a9
 ---> 7f7982fbfaba
Step 4/5 : RUN pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple --ignore-installed PyYAML
 ---> Running in 877004f83473
Looking in indexes: https://mirrors.aliyun.com/pypi/simple 
  Downloading https://mirrors.aliyun.com/pypi/packages/ec/d6/a82d191ec058314b2b7cbee5635150f754ba1c6ffc05387bc9a57efe48b8/cryptacular-1.5.5.tar.gz
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Collecting zope.sqlalchemy
  Downloading https://mirrors.aliyun.com/pypi/packages/fa/83/459decec1dd2c14d60f9a360fff989c128abe545a1554a1da64b054a55d4/zope.sqlalchemy-1.3-py2.py3-none-any.whl
Collecting velruse>=1.0.3
  Downloading https://mirrors.aliyun.com/pypi/packages/8f/0b/d47ea894587f3155f8c4520aa74d57c856189d0bbe27e831881d655a3386/PasteDeploy-2.1.1-py2.py3-none-any.whl
Building wheels for collected packages: cryptacular
  Building wheel for cryptacular (PEP 517): started
  Building wheel for cryptacular (PEP 517): finished with status 'done'
  Created wheel for cryptacular: filename=cryptacular-1.5.5-cp37-abi3-manylinux2010_x86_64.whl size=52452 sha256=93037b68313c3d86df4c8cab9d0cc0866d1579cb7399410c7903b56eb2ff0067
  Stored in directory: /root/.cache/pip/wheels/dd/c7/11/721f100da8477396b1f8fcfa2d23c801d5bac07d0e2d82dc0d
Successfully built cryptacular
Building wheels for collected packages: apex, velruse, pbkdf2, anykeystore
  Building wheel for apex (setup.py): started
  Building wheel for apex (setup.py): finished with status 'done'
  Created wheel for apex: filename=apex-0.9.10.dev0-cp37-none-any.whl size=46468 sha256=c68745de219dd6169195cfec426e528cd5f5f932bd3cb7ddbc22817a9827cfea
  Stored in directory: /root/.cache/pip/wheels/b8/f0/7a/2fc4cf8a70bfc0981f7009a2146685d06ee220398c0b780acf
  Building wheel for velruse (setup.py): started
  Building wheel for velruse (setup.py): finished with status 'done'
  Created wheel for velruse: filename=velruse-1.1.1-cp37-none-any.whl size=50923 sha256=c300b70b745467b6b075bec09d6b2a11ab3524f6de31605431a62308613648e3
  Stored in directory: 
Successfully built apex velruse pbkdf2 anykeystore
Installing collected packages: PyYAML, Cython, numpy, opencv-python, typing-extensions, torch, pyparsing, kiwisolver, six, cycler, pillow
Removing intermediate container 877004f83473
 ---> 5c40d92c4bc1
Step 5/5 : CMD ["sh", "run.sh"]
 ---> Running in 41c2daf77fbc
Removing intermediate container 41c2daf77fbc
 ---> 603e3fe4452c
Successfully built 603e3fe4452c
Successfully tagged registry.cn-shenzhen.aliyuncs.com/nine_percent/tianchi_submit:1.0

After the image is built, enter it.

First look up the corresponding image ID:

pdluser@pdluser-System-Product-Name:~$ sudo docker images
[sudo] password for pdluser:
REPOSITORY                                                      TAG                 IMAGE ID            CREATED             SIZE
registry.cn-shenzhen.aliyuncs.com/nine_percent/tianchi_submit   1.0                 b773b4e52e7a        4 minutes ago       11.2GB
<none>                                                          <none>              f99df53cc33c        23 hours ago        7.92GB
registry.cn-shanghai.aliyuncs.com/tcc-public/pytorch            1.4-cuda10.1-py3    76c152fbfd03        13 months ago       7.56GB
registry.cn-shanghai.aliyuncs.com/tcc-public/python             3                   a4cc999cf2aa        21 months ago       929MB

Enter the first image, b7 (a prefix of its image ID):

(torch16) pdluser@pdluser-System-Product-Name:~/project/tianchi_demo$ sudo docker run -it b7 /bin/bash
root@2a128d20af63:/workspace#

Run run.sh here; if the test succeeds, the image is ready to submit.

The next step is to push the image to the registry:

$ sudo docker login --username=<username> registry.cn-shenzhen.aliyuncs.com
$ sudo docker tag [ImageId] registry.cn-shenzhen.aliyuncs.com/nine_percent/tianchi_submit:[image version]
$ sudo docker push registry.cn-shenzhen.aliyuncs.com/nine_percent/tianchi_submit:[image version]
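
Since every resubmission repeats the tag and push steps with a new version number, they can be wrapped in a small script. The sketch below only assembles the two commands shown above; the function names push_commands and run_all are hypothetical, the login step is left out because it asks for a password interactively, and sudo is omitted on the assumption that the script itself is started with sufficient privileges.

```python
import subprocess


def push_commands(image_id, registry, namespace, repo, version):
    """Build the `docker tag` and `docker push` commands for one image version."""
    target = f"{registry}/{namespace}/{repo}:{version}"
    return [
        ["docker", "tag", image_id, target],
        ["docker", "push", target],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmds = push_commands(
        "b773b4e52e7a",
        "registry.cn-shenzhen.aliyuncs.com",
        "nine_percent",
        "tianchi_submit",
        "1.0",
    )
    run_all(cmds)  # dry run: only prints the commands
```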

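3. Problems encountered

During the build, the following error appeared: ERROR: Double requirement given: PyYAML>=5.3 (from -r requirements.txt (line 10)) (already in PyYAML, name='PyYAML')

Removing the version requirement on PyYAML made the error go away.

Using opencv also raised an error: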
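
The build log shows PyYAML also being named on the pip command line (after --ignore-installed), so one plausible reading, which is my assumption rather than something the post states, is that pip saw two PyYAML requirements with different version specifiers. The edit itself is a one-line change to requirements.txt; a minimal sketch of doing it programmatically, with the hypothetical helper name drop_version_specifier, could look like this:

```python
import re


def drop_version_specifier(requirements_text, package):
    """Return the requirements text with the version specifier removed for one package.

    Lines for other packages, comments, and blank lines are kept unchanged.
    """
    cleaned = []
    for line in requirements_text.splitlines():
        # A requirement name followed by a comparison operator such as >=, ==, ~=.
        match = re.match(r"^\s*([A-Za-z0-9._-]+)\s*[<>=!~]", line)
        if match and match.group(1).lower() == package.lower():
            cleaned.append(match.group(1))  # keep only the bare name
        else:
            cleaned.append(line)
    return "\n".join(cleaned)


if __name__ == "__main__":
    text = "numpy>=1.18.5\nPyYAML>=5.3\ntorch>=1.6.0"
    print(drop_version_specifier(text, "PyYAML"))
```

The other version pins in this example are made up for illustration; only the PyYAML>=5.3 line comes from the error message.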

Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.7/site-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

The fix is to add the following to the Dockerfile:

RUN apt update
RUN apt install libgl1-mesa-glx
RUN apt-get install -y libglib2.0-0

However, the build then stops at an interactive prompt. Changing the Dockerfile as follows avoids the interaction:

RUN DEBIAN_FRONTEND=noninteractive apt update -y
RUN DEBIAN_FRONTEND=noninteractive apt install libgl1-mesa-glx -y
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y libglib2.0-0
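
Because every rebuild and push is slow, it can be worth checking inside the container that the shared libraries are really there before submitting. The sketch below uses the standard-library ctypes.util.find_library; the function name missing_shared_libraries is hypothetical, and the mapping of libgl1-mesa-glx to "GL" and libglib2.0-0 to "glib-2.0" is my assumption about what those packages provide. find_library relies on tools such as ldconfig being present in the image, so a "missing" result is a hint rather than proof.

```python
import ctypes.util


def missing_shared_libraries(names=("GL", "glib-2.0")):
    """Return the library names that the system library lookup cannot find."""
    return [name for name in names if ctypes.util.find_library(name) is None]


if __name__ == "__main__":
    missing = missing_shared_libraries()
    if missing:
        print("missing shared libraries:", ", ".join(missing))
    else:
        print("libGL and glib were found")
```

Running this with the container's Python, followed by a plain import cv2, gives a quick check that is much cheaper than a failed submission.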

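The first submission failed, and so did the second: two errors in a row.

After adjusting where the files were stored and the corresponding commands, the submission finally succeeded. Hooray!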

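A gripe: docker is a fine tool, but it has a real learning curve. I summarised the points I kept needing while using it here: https://blog.csdn.net/DD_PP_JJ/article/details/113902874, which may be a useful reference. I was stuck on docker for quite a long time during this whole process. After reading the material recommended by people in the group chat, I think my understanding of docker is still fairly limited. Every build had to download the image from the remote registry, which was tedious and made each build take a very long time. I ran into many Dockerfile-related problems, and each fix meant downloading everything again. Submitting results was also slow, because the push takes a long time. In addition, when I ran docker with -v, the mapping did not show up inside the container, and I have not solved that problem yet.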
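
The -v problem above is left unresolved, so the following is not a diagnosis, only one thing worth ruling out. -v is an option of docker run (docker -v on its own prints the version), it takes the form host_path:container_path, and a host side that is not an absolute path is treated by docker as a volume name rather than a directory. A minimal sketch that builds such a command with an absolute host path, using the hypothetical helper name docker_run_with_mount and a made-up container path /workspace/data:

```python
from pathlib import Path


def docker_run_with_mount(image, host_dir, container_dir, command="/bin/bash"):
    """Build a `docker run` command that bind-mounts host_dir at container_dir."""
    # Resolve the host side to an absolute path so docker treats it as a directory.
    host_abs = Path(host_dir).expanduser().resolve()
    if not container_dir.startswith("/"):
        raise ValueError("container path must be absolute")
    return [
        "docker", "run", "-it",
        "-v", f"{host_abs}:{container_dir}",
        image, command,
    ]


if __name__ == "__main__":
    # "b7" is the image ID prefix used earlier in this post.
    print(" ".join(docker_run_with_mount("b7", ".", "/workspace/data")))
```

Note that the options must come before the image name; anything placed after the image is passed to the container as its command.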
