Training a Custom YOLO v4 Darknet Model on Azure and Running with Azure Live Video Analytics on IoT Edge

Table of contents

  1. Train a custom YOLO v4 model
  2. TensorFlow Lite conversion for fast inferencing
  3. Azure Live Video Analytics on IoT Edge
  4. Links/References

Train a custom YOLO v4 model

Prerequisites

  • SSH client or command line tool - for Windows try putty.exe
  • SCP client or command line tool - for Windows try pscp.exe
  • Azure Subscription - a Free Trial is available for new customers.
  • Familiarity with Unix commands - e.g. vim, nano, wget, curl, etc.
  • Visual Object Tagging Tool - VoTT

Set up an Ubuntu (18.04) Virtual Machine in Azure and run a test

  1. Set up an N-series Virtual Machine by using the michhar/darknet-azure-vm-ubuntu-18.04 project VM setup.
  2. SSH into the Ubuntu DSVM with your username and password (or, if you set up an SSH key, use that)
    • If this is a corporate subscription, you may need to delete an inbound port rule under “Networking” in the Azure Portal (delete Cleanuptool-Deny-103)
  3. Test the Darknet executable by running the following.
    • Get the YOLO v4 tiny weights
    wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
    
    • Run a test on a static image from the repository. Run the following command and then give the path to a test image (look in the data folder for sample images, e.g. data/giraffe.jpg). The coco.data file gives the links to other necessary files. The yolov4-tiny.cfg specifies the architecture and settings for tiny YOLO v4.
    ./darknet detector test ./cfg/coco.data ./cfg/yolov4-tiny.cfg ./yolov4-tiny.weights
    
    • Check predictions.jpg for the bounding boxes overlaid on the image. You may secure copy (SCP) this file down to your machine to view it, or alternatively remote desktop into the machine with a program like X2Go.
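    • For example, to pull predictions.jpg down from the VM (use pscp.exe on Windows; the placeholders mirror the upload command used later in this guide):
    scp <username>@<public IP or DNS name>:~/darknet/predictions.jpg .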

Train with Darknet

  1. Label some test data locally (aim for about 500-1000 bounding boxes drawn, noting that fewer will result in less accurate detections for those classes)
    • Label data with VoTT and export as json
    • Convert the json files to YOLO .txt files by running the script vott2.0_to_yolo.py. One change must be made in this script: update line 13 (LABELS = {'helmet': 0, 'no_helmet': 1}) to reflect your classes. Running the script should result in one .txt file per .json VoTT annotation file; the .txt files are the YOLO format that darknet can use. Run the conversion script as follows, for example.
    python vott2.0_to_yolo.py --annot-folder path_to_folder_with_json_files --out-folder new_folder_for_txt_annotations
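
    • Each line of a resulting .txt file describes one bounding box in the normalized YOLO format class_id x_center y_center width height, where all coordinates are relative to the image dimensions. For example (illustrative values):
    0 0.512 0.334 0.120 0.250
    1 0.250 0.700 0.080 0.150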
    
    • Darknet will need a specific folder structure. Structure the data folder as follows, where each image in the data/img folder is placed alongside its .txt annotation file.
    data/
        img/
            image1.jpg
            image1.txt
            image2.jpg
            image2.txt
            ...
        train.txt
        valid.txt
        obj.data
        obj.names
    
    • obj.data is a general file that directs darknet to the other data-related files and the model folder. It looks similar to the following, with classes changed as necessary for your scenario.
    classes = 2
    train  = build/darknet/x64/data/train.txt
    valid  = build/darknet/x64/data/valid.txt
    names = build/darknet/x64/data/obj.names
    backup = backup/
    
    • obj.names contains the class names, one per line.
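    • For the example classes used above, obj.names would contain:
    helmet
    no_helmet
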
    • train.txt and valid.txt should look as follows, for example. Note that train.txt lists the training images, which must be a different subset from the smaller list found in valid.txt. As a general rule, place a random 5-10% of the image paths in valid.txt.
    build/darknet/x64/data/img/image1.jpg
    build/darknet/x64/data/img/image5.jpg
    ...
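
    • A minimal Python sketch for generating this split (assuming your images and .txt annotations already sit in build/darknet/x64/data/img; the file extensions and the 10% hold-out ratio are illustrative):
    import os
    import random

    # Collect all image paths from the darknet data folder.
    IMG_DIR = "build/darknet/x64/data/img"
    images = [os.path.join(IMG_DIR, f) for f in os.listdir(IMG_DIR)
              if f.lower().endswith((".jpg", ".jpeg", ".png"))]
    random.shuffle(images)

    # Hold out ~10% of the images for validation, the rest for training.
    n_valid = max(1, len(images) // 10)
    with open("build/darknet/x64/data/valid.txt", "w") as f:
        f.write("\n".join(images[:n_valid]) + "\n")
    with open("build/darknet/x64/data/train.txt", "w") as f:
        f.write("\n".join(images[n_valid:]) + "\n")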
    
  2. Upload data to the DSVM as follows.
    • Zip the data folder (zip -r data.zip data if using the command line) and copy it up to the VM with scp data.zip <username>@<public IP or DNS name>:~/darknet/build/darknet/x64/ (use pscp.exe on Windows; if this gives a timeout error, you may need to delete the networking rule Cleanuptool-Deny-103 again). Note that data.zip is placed in the darknet/build/darknet/x64 folder; this is where darknet will look for the data.
    • Log in to the DSVM with SSH
    • On the DSVM, unzip the compressed data.zip found, now, in the folder darknet/build/darknet/x64.
  3. Read through How to train on your own data from the Darknet repo, mainly on updating the .cfg file. We will be using the tiny architecture of YOLO v4, so we will calculate anchors and update the config (cfg/yolov4-tiny-custom.cfg) accordingly. The following summarizes the changes for reference, but please refer to the Darknet repo for more information/clarification.
    • Calculate anchor boxes (especially important if you have very big or very small objects on average). We use -num_of_clusters 6 because of the tiny architecture configuration. IMPORTANT: make note of these anchors (darknet creates a file for you called anchors.txt); you will need them later in the section on converting the model to TFLite.
      ./darknet detector calc_anchors build/darknet/x64/data/obj.data -num_of_clusters 6 -width 416 -height 416
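
      • The generated anchors.txt holds the six width,height pairs as comma-separated values, for example (these are the same example anchors reused in the TFLite conversion section below):
      81, 27,  28, 80,  58, 51,  76,100, 109, 83,  95,246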
      
    • Configure the cfg file (you will see a file called cfg/yolov4-tiny-custom.cfg). Open the file with an editor like vim or nano. Modify the following to your scenario. For example, this header (net block):
      [net]
      # Testing
      #batch=1
      #subdivisions=1
      # Training
      batch=16
      subdivisions=2
      ...
      
      learning_rate=0.00261
      burn_in=1000
      max_batches = 4000
      policy=steps
      steps=3200,3600
      ...
      
      • Info for the yolo blocks (in each YOLO block or just before - there are two blocks in the tiny architecture):
        • Class number – change to your number of classes (each YOLO block)
        • Filters – (5 + num_classes)*3 (neural net layer before each YOLO block)
        • Anchors – these are also known as anchor boxes (each YOLO block) - use the calculated anchors from the previous step.
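        • For example, with 2 classes, the relevant lines in each of the two yolo sections would look like the following (the anchors shown are the example values used elsewhere in this guide; substitute your own calculated anchors):
          [convolutional]
          # filters = (5 + num_classes) * 3 = 21 for 2 classes
          filters=21

          [yolo]
          classes=2
          anchors=81, 27,  28, 80,  58, 51,  76,100, 109, 83,  95,246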
  4. Train the model with the following two commands.
    • This will download the base model weights:
     wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.conv.29
    
    • This will run the training experiment (-clear means training starts from the base model just downloaded rather than resuming from any weights already present in the backup folder; the backup folder is where the trained weights will show up).
    ./darknet detector train build/darknet/x64/data/obj.data cfg/yolov4-tiny-custom.cfg yolov4-tiny.conv.29 -map -dont_show -clear
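
    • If training is interrupted, you can resume later by omitting -clear and pointing darknet at the last checkpoint saved in the backup folder (the _last.weights file is what darknet writes periodically by default):
    ./darknet detector train build/darknet/x64/data/obj.data cfg/yolov4-tiny-custom.cfg backup/yolov4-tiny-custom_last.weights -map -dont_show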
    

TensorFlow Lite conversion for fast inferencing

  • If using the michhar/darknet-azure-vm-ubuntu-18.04 GitHub VM setup as instructed above, the hunglc007/tensorflow-yolov4-tflite project will have already been cloned and the correct Python environment set up with TensorFlow 2.

  • You can use VSCode or any other text editor for the following changes.

    • Change coco.names to obj.names in core/config.py
    • Update the anchors on line 17 of core/config.py to match the anchor sizes used to train the model, e.g.:
    __C.YOLO.ANCHORS_TINY         = [ 81, 27,  28, 80,  58, 51,  76,100, 109, 83,  95,246]
    
    • Place obj.names file from your Darknet project in the data/classes folder.
  • Convert from Darknet to TensorFlow Lite (with quantization) in the following two steps. Use the weights from your Darknet experiment (found in the ~/darknet/backup/ folder).

    • In the tensorflow-yolov4-tflite folder activate the Python environment with: source env/bin/activate
    • Save the model to TensorFlow protobuf intermediate format.
    python save_model.py --weights yolov4-tiny-custom_best.weights --output ./checkpoints/yolov4-tiny-416-tflite2 --input_size 416 --model yolov4 --framework tflite --tiny
    
    • Convert the protobuf model weights to TFLite format with quantization.
    python convert_tflite.py --weights ./checkpoints/yolov4-tiny-416-tflite2 --output ./checkpoints/yolov4-tiny-416-fp16.tflite --quantize_mode float16
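
    • Optionally, sanity-check that the converted model loads and runs before deploying it. A minimal sketch using the standard TensorFlow Lite Python API (the dummy input merely confirms the graph executes; fp16-quantized models still take float32 input):
    import numpy as np
    import tensorflow as tf

    # Load the quantized model and allocate its tensors.
    interpreter = tf.lite.Interpreter(model_path="./checkpoints/yolov4-tiny-416-fp16.tflite")
    interpreter.allocate_tensors()

    # Inspect the input/output signatures (input should be [1, 416, 416, 3]).
    inp = interpreter.get_input_details()[0]
    print("input:", inp["shape"], inp["dtype"])
    for out in interpreter.get_output_details():
        print("output:", out["shape"], out["dtype"])

    # Run one dummy inference to confirm everything is wired up.
    dummy = np.random.rand(*inp["shape"]).astype(np.float32)
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()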
    
  • [Optional] Run the video test on remote desktop (recommended to use X2Go client for Windows or Mac with your VM user and IP address) to check that everything is ok. Once X2Go has connected and you have a remote desktop instance running, open a terminal window ("terminal emulator" program).

    • Navigate to the project folder.
    cd tensorflow-yolov4-tflite
    
    • Run the video demo.
    python detectvideo.py --framework tflite --weights ./checkpoints/yolov4-tiny-416-fp16.tflite --size 416 --tiny --model yolov4 --video <name of your video file> --output <new name for output result video> --score 0.4
    
    • You can then navigate to the output video file and play it with VLC in the remote desktop environment or download the video to play locally.

Azure Live Video Analytics on IoT Edge

Prerequisites

On your development machine you will need the following.

  1. git command line tool or client such as GitHub Desktop
  2. SCP client or command line tool - for Windows try pscp.exe
  3. A sample video in .mkv format (only some audio formats are supported so you may see an error regarding audio format - you may wish to strip audio in this case for the simulator)
  4. Your .tflite model, anchors and obj.names files
  5. Docker - such as Docker Desktop
  6. VSCode and Azure IoT Tools extension (search "Azure IoT Tools" in extensions within VSCode)
  7. .NET Core 3.1 SDK - download
  8. Azure CLI - download and install
  9. curl command line tool - download curl

On Azure:

Create an RTSP simulator

  • Create a custom RTSP simulator for inferencing with LVA, using your video and the live555 media server
    • Clone the official Live Video Analytics GitHub repo: git clone https://github.com/Azure/live-video-analytics.git
    • Open the repository folder in VSCode to make it easier to modify files
    • Go to the RTSP simulator instructions: cd utilities/rtspsim-live555/
    • Replace line 21 with your .mkv file (you can use the ffmpeg command line tool to convert from other formats like .mp4 to .mkv; see the example after this list)
    • Copy your .mkv video file to the same folder as Dockerfile
    • Build the docker image according to the Readme
    • Push the docker image to your ACR according to the Readme
      • Log in to ACR: az acr login --name myregistry
      • Use docker to push: docker push myregistry.azurecr.io/my-rtsp-sim:latest
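  • If your source video is an .mp4, a typical ffmpeg invocation for the conversion looks like the following; -c:v copy re-wraps the video stream without re-encoding and -an strips the audio track, which avoids the unsupported-audio-format errors mentioned in the prerequisites (my_video.mkv matches the file name used in the sample operations later in this guide):
    ffmpeg -i input.mp4 -c:v copy -an my_video.mkv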

Create the AI container for inferencing

  • To prepare the ML model wrapper code, from the base of the live-video-analytics folder:
    • Go to the Docker container building instructions: cd utilities/video-analysis/yolov4-tflite-tiny
    • Copy your .tflite model into the app folder
    • Perform the following changes to files for your custom scenario:
      • In app/core/config.py:
        • Update the __C.YOLO.ANCHORS_TINY line to match the anchors used to train your Darknet model
        • Update the __C.YOLO.CLASSES to be ./data/classes/obj.names
      • In app/data/classes folder:
        • Add your file called obj.names (with your class names, one per line)
      • In app/yolov4-tf-tiny-app.py
        • Update line 31 to use the name of your model
        • Update line 45 to be obj.names instead of coco.names
      • In the Dockerfile:
        • We do not need to pull down the yolov4 base tflite model, so delete line 19
    • Follow the instructions in that folder's Readme to build, test, and push the docker image to ACR.
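    • Once the container is running locally, you can smoke-test the scoring endpoint with curl (this assumes you mapped the container's port 80 to local port 8080 when starting it; the /score route matches the inferencingUrl used later in this guide):
      curl -X POST http://127.0.0.1:8080/score -H "Content-Type: image/jpeg" --data-binary @sample.jpg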

Running the LVA sample app

  • To run the sample app and view your inference results:
    • Clone the official Live Video Analytics CSharp sample app: git clone https://github.com/Azure-Samples/live-video-analytics-iot-edge-csharp.git
    • In the src/edge folder, update yolov3.template.json as follows.
      • Rename to yolov4.template.json
      • Ensure the runtime section at the beginning of the file looks like the following:
        "runtime": {
            "type": "docker",
            "settings": {
            "minDockerVersion": "v1.25",
            "loggingOptions": "",
                "registryCredentials": {
                      "$CONTAINER_REGISTRY_USERNAME_myacr": {
                            "username": "$CONTAINER_REGISTRY_USERNAME_myacr",
                            "password": "$CONTAINER_REGISTRY_PASSWORD_myacr",
                            "address": "$CONTAINER_REGISTRY_USERNAME_myacr.azurecr.io"
                      }
                }
            }
        }
        
        • This section will ensure the deployment can find your custom rtspsim and yolov4 images in your ACR.
      • Change the yolov3 name to yolov4, as in the following modules section (the image location is an example), pointing the yolov4 module to the correct image location in your ACR.
        "yolov4": {
            "version": "1.0",
            "type": "docker",
            "status": "running",
            "restartPolicy": "always",
            "settings": {
              "image": "myacr.azurecr.io/my-awesome-custom-yolov4:latest",
                  "createOptions": {}
            }
        }
        
      • For the rtspsim module, ensure the image points to your image in ACR (the image location is an example) and that the createOptions look as follows:
          "rtspsim": {
            "version": "1.0",
            "type": "docker",
            "status": "running",
            "restartPolicy": "always",
            "settings": {
              "image": "myacr.azurecr.io/my-rtsp-sim:latest",
              "createOptions": {
        
                "PortBindings": {
                    "554/tcp": [
                        {
                            "HostPort": "5001"
                        }
                    ]
                 }
               }
             }
           }
        
      • Also, in the rtspsim module createOptions, make sure to delete the folder bindings; that is, delete any section like:
        "HostConfig": {
          "Binds": [
            "$INPUT_VIDEO_FOLDER_ON_DEVICE:/live/mediaServer/media"
          ]
        }
        
        • This will ensure that LVA looks in the rtspsim module for the video rather than on the IoT Edge device.
    • Make the appropriate changes to the .env file (this should be located in the src/edge folder):
      • Update the CONTAINER_REGISTRY_USERNAME_myacr and CONTAINER_REGISTRY_PASSWORD_myacr
      • Recall the .env file (you can modify in VSCode) should have the following format (fill in the missing parts for your Azure resources):
        SUBSCRIPTION_ID=
        RESOURCE_GROUP=
        AMS_ACCOUNT=
        IOTHUB_CONNECTION_STRING=
        AAD_TENANT_ID=
        AAD_SERVICE_PRINCIPAL_ID=
        AAD_SERVICE_PRINCIPAL_SECRET=
        INPUT_VIDEO_FOLDER_ON_DEVICE="/live/mediaServer/media"
        OUTPUT_VIDEO_FOLDER_ON_DEVICE="/var/media"
        APPDATA_FOLDER_ON_DEVICE="/var/lib/azuremediaservices"
        CONTAINER_REGISTRY_USERNAME_myacr=
        CONTAINER_REGISTRY_PASSWORD_myacr=
        
        • When you generate the deployment manifest from the template file in VSCode, these values are used to fill in the actual deployment manifest file.
    • In the src/cloud-to-device-console-app folder, make the appropriate changes to the operations.json.
      • In the "opName": "GraphTopologySet", update the topologyUrl to be the http extension topology as follows.
      {
          "opName": "GraphTopologySet",
          "opParams": {
              "topologyUrl": "https://raw.githubusercontent.com/Azure/live-video-analytics/master/MediaGraph/topologies/httpExtension/topology.json"
          }
      }
      
      • In the "opName": "GraphInstanceSet", update the rtspUrl value to have your video file name (here my_video.mkv) and inferencingUrl with "value": "http://yolov4/score", as in:
      {
          "opName": "GraphInstanceSet",
          "opParams": {
              "name": "Sample-Graph-1",
              "properties": {
                  "topologyName" : "InferencingWithHttpExtension",
                  "description": "Sample graph description",
                  "parameters": [
                      {
                          "name": "rtspUrl",
                          "value": "rtsp://rtspsim:554/media/my_video.mkv"
                      },
                      {
                          "name": "rtspUserName",
                          "value": "testuser"
                      },
                      {
                          "name": "rtspPassword",
                          "value": "testpassword"
                      },
                      {
                          "name": "imageEncoding",
                          "value": "jpeg"
                      },
                      {
                          "name": "inferencingUrl",
                          "value": "http://yolov4/score"
                      }
                  ]
              }
          }
      },
      
    • Make the appropriate changes to appsettings.json, a file you may need to create if you haven't done the quickstarts. It should look as follows and be located in the src/cloud-to-device-console-app folder.
      {
          "IoThubConnectionString" : "connection_string_of_iothub",
          "deviceId" : "name_of_your_edge_device_in_iot_hub",
          "moduleId" : "lvaEdge"
      }
      
      • The IoT Hub connection string can be found in the Azure Portal under your IoT Hub -> Settings -> Shared access policies blade -> iothubowner policy -> Connection string - primary key
    • Build the app with dotnet build from the src/cloud-to-device-console-app folder.
    • Run the app with dotnet run
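    • Inference results arrive as IoT Hub telemetry; one way to watch them is the "Start Monitoring Built-in Event Endpoint" command provided by the Azure IoT Hub extension in VSCode.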

Links/References

  1. Darknet Azure DSVM
  2. Visual Object Tagging Tool (VoTT)
  3. Darknet on GitHub
  4. Python virtual environments
  5. Conversion of Darknet model to TFLite on GitHub
  6. Create a movie simulator docker container with a test video for LVA
  7. TensorFlow Lite Darknet Python AI container sample for LVA
  8. Run LVA sample app locally
