This tutorial demonstrates how to migrate a quantization pipeline from the OpenVINO Post-Training Optimization Tool (POT) to the NNCF Post-Training Quantization API. It is based on the Ultralytics YOLOv5 model; in addition, it compares the accuracy of the FP32 and quantized INT8 models and runs a model inference demo based on sample code from Ultralytics YOLOv5 with the OpenVINO backend.
The tutorial consists of the following parts:
- Convert YOLOv5 model to OpenVINO IR.
- Prepare dataset for quantization.
- Configure quantization pipeline.
- Perform model optimization.
- Compare the accuracy of the FP32 and INT8 models.
- Run a model inference demo.
- Compare the performance of the FP32 and INT8 models.
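To make the optimization step above concrete: post-training INT8 quantization (whether done with POT or NNCF) maps FP32 tensors to 8-bit integers using a scale derived from calibration statistics. The following is a minimal pure-NumPy sketch of symmetric per-tensor quantization, not the actual NNCF implementation; the weight values are invented for illustration.

```python
import numpy as np

# Hypothetical FP32 weight tensor standing in for one layer of a model.
weights = np.array([-2.0, -0.5, 0.0, 0.3, 1.7], dtype=np.float32)

# Symmetric quantization: the scale maps the largest magnitude to 127.
scale = np.abs(weights).max() / 127.0

# Quantize: divide by the scale, round to nearest, clip to the INT8 range.
q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)

# Dequantize to inspect the rounding error the INT8 representation introduces.
dq = q.astype(np.float32) * scale

print(q)                            # INT8 representation of the weights
print(np.abs(weights - dq).max())   # worst-case absolute quantization error
```

The worst-case error is bounded by half the scale, which is why calibration data (used to estimate value ranges for activations) matters: a range that is too wide wastes INT8 resolution, while one that is too narrow clips outliers.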
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to Installation Guide.