Added AdaRound and RNN QAT results (#654)
Signed-off-by: Abhi Khobare <[email protected]>
quic-akhobare authored Jun 18, 2021
1 parent 11d7ab3 commit f067e52
Showing 1 changed file: README.md, with 91 additions and 13 deletions.

## Results

AIMET can quantize an existing 32-bit floating-point model to an 8-bit fixed-point model without sacrificing much accuracy and without model fine-tuning.


<h4>DFQ</h4>

The DFQ method, applied to several popular networks such as MobileNet-v2 and ResNet-50, results in less than 0.9%
loss in accuracy all the way down to 8-bit quantization, in an automated way and without any training data. A usage
sketch follows the table below.

<table style="width:50%">
<tr>
<th style="width:80px">Models</th>
<th>FP32</th>
<th>INT8 Simulation</th>
</tr>
<tr>
<td>MobileNet v2 (top1)</td>
<td align="center">71.72%</td>
<td align="center">71.08%</td>
</tr>
<tr>
<td>ResNet 50 (top1)</td>
<td align="center">76.05%</td>
<td align="center">75.45%</td>
</tr>
<tr>
<td>DeepLab v3 (mIOU)</td>
<td align="center">72.65%</td>
<td align="center">71.91%</td>
</tr>
</table>
<br>
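
As a rough illustration, the flow that produces numbers like these with AIMET's PyTorch API looks roughly as follows. This is a minimal sketch, not verbatim AIMET code: module paths and constructor signatures have shifted between AIMET releases, and `calibration_batches` is a placeholder for a real unlabeled data pipeline.

```python
import torch
from torchvision import models

from aimet_torch.cross_layer_equalization import equalize_model
from aimet_torch.quantsim import QuantizationSimModel

model = models.mobilenet_v2(pretrained=True).eval()
input_shape = (1, 3, 224, 224)

# Data-free step: cross-layer equalization rescales weights across
# adjacent layers so per-tensor quantization loses less accuracy.
equalize_model(model, input_shape)

# Simulate INT8 inference: 8-bit weights and 8-bit activations.
sim = QuantizationSimModel(model, dummy_input=torch.rand(input_shape),
                           default_param_bw=8, default_output_bw=8)

def calibrate(sim_model, _):
    # Feed a few unlabeled batches so AIMET can observe activation
    # ranges; no labels and no fine-tuning are required.
    with torch.no_grad():
        for images in calibration_batches:  # placeholder iterable
            sim_model(images)

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)
# sim.model can now be evaluated with a normal top-1 accuracy loop.
```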

<h4>AdaRound (Adaptive Rounding)</h4>
<h5>ADAS Object Detect</h5>
<p>For this example ADAS object-detection model, which was challenging to quantize to 8-bit precision,
AdaRound recovers the accuracy to within 1% of the FP32 accuracy. A usage sketch follows the table below.</p>
<table style="width:50%">
<tr>
<th style="width:80px" colspan="15">Configuration</th>
<th>mAP - Mean Average Precision</th>
</tr>
<tr>
<td colspan="15">FP32</td>
<td align="center">82.20%</td>
</tr>
<tr>
<td colspan="15">Nearest Rounding (INT8 weights, INT8 acts)</td>
<td align="center">49.85%</td>
</tr>
<tr>
<td colspan="15">AdaRound (INT8 weights, INT8 acts)</td>
<td align="center" bgcolor="#add8e6">81.21%</td>
</tr>
</table>
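
A hedged sketch of the AdaRound flow with AIMET's PyTorch API is shown below. Class and keyword names follow the AIMET documentation of this period and may differ between releases; `data_loader` is a placeholder for an unlabeled DataLoader, and `calibrate` for a calibration callback like the one sketched above.

```python
import torch
from aimet_common.defs import QuantScheme
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters
from aimet_torch.quantsim import QuantizationSimModel

dummy_input = torch.rand(1, 3, 512, 512)  # shape is model-specific

# AdaRound learns, layer by layer, whether each weight should round up
# or down, minimizing reconstruction error on unlabeled data.
params = AdaroundParameters(data_loader=data_loader, num_batches=16)
model = Adaround.apply_adaround(
    model, dummy_input, params,
    path='./', filename_prefix='adaround',
    default_param_bw=8,
    default_quant_scheme=QuantScheme.post_training_tf_enhanced)

# Simulate INT8 weights / INT8 activations, reusing the AdaRounded
# parameter encodings so the learned rounding is preserved.
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)
sim.set_and_freeze_param_encodings(encoding_path='./adaround.encodings')
sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)
```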

<h5>DeepLabv3 Semantic Segmentation</h5>
<p>For some models, such as the DeepLabv3 semantic segmentation model, AdaRound can even quantize the model weights to
4-bit precision without a significant drop in accuracy; the corresponding configuration change is shown after the
table below.</p>
<table style="width:50%">
<tr>
<th style="width:80px" colspan="15">Configuration</th>
<th>mIOU - Mean intersection over union</th>
</tr>
<tr>
<td colspan="15">FP32</td>
<td align="center">72.94%</td>
</tr>
<tr>
<td colspan="15">Nearest Rounding (INT4 weights, INT8 acts)</td>
<td align="center">6.09%</td>
</tr>
<tr>
<td colspan="15">AdaRound (INT4 weights, INT8 acts)</td>
<td align="center" bgcolor="#add8e6">70.86%</td>
</tr>
</table>
<br>
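
The 4-bit result above uses the same flow; conceptually only the weight bit-width changes. Again a sketch, under the same assumptions as the snippet above:

```python
# INT4 weights, INT8 activations: lower the parameter bit-width only.
model = Adaround.apply_adaround(
    model, dummy_input, params,
    path='./', filename_prefix='adaround',
    default_param_bw=4,
    default_quant_scheme=QuantScheme.post_training_tf_enhanced)

sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=4, default_output_bw=8)
sim.set_and_freeze_param_encodings(encoding_path='./adaround.encodings')
```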

<h4>Quantization for Recurrent Models</h4>
<p>AIMET supports quantization simulation and quantization-aware training (QAT) for recurrent models (RNN, LSTM, GRU).
Using the QAT feature in AIMET, a DeepSpeech2 model with bi-directional LSTMs can be quantized to 8-bit precision with
a minimal drop in accuracy. A training sketch follows the table below.</p>

<table style="width:50%">
<tr>
<th>DeepSpeech2 <br>(using bi-directional LSTMs)</th>
<th>Word Error Rate</th>
</tr>
<tr>
<td>FP32</td>
<td align="center">9.92%</td>
</tr>
<tr>
<td>INT8</td>
<td align="center">10.22%</td>
</tr>
</table>

<br>
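
A minimal QAT sketch with AIMET's quantization simulation follows; the fine-tuning loop itself is ordinary PyTorch. Here `train_loader`, `loss_fn`, `dummy_input`, and `calibrate` are placeholders, and constructor signatures may vary by release.

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

# `model` is a trained network containing recurrent layers
# (e.g., bi-directional LSTMs); quantization ops are inserted around them.
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)
sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)

# QAT: fine-tune through the quantization-simulation ops so the weights
# adapt to 8-bit quantization noise before export.
optimizer = torch.optim.Adam(sim.model.parameters(), lr=1e-5)
for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = loss_fn(sim.model(inputs), targets)
    loss.backward()
    optimizer.step()
```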

<h4>Model Compression</h4>
<p>AIMET can also significantly compress models. For popular models such as ResNet-50 and ResNet-18,
compression with spatial SVD plus channel pruning achieves a 50% MAC (multiply-accumulate) reduction while retaining
accuracy within approximately 1% of the original uncompressed model. A usage sketch follows the table below.</p>

<table style="width:50%">
<tr>
<th style="width:80px">Models</th>
<th>FP32</th>
<th>Compressed model</th>
</tr>
<tr>
<td>ResNet18 (top1)</td>
<td align="center">69.76%</td>
<td align="center">68.56%</td>
</tr>
<tr>
<td>ResNet 50 (top1)</td>
<td align="center">76.05%</td>
<td align="center">75.75%</td>
</tr>
</table>

<br>
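
A hedged sketch of the compression API for the spatial-SVD stage appears below; channel pruning follows the same pattern with its own parameter class. Class and enum names are taken from AIMET documentation of this period and may differ between releases; `eval_fn` is a placeholder callback that returns model accuracy.

```python
from decimal import Decimal

from aimet_common.defs import (CompressionScheme, CostMetric,
                               GreedySelectionParameters)
from aimet_torch.compress import ModelCompressor
from aimet_torch.defs import SpatialSvdParameters

# Target ~50% of the original MACs; AIMET greedily picks per-layer
# compression ratios that best preserve the evaluation score.
greedy = GreedySelectionParameters(target_comp_ratio=Decimal('0.5'),
                                   num_comp_ratio_candidates=10)
auto = SpatialSvdParameters.AutoModeParams(greedy, modules_to_ignore=[])
params = SpatialSvdParameters(SpatialSvdParameters.Mode.auto, auto)

compressed_model, stats = ModelCompressor.compress_model(
    model,
    eval_callback=eval_fn,        # user-supplied accuracy callback
    eval_iterations=10,
    input_shape=(1, 3, 224, 224),
    compress_scheme=CompressionScheme.spatial_svd,
    cost_metric=CostMetric.mac,
    parameters=params)
# The compressed model is then fine-tuned to recover accuracy.
```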

## Installation Instructions
To install and use the pre-built version of the AIMET package, please follow one of the links below:
- [Install and run AIMET in *Ubuntu* environment](./packaging/install.md)
