Model | Rounding | Without BN | Pruning | Model size (gzip) | Accuracy | Download |
---|---|---|---|---|---|---|
Mobilenet | | | | 12M | 97.16% | download |
Mobilenet | √ | 3.2M | 97.05% | download | ||
Mobilenet | √ | 4.4M | 96.96% | download | ||
Mobilenet | √ | √ | 2.3M | 96.7% | download | |
Mobilenet | √ | 12M | 97.16% | download | ||
Mobilenet | √ | √ | 3.0M | 96.96% | download |
Given floating-point parameters V, our first goal is to represent V as 8-bit integers V'. We then transform V' back into an approximate high-precision value by applying the inverse of the quantization operation. Finally, we apply gzip to the inverse-quantized model. Because the restored weights take far fewer distinct values than the original floats, they compress well, and the whole process can reduce the model size by up to 70%.
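The quantize, inverse-quantize, and gzip steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the README does not say which quantizer is used, so the sketch assumes a simple per-tensor min/max linear scheme, and it uses random Gaussian values as a stand-in for real Mobilenet weights. The function names `quantize`, `dequantize`, and `gzip_size` are hypothetical and not part of this repository.

```python
import gzip
import random
import struct


def quantize(values, num_bits=8):
    """Map floats to integer codes in [0, 2**num_bits - 1].

    Assumption: a min/max linear scheme; the actual quantizer may differ.
    """
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Inverse of quantize: recover approximate float values V -> V' -> V~."""
    return [lo + c * scale for c in codes]


def gzip_size(values):
    """Size in bytes of the values stored as float32 and then gzipped."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(gzip.compress(raw))


# Stand-in for real model weights (assumption: roughly Gaussian).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(50_000)]

codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)

# The rounding error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

size_before = gzip_size(weights)
size_after = gzip_size(restored)

print("max abs error:", max_error)
print("gzip size before:", size_before, "bytes")
print("gzip size after: ", size_after, "bytes")
```

The restored weights are still stored as floats, so the model file format does not change. The size saving comes entirely from gzip, which can exploit the fact that at most 256 distinct values remain after the round trip. The exact ratio depends on the weight distribution, so the numbers printed here should not be read as the figures in the table above.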