diff --git a/demo/mkldnn_quant/quant_aware/PaddleCV_mkldnn_quantaware_tutorial_cn.md b/demo/mkldnn_quant/quant_aware/PaddleCV_mkldnn_quantaware_tutorial_cn.md
index 7835ab61d9f7a2..bcc8eebe5577ca 100644
--- a/demo/mkldnn_quant/quant_aware/PaddleCV_mkldnn_quantaware_tutorial_cn.md
+++ b/demo/mkldnn_quant/quant_aware/PaddleCV_mkldnn_quantaware_tutorial_cn.md
@@ -34,7 +34,7 @@ import numpy as np
 #### 2.1 量化训练
-量化训练流程可以参考 [分类模型的离线量化流程](https://paddlepaddle.github.io/PaddleSlim/tutorials/quant_aware_demo/)
+量化训练流程可以参考 [分类模型的量化训练流程](https://paddlepaddle.github.io/PaddleSlim/tutorials/quant_aware_demo/)
 **注意量化训练过程中config参数:**
 - **quantize_op_types:** 目前CPU上支持量化 `depthwise_conv2d`, `mul`, `conv2d`, `matmul`, `transpose2`, `reshape2`, `pool2d`, `scale`。但是训练阶段插入fake quantize/dequantize op时,只需在前四种op前后插入fake quantize/dequantize ops,因为后面四种op `matmul`, `transpose2`, `reshape2`, `pool2d`的输入输出scale不变,将从前后方op的输入输出scales获得scales,所以`quantize_op_types` 参数只需要 `depthwise_conv2d`, `mul`, `conv2d`, `matmul` 即可。
diff --git a/docs/zh_cn/tutorials/image_classification_mkldnn_quant_aware_tutorial.md b/docs/zh_cn/tutorials/image_classification_mkldnn_quant_aware_tutorial.md
index 558e0b915ee4fd..05f1748538b730 100644
--- a/docs/zh_cn/tutorials/image_classification_mkldnn_quant_aware_tutorial.md
+++ b/docs/zh_cn/tutorials/image_classification_mkldnn_quant_aware_tutorial.md
@@ -1,4 +1,4 @@
-# CPU部署预测INT8模型的精度和性能
+# CPU部署预测INT8模型
 在Intel(R) Xeon(R) Gold 6271机器上,经过量化和DNNL加速,INT8模型在单线程上性能为原FP32模型的3~4倍;在 Intel(R) Xeon(R) Gold 6148,单线程性能为原FP32模型的1.5倍,而精度仅有极小下降。图像分类量化的样例教程请参考[图像分类INT8模型在CPU优化部署和预测](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/mkldnn_quant/quant_aware/PaddleCV_mkldnn_quantaware_tutorial_cn.md)。自然语言处理模型的量化请参考[ERNIE INT8 模型精度与性能复现](https://github.com/PaddlePaddle/benchmark/tree/master/Inference/c%2B%2B/ernie/mkldnn)
diff --git a/docs/zh_cn/tutorials/index.rst b/docs/zh_cn/tutorials/index.rst
index e6109b73b6c3bd..88302baac71074 100644
--- a/docs/zh_cn/tutorials/index.rst
+++ b/docs/zh_cn/tutorials/index.rst
@@ -6,6 +6,7 @@
 :maxdepth: 1
 image_classification_sensitivity_analysis_tutorial.md
+ image_classification_mkldnn_quant_aware_tutorial.md
 darts_nas_turorial.md
 paddledetection_slim_distillation_tutorial.md
 paddledetection_slim_nas_tutorial.md