version 1.23.0
What's New
- The TF-enhanced calibration scheme is now accelerated by a custom CUDA kernel and runs significantly faster.
- Installation instructions are now combined with the rest of the documentation (User Guide and API docs)
PyTorch
- Fixed backward pass of the fake-quantize (QcQuantizeWrapper) nodes to handle symmetric mode correctly
- Per-channel quantization is now enabled on a per-op-type basis
- Added support for recursively excluding modules from a root module in QuantSim
- Added support for excluding layers when running model validator and model preparer
- Reduced memory usage in AdaRound
- Fixed bugs in AdaRound for per-channel quantization
- Made ConnectedGraph more robust when identifying custom layers
- Added Jupyter notebook-based examples for the following features: AutoQuant
- Added support for sparse conv layers in QuantSim (experimental)
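
For readers unfamiliar with the per-channel quantization mentioned in the list above, here is a minimal sketch in plain Python. It is not AIMET code and does not use the AIMET API. Every function name is hypothetical, and the signed symmetric 8-bit scheme is an assumption chosen for illustration. It compares one shared scale for the whole weight tensor against one scale per output channel.

```python
# Illustrative sketch only: not AIMET's implementation or API.
# All function names below are hypothetical.

def symmetric_scale(values, bitwidth=8):
    """Scale for signed symmetric quantization: the largest magnitude maps to qmax."""
    qmax = 2 ** (bitwidth - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0

def fake_quantize(values, scale, bitwidth=8):
    """Snap each value to the signed integer grid, then map it back to float."""
    qmax = 2 ** (bitwidth - 1) - 1
    out = []
    for v in values:
        q = round(v / scale)
        q = max(-qmax, min(qmax, q))  # clamp to the symmetric range
        out.append(q * scale)
    return out

def per_tensor(weights, bitwidth=8):
    """One scale shared by every channel."""
    flat = [v for channel in weights for v in channel]
    scale = symmetric_scale(flat, bitwidth)
    return [fake_quantize(channel, scale, bitwidth) for channel in weights]

def per_channel(weights, bitwidth=8):
    """A separate scale for each channel (each row of `weights`)."""
    return [
        fake_quantize(channel, symmetric_scale(channel, bitwidth), bitwidth)
        for channel in weights
    ]

def max_error(original, quantized):
    """Largest absolute difference between original and fake-quantized values."""
    return max(
        abs(a - b)
        for row_o, row_q in zip(original, quantized)
        for a, b in zip(row_o, row_q)
    )

# Two channels with very different ranges: a small-valued one and a large-valued one.
weights = [[0.01, -0.02, 0.015], [5.0, -3.0, 4.0]]

# Compare the error on the small-valued channel under both schemes.
err_tensor = max_error(weights[:1], per_tensor(weights)[:1])
err_channel = max_error(weights[:1], per_channel(weights)[:1])
print(err_channel < err_tensor)
```

With a shared scale, the large-valued channel dictates the grid spacing, so the small-valued channel loses most of its resolution. A per-channel scale avoids this, which is why it can be useful to enable it selectively per op type.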
Keras
- Added support for Keras per-channel quantization
- Changed interface to CLE to accept a pre-compiled model
- Added Jupyter notebook-based examples for the following features: Transformer quantization
TensorFlow
- Fixed AdaRound to avoid unnecessary indexing
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.23.0
- Installation guide: https://quic.github.io/aimet-pages/releases/latest/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.23.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.23.0/api_docs/index.html
- Documentation main page: https://quic.github.io/aimet-pages/index.html