Intel® Neural Compressor v1.14 Release

@kevinintel kevinintel released this 20 Sep 12:57
  • Highlights
  • New Features
  • Improvements
  • Bug Fixes
  • Productivity
  • Examples

Highlights
We are excited to announce the release of Intel® Neural Compressor v1.14! This release introduces a new Pruning API for PyTorch that lets users select better combinations of criteria, pattern, and scheduler to achieve better pruning accuracy. It also supports Keras input for TensorFlow quantization, and self-distilled quantization for better quantization accuracy.

New Features

  • Pruning/Sparsity
    • Support new structured sparse patterns N in M and NxM (commit 6cec70)
    • Add pruning criteria snip and snip momentum (commit 6cec70)
    • Add iterative pruning and decay types (commit 6cec70)
  • Quantization
    • Support different Keras formats (h5, keras, keras saved model) as input and output of TensorFlow saved model (commit 5a6f09)
    • Enable Distillation for Quantization (commit 03f1f3 & e20c76)
  • GUI
    • Add mixed precision (commit 26e902)
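
The structured "N in M" pattern and the snip criterion listed above can be illustrated with a minimal, library-free sketch. It does not use the Neural Compressor API; the function names `snip_scores` and `n_in_m_mask` are hypothetical. It assumes that snip scores each weight by |weight × gradient| and that "N in M" keeps the N highest-scoring entries in every group of M consecutive weights. The exact convention in the release may differ.

```python
from typing import List


def snip_scores(weights: List[float], grads: List[float]) -> List[float]:
    """Assumed SNIP-style saliency: |weight * gradient| for each element."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def n_in_m_mask(scores: List[float], n: int, m: int) -> List[int]:
    """Keep the n highest-scoring entries in each group of m consecutive entries.

    Returns a 0/1 mask of the same length as `scores` (1 = keep, 0 = prune).
    """
    if m <= 0 or not 0 <= n <= m:
        raise ValueError("require 0 <= n <= m and m > 0")
    if len(scores) % m != 0:
        raise ValueError("length of scores must be a multiple of m")
    mask = [0] * len(scores)
    for start in range(0, len(scores), m):
        group = range(start, start + m)
        # Indices of the n largest scores inside this group.
        keep = sorted(group, key=lambda i: scores[i], reverse=True)[:n]
        for i in keep:
            mask[i] = 1
    return mask


# Toy data (illustrative values only, not taken from the release).
weights = [0.5, -0.1, 0.3, 0.05, -0.7, 0.2, 0.01, 0.4]
grads = [0.2, 0.9, -0.1, 0.4, 0.1, -0.5, 0.3, 0.05]

scores = snip_scores(weights, grads)
mask = n_in_m_mask(scores, n=2, m=4)  # 2 in 4 pattern
pruned = [w * k for w, k in zip(weights, mask)]
print(mask)  # [1, 1, 0, 0, 1, 1, 0, 0]
```

Every group of four weights ends up with exactly two survivors, which is the regularity that makes this pattern friendly to hardware acceleration.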

Improvements

  • Enhance tuning for Quantization with IPEX 1.12 to remove additional Quant/DeQuant (commit 192100)
  • Add upstream and download API for HuggingFace model hub, which can handle configuration files, tokenizer files and int8 model weights in the format of transformers (commit 46d945)
  • Align with the new Intel Extension for PyTorch (IPEX) API (commit cc368a)
  • Add loading from yaml and pt files for compatibility with the older PyTorch model saving format (commit a28705)

Bug Fixes

  • Quantization
    • Fix data type of ONNX Runtime quantization from fp64 to fp32 (commit cb7b48)
    • Fix MXNet issue with the default config (commit b75ff2)
  • Export
    • Fix export_to_onnx API (commit 158c7f)

Productivity

  • Support TensorFlow 2.10.0 (commit d6b6c9 & 8130e7)
  • Support ONNX Runtime 1.12 (commit 498ac4)
  • Export PyTorch QAT to ONNX (commit 029a63)
  • Add TensorFlow and PyTorch container tpp file (commit d245b5)

Examples

  • Add examples of downloading models from the HuggingFace model hub and upstreaming models to the hub (commit 46d945)
  • Add notebooks for Neural Coder (commit 105db7)
  • Add 2 IPEX examples: bert_large (squad), distilbert_base (squad) (commit 192100)
  • Add 2 DDP examples for Prune Once for All: roberta-base and Bert Base (commit 26a476)

Validated Configurations

  • Python 3.7, 3.8, 3.9, 3.10
  • CentOS 8.3 & Ubuntu 18.04 & Windows 10
  • TensorFlow 2.9, 2.10
  • Intel TensorFlow 2.7, 2.8, 2.9
  • PyTorch 1.10.0+cpu, 1.11.0+cpu, 1.12.0+cpu
  • IPEX 1.10.0, 1.11.0, 1.12.0
  • MXNet 1.7, 1.9
  • ONNX Runtime 1.10, 1.11, 1.12
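
As a convenience, the Python versions listed above can be checked from a script before installing. This is a minimal sketch using only the standard library; the helper name `is_validated_python` is hypothetical and not part of Neural Compressor, and running on an unlisted version does not necessarily mean the release will fail, only that it was not validated.

```python
import sys

# Python (major, minor) versions listed under Validated Configurations.
VALIDATED_PYTHON = {(3, 7), (3, 8), (3, 9), (3, 10)}


def is_validated_python(version_info=sys.version_info) -> bool:
    """Return True if the interpreter's major.minor version was validated."""
    return (version_info[0], version_info[1]) in VALIDATED_PYTHON


if not is_validated_python():
    print("Warning: this Python version is not in the validated list for v1.14")
```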