doc and example update for ITEX support (#1360)
lvliang-intel authored Oct 29, 2022
1 parent 74b3b38 commit 6ab5570
Showing 139 changed files with 4,376 additions and 90 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -17,8 +17,6 @@ Intel® Neural Compressor
Intel® Neural Compressor, formerly known as Intel® Low Precision Optimization Tool, is an open-source Python library that runs on Intel CPUs and GPUs. It delivers unified interfaces across multiple deep-learning frameworks for popular network compression technologies such as quantization, pruning, and knowledge distillation. The tool supports automatic accuracy-driven tuning strategies that help the user quickly find the best quantized model. It also implements different weight-pruning algorithms to generate a pruned model with a predefined sparsity goal, and supports knowledge distillation to transfer knowledge from a teacher model to a student model.
Intel® Neural Compressor is a critical AI software component in the [Intel® oneAPI AI Analytics Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).

> **Note:**
> GPU support is under development.

**Visit the Intel® Neural Compressor online document website at: <https://intel.github.io/neural-compressor>.**

@@ -107,6 +105,10 @@ Intel® Neural Compressor supports systems based on [Intel 64 architecture or co
* Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, and Icelake)
* Future Intel Xeon Scalable processor (code name Sapphire Rapids)

Intel® Neural Compressor supports the following Intel GPUs built on Intel's Xe architecture:

* [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/data-center-gpu/flex-series/overview.html)

### Validated Software Environment

* OS version: CentOS 8.4, Ubuntu 20.04
@@ -2,6 +2,7 @@ Step-by-Step
============

This document describes how to enable the TensorFlow SavedModel format with Intel® Neural Compressor, for performance only.
This example can run on Intel CPUs and GPUs.


## Prerequisite
@@ -17,14 +18,32 @@ pip install intel-tensorflow
```
> Note: TensorFlow >= 2.4.0 is supported.
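
A minimal sketch of how the version requirement in the note could be checked before tuning. The helper names `parse_version` and `is_supported` are hypothetical and are not part of Intel® Neural Compressor; in practice the string would come from `tensorflow.__version__`.

```python
import re

# Minimum TensorFlow version taken from the note above.
MIN_TF_VERSION = (2, 4, 0)


def parse_version(version: str) -> tuple:
    """Return the leading numeric (major, minor, patch) part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_supported(version: str) -> bool:
    """True when the version is at least MIN_TF_VERSION (hypothetical helper)."""
    return parse_version(version) >= MIN_TF_VERSION


# Example with a literal string; replace it with tensorflow.__version__ in a real environment.
supported = is_supported("2.10.0")
```

Comparing integer tuples avoids the string-comparison pitfall where "2.10.0" would sort before "2.4.0".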
### 3. Prepare Pretrained model
### 3. Install Intel Extension for Tensorflow if needed
#### Tuning the model on Intel GPU (Mandatory)
Intel Extension for TensorFlow must be installed to tune the model on Intel GPUs.

```shell
pip install --upgrade intel-extension-for-tensorflow[gpu]
```
For more details, follow the procedure in [install-gpu-drivers](https://github.com/intel-innersource/frameworks.ai.infrastructure.intel-extension-for-tensorflow.intel-extension-for-tensorflow/blob/master/docs/install/install_for_gpu.md#install-gpu-drivers).

#### Tuning the model on Intel CPU (Experimental)
Intel Extension for TensorFlow support for Intel CPUs is currently experimental. It is not required for tuning the model on Intel CPUs.

```shell
pip install --upgrade intel-extension-for-tensorflow[cpu]
```
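
The install rules above (the extension is mandatory on GPU and optional on CPU) can be sketched as a small pre-flight check. This is an illustration under assumptions: `itex_installed` and `can_tune` are hypothetical helpers, and the import name `intel_extension_for_tensorflow` is assumed to be the module that the pip package provides.

```python
import importlib.util


def itex_installed(module_name: str = "intel_extension_for_tensorflow") -> bool:
    """Check whether a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def can_tune(device: str, has_itex: bool) -> bool:
    """Apply the rule from this section: GPU tuning needs the extension, CPU tuning does not."""
    if device == "gpu":
        return has_itex
    if device == "cpu":
        return True
    raise ValueError(f"unknown device: {device!r}")


# Report what the current environment allows.
gpu_ready = can_tune("gpu", itex_installed())
cpu_ready = can_tune("cpu", itex_installed())
```

Using `find_spec` keeps the check cheap, since it does not load TensorFlow or the extension.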

### 4. Prepare Pretrained model
Download the model from tensorflow-hub.

image recognition
- [mobilenetv1](https://hub.tensorflow.google.cn/google/imagenet/mobilenet_v1_075_224/classification/5)
- [mobilenetv2](https://hub.tensorflow.google.cn/google/imagenet/mobilenet_v2_035_224/classification/5)
- [efficientnet_v2_b0](https://hub.tensorflow.google.cn/google/imagenet/efficientnet_v2_imagenet1k_b0/classification/2)

## Write Yaml config file
The examples directory contains mobilenet_v1.yaml, mobilenet_v2.yaml and efficientnet_v2_b0.yaml for tuning the model on Intel CPUs; in these files, 'framework' is set to 'tensorflow'. To run this example on Intel GPUs, set 'framework' to 'tensorflow_itex' and 'device' to 'gpu'. The files mobilenet_v1_itex.yaml, mobilenet_v2_itex.yaml and efficientnet_v2_b0_itex.yaml are prepared for the GPU case. Most items can be removed, keeping only the mandatory items for tuning. The yaml files also define a calibration dataloader and an evaluation field, which neural_compressor uses internally to create the evaluation function.
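
As a rough illustration of the difference between the CPU and GPU configs, the sketch below applies the two changes described above to a config already loaded into a Python dict. `to_itex_gpu` is a hypothetical helper, not part of Neural Compressor, and loading the yaml file itself is left out.

```python
import copy


def to_itex_gpu(config: dict) -> dict:
    """Return a copy of a CPU config with the two GPU settings applied."""
    gpu_config = copy.deepcopy(config)  # leave the caller's config untouched
    gpu_config.setdefault("model", {})["framework"] = "tensorflow_itex"
    gpu_config["device"] = "gpu"
    return gpu_config


# Only the fields relevant to the CPU/GPU switch are shown here.
cpu_config = {
    "model": {"name": "mobilenet_v1", "framework": "tensorflow"},
    "device": "cpu",
}
gpu_config = to_itex_gpu(cpu_config)
```

Everything else in the config (calibration, evaluation, tuning) is carried over unchanged, which matches how the `*_itex.yaml` files in this commit differ from their CPU counterparts.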

## Run Command
```shell
@@ -19,6 +19,8 @@ model: # mandatory. neural_compres
  name: efficientnet_v2_b0
  framework: tensorflow  # mandatory. supported values are tensorflow, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: cpu  # optional. default value is cpu, other value is gpu.

quantization:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
  calibration:
    sampling_size: 5, 10, 50, 100  # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
@@ -0,0 +1,89 @@
#
# Copyright (c) 2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

version: 1.0

model:  # mandatory. neural_compressor uses this model name and framework name to decide where to save tuning history and deploy yaml.
  name: efficientnet_v2_b0
  framework: tensorflow_itex  # mandatory. supported values are tensorflow, tensorflow_itex, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: gpu  # optional. set cpu if installed intel-extension-for-tensorflow[cpu], set gpu if installed intel-extension-for-tensorflow[gpu].
quantization:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
  calibration:
    sampling_size: 5, 10, 50, 100  # optional. default value is the size of whole dataset. used to set how many portions of calibration dataset is used. exclusive with iterations field.
    dataloader:
      dataset:
        ImagenetRaw:
          data_path: /path/to/calibration/dataset  # NOTE: modify to calibration dataset location if needed
          image_list: /path/to/calibration/label  # data file, record image_names and their labels
      transform:
        PaddedCenterCrop:
          size: 224
          crop_padding: 32
        Resize:
          size: 224
          interpolation: bicubic
        Normalize:
          mean: [123.675, 116.28, 103.53]
          std: [58.395, 57.12, 57.375]

evaluation:  # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
  accuracy:  # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
    metric:
      topk: 1  # built-in metrics are topk, map, f1, allow user to register new metric.
    dataloader:
      batch_size: 32
      dataset:
        ImagenetRaw:
          data_path: /path/to/evaluation/dataset  # NOTE: modify to evaluation dataset location if needed
          image_list: /path/to/evaluation/label  # data file, record image_names and their labels
      transform:
        PaddedCenterCrop:
          size: 224
          crop_padding: 32
        Resize:
          size: 224
          interpolation: bicubic
        Normalize:
          mean: [123.675, 116.28, 103.53]
          std: [58.395, 57.12, 57.375]
  performance:  # optional. used to benchmark performance of passing model.
    iteration: 100
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1
      dataset:
        ImagenetRaw:
          data_path: /path/to/evaluation/dataset  # NOTE: modify to evaluation dataset location if needed
          image_list: /path/to/evaluation/label  # data file, record image_names and their labels
      transform:
        PaddedCenterCrop:
          size: 224
          crop_padding: 32
        Resize:
          size: 224
          interpolation: bicubic
        Normalize:
          mean: [123.675, 116.28, 103.53]
          std: [58.395, 57.12, 57.375]

tuning:
  accuracy_criterion:
    relative: 0.01  # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.
  exit_policy:
    timeout: 0  # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with max_trials field to decide when to exit.
  random_seed: 9527  # optional. random seed for deterministic tuning.
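
To make the `accuracy_criterion` setting concrete, here is a minimal sketch of what a relative loss of 1% means for a higher-is-better metric such as top-1 accuracy. `meets_relative_criterion` is a hypothetical helper, and the exact comparison Neural Compressor performs internally is an assumption here.

```python
def meets_relative_criterion(baseline: float, quantized: float,
                             tolerance: float = 0.01) -> bool:
    """True when the accuracy drop, relative to the FP32 baseline, is within tolerance.

    Assumes a higher-is-better metric (for example top-1 accuracy).
    """
    if baseline <= 0:
        raise ValueError("baseline accuracy must be positive")
    relative_loss = (baseline - quantized) / baseline
    return relative_loss <= tolerance


# Illustrative numbers only, not measured results.
ok = meets_relative_criterion(baseline=0.76, quantized=0.755)       # loss is about 0.66%
too_lossy = meets_relative_criterion(baseline=0.76, quantized=0.75)  # loss is about 1.32%
```

Note that a relative criterion scales with the baseline: a drop of 0.01 in absolute accuracy exceeds a 1% relative tolerance whenever the baseline is below 1.0.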
@@ -17,6 +17,8 @@ model: # mandatory. used to specif
  name: mobilenet_v1
  framework: tensorflow  # mandatory. supported values are tensorflow, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: cpu  # optional. default value is cpu, other value is gpu.

quantization:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
  calibration:
    sampling_size: 20, 50  # optional. default value is 100. used to set how many samples should be used in calibration.
@@ -0,0 +1,73 @@
#
# Copyright (c) 2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

model:  # mandatory. used to specify model specific information.
  name: mobilenet_v1
  framework: tensorflow_itex  # mandatory. supported values are tensorflow, tensorflow_itex, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: gpu  # optional. set cpu if installed intel-extension-for-tensorflow[cpu], set gpu if installed intel-extension-for-tensorflow[gpu].

quantization:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
  calibration:
    sampling_size: 20, 50  # optional. default value is 100. used to set how many samples should be used in calibration.
    dataloader:
      batch_size: 10
      dataset:
        ImageRecord:
          root: /path/to/calibration/dataset  # NOTE: modify to calibration dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  model_wise:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
    activation:
      algorithm: minmax
    weight:
      granularity: per_channel

evaluation:  # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
  accuracy:  # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
    metric:
      topk: 1  # built-in metrics are topk, map, f1, allow user to register new metric.
    dataloader:
      batch_size: 32
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset  # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  performance:  # optional. used to benchmark performance of passing model.
    iteration: 100
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset  # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224

tuning:
  accuracy_criterion:
    relative: 0.01  # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.
  exit_policy:
    timeout: 0  # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with max_trials field to decide when to exit.
  random_seed: 9527  # optional. random seed for deterministic tuning.
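
The calibration block above pairs `sampling_size: 20, 50` with `batch_size: 10`. As a sketch of the arithmetic, each candidate sample count corresponds to a number of calibration batches. `calibration_iterations` is a hypothetical helper, and rounding up to whole batches is an assumption about how the sample count maps to iterations.

```python
import math


def calibration_iterations(sampling_size: str, batch_size: int) -> list:
    """Convert a comma-separated sampling_size value into batches per candidate."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    sizes = [int(part) for part in sampling_size.split(",") if part.strip()]
    # Round up so a partial final batch still counts as one iteration.
    return [math.ceil(size / batch_size) for size in sizes]


# Values taken from the calibration section of the config above.
iterations = calibration_iterations("20, 50", batch_size=10)
```

With the values from this config, the two candidates correspond to 2 and 5 batches of 10 images.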
@@ -17,6 +17,8 @@ model: # mandatory. used to specif
  name: mobilenet_v2
  framework: tensorflow  # mandatory. supported values are tensorflow, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: cpu  # optional. default value is cpu, other value is gpu.

quantization:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
  calibration:
    sampling_size: 20, 50  # optional. default value is 100. used to set how many samples should be used in calibration.
@@ -0,0 +1,82 @@
#
# Copyright (c) 2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

model:  # mandatory. used to specify model specific information.
  name: mobilenet_v2
  framework: tensorflow_itex  # mandatory. supported values are tensorflow, tensorflow_itex, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: gpu  # optional. set cpu if installed intel-extension-for-tensorflow[cpu], set gpu if installed intel-extension-for-tensorflow[gpu].

quantization:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
  calibration:
    sampling_size: 20, 50  # optional. default value is 100. used to set how many samples should be used in calibration.
    dataloader:
      batch_size: 10
      dataset:
        ImageRecord:
          root: /path/to/calibration/dataset  # NOTE: modify to calibration dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  model_wise:  # optional. tuning constraints on model-wise for advance user to reduce tuning space.
    activation:
      algorithm: minmax
    weight:
      granularity: per_channel

  op_wise: {
    'MobilenetV2/expanded_conv/depthwise/depthwise': {
      'activation': {'dtype': ['fp32']},
    },
    'MobilenetV2/Conv_1/Conv2D': {
      'activation': {'dtype': ['fp32']},
    }
  }

evaluation:  # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
  accuracy:  # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
    metric:
      topk: 1  # built-in metrics are topk, map, f1, allow user to register new metric.
    dataloader:
      batch_size: 32
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset  # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  performance:  # optional. used to benchmark performance of passing model.
    iteration: 100
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset  # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224

tuning:
  accuracy_criterion:
    relative: 0.01  # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.
  exit_policy:
    timeout: 0  # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with max_trials field to decide when to exit.
  random_seed: 9527  # optional. random seed for deterministic tuning.
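
The `op_wise` block in this config pins the activations of two named ops to fp32. A minimal sketch of building the same structure programmatically is shown below; `fp32_fallback` is a hypothetical helper, and only the dict shape is taken from the yaml above.

```python
def fp32_fallback(op_names):
    """Build an op_wise mapping that keeps the activations of the given ops in fp32."""
    return {name: {"activation": {"dtype": ["fp32"]}} for name in op_names}


# The two op names are the ones listed in the mobilenet_v2 config above.
op_wise = fp32_fallback([
    "MobilenetV2/expanded_conv/depthwise/depthwise",
    "MobilenetV2/Conv_1/Conv2D",
])
```

Generating the mapping from a list keeps the per-op entries consistent when more ops need to be excluded from quantization.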