Add dnnl ep #903

Merged · 21 commits · Jul 18, 2023
Changes from all commits
2 changes: 2 additions & 0 deletions .azure-pipelines/scripts/codeScan/pyspelling/inc_dict.txt
@@ -495,6 +495,7 @@ dnf
dnn
dnnl
DNNL
DnnlExecutionProvider
Dockerfile
doclist
docstrings
@@ -563,6 +564,7 @@ enum
env
environ
ep
eps
eq
erf
Erf
14 changes: 12 additions & 2 deletions docs/source/mixed_precision.md
@@ -17,6 +17,7 @@ The recently launched 3rd Gen Intel® Xeon® Scalable processor (codenamed Coope
</p>

## Mixed Precision Support Matrix

<table class="center">
<thead>
<tr>
@@ -48,7 +49,7 @@ The recently launched 3rd Gen Intel® Xeon® Scalable processor (codenamed Coope
<td align="left">:x:</td>
</tr>
<tr>
<td rowspan="3" align="left">ONNX Runtime</td>
<td rowspan="4" align="left">ONNX Runtime</td>
<td align="left">CPUExecutionProvider</td>
<td align="left">MLAS</td>
<td align="left">"default"</td>
@@ -72,6 +73,14 @@ The recently launched 3rd Gen Intel® Xeon® Scalable processor (codenamed Coope
<td align="left">&#10004;</td>
<td align="left">&#10004;</td>
</tr>
<tr>
<td align="left">DnnlExecutionProvider</td>
<td align="left">OneDNN</td>
<td align="left">"onnxrt_dnnl_ep"</td>
<td align="left">cpu</td>
<td align="left">&#10004;</td>
<td align="left">:x:</td>
</tr>
<tr>
<td rowspan="2" align="left">Tensorflow</td>
<td align="left">Tensorflow</td>
@@ -162,4 +171,5 @@ converted_model.save('./path/to/save/')
- Quick started with [helloworld example](/examples/helloworld/tf_example3)
- PyTorch [ResNet18](/examples/pytorch/image_recognition/torchvision_models/mixed_precision/resnet18)
- IPEX [DistilBERT base](/examples/pytorch/nlp/huggingface_models/question-answering/mixed_precision/ipex)
- Tensorflow [ResNet50](/examples/tensorflow/image_recognition/tensorflow_models/resnet50_v1/mixed_precision)
- Tensorflow [ResNet50](/examples/tensorflow/image_recognition/tensorflow_models/resnet50_v1/mixed_precision)
- ONNX Runtime [Bert base](/examples/onnxrt/nlp/huggingface_model/text_classification/mix_precision)
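
For reference, a minimal sketch of selecting the new `"onnxrt_dnnl_ep"` backend through the mixed-precision API referenced in this document (the paths are placeholders, and passing the ONNX model by file path is an assumption):

```python
from neural_compressor import mix_precision
from neural_compressor.config import MixedPrecisionConfig

# Request the ONNX Runtime DNNL EP backend listed in the support matrix above;
# bf16 is the default target precision.
config = MixedPrecisionConfig(backend="onnxrt_dnnl_ep")

converted_model = mix_precision.fit("path/to/model.onnx", conf=config)
converted_model.save("./path/to/save/")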
8 changes: 7 additions & 1 deletion docs/source/quantization.md
@@ -452,7 +452,7 @@ Intel(R) Neural Compressor support multi-framework: PyTorch, Tensorflow, ONNX Ru
<td align="left">cpu</td>
</tr>
<tr>
<td rowspan="3" align="left">ONNX Runtime</td>
<td rowspan="4" align="left">ONNX Runtime</td>
<td align="left">CPUExecutionProvider</td>
<td align="left">MLAS</td>
<td align="left">"default"</td>
@@ -470,6 +470,12 @@ Intel(R) Neural Compressor support multi-framework: PyTorch, Tensorflow, ONNX Ru
<td align="left">"onnxrt_cuda_ep"</td>
<td align="left">gpu</td>
</tr>
<tr>
<td align="left">DnnlExecutionProvider</td>
<td align="left">OneDNN</td>
<td align="left">"onnxrt_dnnl_ep"</td>
<td align="left">cpu</td>
</tr>
<tr>
<td rowspan="2" align="left">Tensorflow</td>
<td align="left">Tensorflow</td>
@@ -0,0 +1,77 @@
Step-by-Step
============

This example loads a text classification model and confirms its accuracy and speed based on [GLUE data](https://gluebenchmark.com/).

# Prerequisite

## 1. Environment
```shell
git clone -b dnnl_ep --depth 1 https://github.com/intel/neural-compressor.git
cd neural-compressor
pip install -e ./

cd examples/onnxrt/nlp/huggingface_model/text_classification/mix_precision/
pip install -r requirements.txt
```
> Note: See the validated ONNX Runtime [versions](/docs/source/installation_guide.md#validated-software-environment).

## 2. Prepare Model

Supported model identifiers from [huggingface.co](https://huggingface.co/):

| Model Identifier |
|:-----------------------------------------------:|
| Intel/bert-base-uncased-mrpc |
| Intel/roberta-base-mrpc |
| Intel/xlm-roberta-base-mrpc |
| Intel/camembert-base-mrpc |
| distilbert-base-uncased-finetuned-sst-2-english |
| Alireza1044/albert-base-v2-sst2 |
| Intel/MiniLM-L12-H384-uncased-mrpc |
| philschmid/MiniLM-L6-H384-uncased-sst2 |
| bert-base-cased-finetuned-mrpc |
| Intel/electra-small-discriminator-mrpc |
| M-FAC/bert-mini-finetuned-mrpc |
| Intel/xlnet-base-cased-mrpc |
| Intel/bart-large-mrpc |

```bash
optimum-cli export onnx --model Intel/bert-base-uncased-mrpc --task text-classification <path to export onnx model>
```
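
To sanity-check the export, a minimal sketch that opens the resulting file with ONNX Runtime (the path below is a placeholder for your export location):

```python
import onnxruntime as ort

# Load the exported model with the default CPU provider and list its input names.
session = ort.InferenceSession("path/to/export/model.onnx", providers=["CPUExecutionProvider"])
print([inp.name for inp in session.get_inputs()])
```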

## 3. Prepare Dataset
Download the GLUE data with the `prepare_data.sh` script.

```shell
export GLUE_DIR=/path/to/glue_data
export TASK_NAME=MRPC # or SST

bash prepare_data.sh --data_dir=$GLUE_DIR --task_name=$TASK_NAME
```

# Run

If the hardware does not support bf16 instructions, set the flag below to force bf16 conversion (this workaround will be deprecated):

```shell
export FORCE_BF16=1
```

## 1. Mixed precision conversion only

```bash
# --input_model:  model path as *.onnx
# --output_model: model path as *.onnx
bash run.sh --input_model=path/to/model \
            --output_model=path/to/model_tune
```

## 2. Mixed precision conversion + accuracy evaluation

Please make sure DnnlExecutionProvider is in the list of available providers before running evaluation.
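
A quick way to check (a minimal sketch using the standard ONNX Runtime API):

```python
import onnxruntime as ort

# DnnlExecutionProvider is only listed when the installed onnxruntime build includes oneDNN support.
print(ort.get_available_providers())
assert "DnnlExecutionProvider" in ort.get_available_providers()
```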

```bash
# --input_model:      model path as *.onnx
# --output_model:     model path as *.onnx
# --dataset_location: path to GLUE data
# --batch_size:       optional
bash eval.sh --input_model=path/to/model \
             --output_model=path/to/model_tune \
             --dataset_location=path/to/glue/data \
             --batch_size=batch_size
```
@@ -0,0 +1,128 @@
#!/bin/bash
set -x
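# Parses --key=value arguments, maps the input model to its Hugging Face
# identifier and GLUE task, then runs main.py in evaluation mode.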

function main {
init_params "$@"
run_tuning
}

# init params
function init_params {
for var in "$@"
do
case $var in
--input_model=*)
input_model=$(echo $var |cut -f2 -d=)
;;
--output_model=*)
output_model=$(echo $var |cut -f2 -d=)
;;
--dataset_location=*)
dataset_location=$(echo $var |cut -f2 -d=)
;;
--batch_size=*)
batch_size=$(echo $var |cut -f2 -d=)
;;
esac
done

}

# run_tuning
function run_tuning {
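# Map the input model path to its Hugging Face identifier, GLUE task name,
# and architecture hyper-parameters (attention heads, hidden size) used by main.py.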

if [[ "${input_model}" =~ "bert-base-uncased" ]]; then
model_name_or_path="Intel/bert-base-uncased-mrpc"
TASK_NAME='mrpc'
num_heads=12
hidden_size=768
fi
if [[ "${input_model}" =~ "roberta-base" ]]; then
model_name_or_path="Intel/roberta-base-mrpc"
TASK_NAME='mrpc'
num_heads=12
hidden_size=768
fi
if [[ "${input_model}" =~ "xlm-roberta-base" ]]; then
model_name_or_path="Intel/xlm-roberta-base-mrpc"
TASK_NAME='mrpc'
num_heads=12
hidden_size=768
fi
if [[ "${input_model}" =~ "camembert-base" ]]; then
model_name_or_path="Intel/camembert-base-mrpc"
TASK_NAME='mrpc'
num_heads=12
hidden_size=768
fi
if [[ "${input_model}" =~ "distilbert-base" ]]; then
model_name_or_path="distilbert-base-uncased-finetuned-sst-2-english"
TASK_NAME='sst-2'
num_heads=12
hidden_size=768
fi
if [[ "${input_model}" =~ "albert-base" ]]; then
model_name_or_path="Alireza1044/albert-base-v2-sst2"
TASK_NAME='sst-2'
num_heads=12
hidden_size=768
fi
if [[ "${input_model}" =~ "MiniLM-L6" ]]; then
model_name_or_path="philschmid/MiniLM-L6-H384-uncased-sst2"
TASK_NAME='sst-2'
num_heads=12
hidden_size=384
fi
if [[ "${input_model}" =~ "MiniLM-L12" ]]; then
model_name_or_path="Intel/MiniLM-L12-H384-uncased-mrpc"
TASK_NAME='mrpc'
num_heads=12
hidden_size=384
fi
if [[ "${input_model}" =~ "bert-base-cased" ]]; then
model_name_or_path="bert-base-cased-finetuned-mrpc"
TASK_NAME='mrpc'
num_heads=12
hidden_size=384
fi
if [[ "${input_model}" =~ "xlnet-base-cased" ]]; then
model_name_or_path="Intel/xlnet-base-cased-mrpc"
TASK_NAME='mrpc'
num_heads=12
hidden_size=768
fi
if [[ "${input_model}" =~ "bert-mini" ]]; then
model_name_or_path="M-FAC/bert-mini-finetuned-mrpc"
TASK_NAME='mrpc'
num_heads=4
hidden_size=256
fi
if [[ "${input_model}" =~ "electra-small-discriminator" ]]; then
model_name_or_path="Intel/electra-small-discriminator-mrpc"
TASK_NAME='mrpc'
num_heads=4
hidden_size=256
fi
if [[ "${input_model}" =~ "bart" ]]; then
model_name_or_path="Intel/bart-large-mrpc"
TASK_NAME='mrpc'
num_heads=16
hidden_size=4096
fi
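
# Launch main.py in evaluation mode; batch_size falls back to 1 when not provided.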

python main.py \
--model_name_or_path ${model_name_or_path} \
--model_path ${input_model} \
--output_model ${output_model} \
--data_path ${dataset_location} \
--batch_size ${batch_size-1} \
--task ${TASK_NAME} \
--num_heads ${num_heads} \
--hidden_size ${hidden_size} \
--do_eval
}

main "$@"


