Add hf example for onnxrt backend (#1342)
1 parent a2db276, commit f4aeb5d
Showing 93 changed files with 2,760 additions and 24 deletions.
21 files renamed without changes.
43 changes: 43 additions & 0 deletions
...ples/onnxrt/nlp/huggingface_model/question_answering/quantization/ptq/README.md
@@ -0,0 +1,43 @@
# Evaluate performance of ONNX Runtime (Hugging Face Question Answering)

> ONNX Runtime quantization is under active development. Please use version 1.6.0 or later to get more quantization support.

This example loads a question answering model and confirms its accuracy and speed on the [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) task.
### Environment
Please use the latest versions of onnx and onnxruntime.
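
One way to confirm the environment before running the scripts is to print the installed package versions. The sketch below uses only the Python standard library. It assumes the `1.6.0+` note at the top of this README refers to `onnxruntime`; the helper names `parse_version`, `meets_minimum`, and `report` are hypothetical and not part of this example's scripts.

```python
from importlib import metadata

# Assumption: the "1.6.0+" note in this README applies to onnxruntime.
MINIMUM_ONNXRUNTIME = (1, 6, 0)


def parse_version(text):
    """Turn '1.16.3' or '1.6.0rc1' into a 3-tuple of ints, ignoring suffixes."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version_text, minimum=MINIMUM_ONNXRUNTIME):
    """Return True when version_text is at least the given minimum."""
    return parse_version(version_text) >= minimum


def report(packages=("onnx", "onnxruntime")):
    """Print the installed version of each package, flagging an old onnxruntime."""
    for name in packages:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
            continue
        note = ""
        if name == "onnxruntime" and not meets_minimum(installed):
            note = " (older than 1.6.0)"
        print(f"{name}: {installed}{note}")
```

Calling `report()` prints one line per package; a missing package is reported rather than raising.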

### Prepare dataset
Download the SQuAD dataset from the [SQuAD dataset link](https://rajpurkar.github.io/SQuAD-explorer/).
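
After downloading, it can help to check that the file has the expected nesting. The sketch below walks a SQuAD v1.1-style structure (`data` → `paragraphs` → `qas` → `answers`) and yields one record per question. The function names are hypothetical, and the tiny in-memory sample stands in for the real file, whose local path is not specified in this README.

```python
import json


def load_squad(path):
    """Read a SQuAD JSON file from disk (path is whatever you downloaded)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def iter_squad_examples(squad):
    """Yield one dict per question from a SQuAD v1.1-style structure."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield {
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": context,
                    "answers": [answer["text"] for answer in qa["answers"]],
                }


# Tiny made-up sample with the same nesting as the real file.
sample = {
    "version": "1.1",
    "data": [
        {
            "title": "Example",
            "paragraphs": [
                {
                    "context": "ONNX Runtime is an inference engine.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What is ONNX Runtime?",
                            "answers": [
                                {"text": "an inference engine", "answer_start": 16}
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}

examples = list(iter_squad_examples(sample))
print(len(examples), examples[0]["question"])
```

With a real download, `list(iter_squad_examples(load_squad(path)))` gives the same flat list of questions.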

### Prepare model

Supported model identifiers from [huggingface.co](https://huggingface.co/):

| Model Identifier |
|:-----------------------------------------------:|
| mrm8488/spanbert-finetuned-squadv1 |
| salti/bert-base-multilingual-cased-finetuned-squad |

```bash
# Replace the identifier with any other supported model identifier
python export.py --model_name_or_path=mrm8488/spanbert-finetuned-squadv1
```

### Quantization

Dynamic quantization:

```bash
# --input_model takes the path to the *.onnx model
bash run_tuning.sh --input_model=/path/to/model \
                   --output_model=/path/to/model_tune \
                   --config=qa_dynamic.yaml
```
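
For readers new to the term, the sketch below illustrates the general idea behind dynamic quantization: the scale and zero point are derived from the observed value range when the data arrives, rather than from a calibration pass. It is a pure-Python illustration under that assumption, not what `run_tuning.sh` or `qa_dynamic.yaml` do internally (their contents are not shown here); the function names are hypothetical.

```python
def quantize_uint8(values):
    """Map floats to uint8 with a scale and zero point taken from the data itself."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any scale works
    zero_point = round(-lo / scale)
    quantized = [
        min(255, max(0, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the uint8 codes."""
    return [(q - zero_point) * scale for q in quantized]


values = [-1.0, -0.25, 0.0, 0.6, 2.0]
codes, scale, zero_point = quantize_uint8(values)
restored = dequantize(codes, scale, zero_point)
for original, code, approx in zip(values, codes, restored):
    print(f"{original:+.4f} -> {code:3d} -> {approx:+.4f}")
```

Each restored value typically differs from the original by at most about half a quantization step (`scale / 2`), which is the accuracy cost that tuning tries to keep acceptable.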

### Benchmark

```bash
# --input_model takes the path to the *.onnx model; --mode is performance or accuracy
bash run_benchmark.sh --input_model=/path/to/model \
                      --config=qa_dynamic.yaml \
                      --mode=performance
```
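
What `run_benchmark.sh` measures in `performance` mode is not shown in this README. As a rough illustration of how latency is commonly measured, the sketch below times a callable after a few warm-up runs and reports mean latency, median latency, and throughput. The function name `measure_latency` and the stand-in workload are hypothetical; with a real model, `run_once` would wrap a single inference call.

```python
import statistics
import time


def measure_latency(run_once, warmup=5, iterations=50):
    """Time run_once() and return mean/median latency and throughput."""
    # Warm-up runs are excluded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    mean = statistics.mean(samples)
    return {
        "mean_ms": mean * 1000.0,
        "median_ms": statistics.median(samples) * 1000.0,
        "throughput_per_s": 1.0 / mean if mean > 0 else float("inf"),
    }


# Stand-in workload used only to make the sketch runnable.
stats = measure_latency(lambda: sum(range(10_000)), warmup=2, iterations=20)
print(stats)
```

Running the same harness on the original and the quantized model gives a like-for-like latency comparison.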