add Hyper-parameter Optimization algorithm (#786)
* add hpo

* add ut&example, update code

* fix

Signed-off-by: Guo, Heng <[email protected]>

* add requirement

Signed-off-by: Guo, Heng <[email protected]>

* pylint

* add readme

* requir

* modify readme

Signed-off-by: Guo, Heng <[email protected]>

* modify api

* logger

* modify example and ut

* add hpo api&config

Signed-off-by: Guo, Heng <[email protected]>

* fix

* modify api

Signed-off-by: Guo, Heng <[email protected]>

* modify readme

Signed-off-by: Guo, Heng <[email protected]>

* spell

Signed-off-by: Guo, Heng <[email protected]>

* sync readme

Signed-off-by: Guo, Heng <[email protected]>

* modify readme

Signed-off-by: Guo, Heng <[email protected]>

---------

Signed-off-by: Guo, Heng <[email protected]>
n1ck-guo authored Jul 27, 2023
1 parent 888b3bf commit 6613cfa
Showing 14 changed files with 1,525 additions and 6 deletions.
3 changes: 2 additions & 1 deletion .azure-pipelines/scripts/codeScan/pylint/pylint.sh
@@ -31,7 +31,8 @@ pip install torch==1.12.0 \
onnxruntime_extensions \
tf_slim \
transformers \
flask==2.1.3
flask==2.1.3 \
xgboost

if [ "${scan_module}" = "neural_solution" ]; then
cd /neural-compressor
4 changes: 4 additions & 0 deletions .azure-pipelines/scripts/codeScan/pyspelling/inc_dict.txt
@@ -2698,3 +2698,7 @@ Vanhoucke
ONNXCommunityMeetup
luYBWA
pQ
xgb
xgboost
hpo
HPO
9 changes: 7 additions & 2 deletions docs/source/pruning.md
@@ -52,8 +52,9 @@ Pruning

4. [Sparse Model Deployment](#sparse-model-deployment)

5. [Pruning With HPO](#pruning-with-hyperparameter-optimization)

5. [Reference](#reference)
6. [Reference](#reference)


## Introduction
@@ -104,7 +105,7 @@ Pruning patterns defines the rules of pruned weights' arrangements in space. Int
</div>


- Multi-head Attention Pruning (Work in progress)
- Multi-head Attention Pruning

The multi-head attention mechanism boosts transformer models' capability to analyze contextual information. However, different heads contribute unevenly to the final output, and in most situations a number of heads can be removed without causing an accuracy drop. Head pruning can be applied to a wide range of models, including BERT, GPT, and other large language models. **We have not supported it in pruning yet, but an experimental feature is provided in Model Auto Slim**; a minimal head-pruning sketch follows below. Please refer to the [multi-head attention auto slim examples](https://github.com/intel/neural-compressor/blob/master/examples/pytorch/nlp/huggingface_models/question-answering/model_slim).
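
As a purely illustrative aside (not the Model Auto Slim API), the sketch below shows the mechanics of head pruning using the Hugging Face Transformers `prune_heads` utility; the model name and the layer/head indices are arbitrary placeholders.

```python
# Illustrative head pruning with Hugging Face Transformers (not Model Auto Slim).
# The layer/head indices are arbitrary placeholders, not a recommendation.
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

# {layer_index: [head indices to remove]} -- hypothetical choices for demonstration.
model.prune_heads({0: [0, 2], 5: [1]})

# Layer 0 now keeps 10 of its original 12 attention heads.
print(model.encoder.layer[0].attention.self.num_attention_heads)
```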

@@ -386,6 +387,10 @@ Please refer to [pruning examples](../../examples/README.md#Pruning-1) for more

Particular hardware/software such as [Intel Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) is required to obtain inference speed and footprint optimizations for most sparse models. However, using [model slim](#click) for some special structures can deliver significant inference speed improvements and footprint reduction without this post-pruning deployment step. In other words, you can achieve model acceleration directly within your training framework (PyTorch, etc.).

## Pruning with Hyperparameter Optimization
Intel® Neural Compressor currently supports grid search, random search, Bayesian optimization, and XGBoost search algorithms for pruning with HPO; a conceptual sketch of two of these strategies appears below.
For more details, please refer to the [HPO document](../../neural_compressor/compression/hpo/README.md).
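
The snippet below is a conceptual sketch only, not the Intel® Neural Compressor HPO API: it assumes a hypothetical `evaluate_pruned_model` objective and illustrates how random search and an XGBoost-based surrogate search could explore a single pruning hyperparameter.

```python
# Conceptual sketch of two of the search strategies named above (random search and
# an XGBoost surrogate search) over a single pruning hyperparameter. This is NOT
# the Intel Neural Compressor HPO API; `evaluate_pruned_model` is a hypothetical
# stand-in for "prune at this sparsity, fine-tune, then measure accuracy".
import random

import numpy as np
import xgboost as xgb

def evaluate_pruned_model(sparsity: float) -> float:
    # Hypothetical objective: accuracy degrades once sparsity exceeds ~0.7.
    return 0.85 - 0.3 * max(0.0, sparsity - 0.7) ** 2

# Random search: sample configurations uniformly and keep the best one.
trials = []
for _ in range(20):
    sparsity = random.uniform(0.1, 0.95)
    trials.append((sparsity, evaluate_pruned_model(sparsity)))
best_sparsity, best_accuracy = max(trials, key=lambda t: t[1])

# XGBoost search: fit a surrogate on the trials so far, then pick the candidate
# the surrogate predicts to be best before spending another real evaluation.
X = np.array([[s] for s, _ in trials])
y = np.array([acc for _, acc in trials])
surrogate = xgb.XGBRegressor(n_estimators=50).fit(X, y)

candidates = np.linspace(0.1, 0.95, 200).reshape(-1, 1)
next_sparsity = float(candidates[surrogate.predict(candidates).argmax(), 0])

print(f"best so far: sparsity={best_sparsity:.2f}, accuracy={best_accuracy:.3f}")
print(f"surrogate suggests trying sparsity={next_sparsity:.2f} next")
```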

## Reference

[1] Namhoon Lee, Thalaiyasingam Ajanthan, and Philip Torr. SNIP: Single-shot network pruning based on connection sensitivity. In International Conference on Learning Representations, 2019.
@@ -0,0 +1,29 @@
Step-by-Step
============

This document presents step-by-step instructions for pruning Hugging Face models with the HPO feature of Intel® Neural Compressor.

# Prerequisite
## 1. Environment
Python 3.6 or a higher version is recommended.
The required packages are listed in `requirements.txt`; install them as follows:
```shell
cd examples/pytorch/nlp/huggingface_models/text-classification/pruning/hpo/
pip install -r requirements.txt
```
## 2. Prepare Dataset

The dataset will be downloaded automatically from the Hugging Face datasets Hub; a minimal loading sketch is shown below.
See more about loading a [Hugging Face dataset](https://huggingface.co/docs/datasets/loading_datasets.html).
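
As a minimal sketch (the example script downloads and preprocesses the data itself), loading the GLUE MRPC data used by the run command in the next section could look like this:

```python
# Minimal sketch of loading the GLUE MRPC dataset from the Hugging Face Hub.
# The example script handles this automatically; this only shows the underlying call.
from datasets import load_dataset

raw_datasets = load_dataset("glue", "mrpc")
print(raw_datasets)              # DatasetDict with train/validation/test splits
print(raw_datasets["train"][0])  # sentence1, sentence2, label, idx fields
```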

# Run
To get the tuned model and its accuracy:
```shell
python run_glue_no_trainer.py \
--model_name_or_path M-FAC/bert-mini-finetuned-mrpc \
--task_name mrpc \
--per_device_eval_batch_size 18 \
--per_device_train_batch_size 18 \
--do_prune
```
@@ -0,0 +1,12 @@
accelerate
datasets
sentencepiece
scipy
scikit-learn
protobuf
torch
evaluate
transformers
tqdm
xgboost
