[3x] add recommendation examples #1844

Merged
merged 16 commits into from
Jun 14, 2024
2 changes: 1 addition & 1 deletion docs/3x/PT_MXQuant.md
Original file line number Diff line number Diff line change
@@ -95,7 +95,7 @@ user_model = convert(model=user_model)

## Examples

- PyTorch [huggingface models](/examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/mx)
- PyTorch [huggingface models](/examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/mx_quant)


## Reference
99 changes: 53 additions & 46 deletions examples/.config/model_params_pytorch_3x.json
@@ -1,46 +1,53 @@
{
"pytorch": {
"gpt_j_ipex":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"gpt_j_ipex_sq":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"llama2_7b_ipex":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"llama2_7b_ipex_sq":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"opt_125m_ipex":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 8
},
"opt_125m_ipex_sq":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 8
}
}
}
{
"pytorch": {
"gpt_j_ipex":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"gpt_j_ipex_sq":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"llama2_7b_ipex":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"llama2_7b_ipex_sq":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 1
},
"opt_125m_ipex":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 8
},
"opt_125m_ipex_sq":{
"model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
"dataset_location": "",
"input_model": "",
"main_script": "run_clm_no_trainer.py",
"batch_size": 8
},
"dlrm_ipex": {
"model_src_dir": "recommendation/dlrm/static_quant/ipex",
"dataset_location": "/mnt/local_disk3/dataset/dlrm/dlrm/input",
"input_model": "/mnt/local_disk3/dataset/dlrm/dlrm/dlrm_weight/tb00_40M.pt",
"main_script": "dlrm_s_pytorch.py",
"batch_size": 16384
}
}
}
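
Each entry in the listing above carries the same five fields. As a minimal sketch of how such a file could be sanity-checked before a run, the snippet below reports entries that lack any of those fields. Treating the five fields as required is an assumption of this sketch rather than a documented schema, and `find_invalid_entries` and the `incomplete_entry` record are hypothetical names used only for illustration.

```python
import json

# Keys that every entry in the listing above carries. Treating them as
# required is an assumption of this sketch, not a documented schema.
REQUIRED_KEYS = {
    "model_src_dir",
    "dataset_location",
    "input_model",
    "main_script",
    "batch_size",
}


def find_invalid_entries(config_text, framework="pytorch"):
    """Map each model name to the sorted list of required keys it lacks."""
    models = json.loads(config_text)[framework]
    problems = {}
    for name, params in models.items():
        missing = sorted(REQUIRED_KEYS - set(params))
        if missing:
            problems[name] = missing
    return problems


# "dlrm_ipex" is copied from the listing; "incomplete_entry" is a
# hypothetical record added to show what a failure looks like.
example = """
{
  "pytorch": {
    "dlrm_ipex": {
      "model_src_dir": "recommendation/dlrm/static_quant/ipex",
      "dataset_location": "/mnt/local_disk3/dataset/dlrm/dlrm/input",
      "input_model": "/mnt/local_disk3/dataset/dlrm/dlrm/dlrm_weight/tb00_40M.pt",
      "main_script": "dlrm_s_pytorch.py",
      "batch_size": 16384
    },
    "incomplete_entry": {"main_script": "run_clm_no_trainer.py"}
  }
}
"""

if __name__ == "__main__":
    print(find_invalid_entries(example))
```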
@@ -1,6 +1,7 @@
# Run

## Run WOQ MX FP4 model

```shell
python run_clm_no_trainer.py --model [model_name_or_id] --quantize --accuracy --tasks lambada_openai --w_dtype fp4 --woq
```
@@ -0,0 +1,5 @@
# Code of Conduct

Facebook has adopted a Code of Conduct that we expect project participants to adhere to.
Please read the [full text](https://code.fb.com/codeofconduct/)
so that you can understand what actions will and will not be tolerated.
@@ -0,0 +1,36 @@
# Contributing to DLRM
We want to make contributing to this project as easy and transparent as
possible.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.

Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.

## Coding Style
* 4 spaces for indentation rather than tabs
* 80 character line length
* in general, please maintain a consistent style with the rest of the code

## License
By contributing to DLRM, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.
@@ -0,0 +1,21 @@
MIT License

Copyright (c) Facebook, Inc. and its affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1,90 @@
Step-by-Step
============

This document lists the steps to reproduce the PyTorch DLRM tuning results. The original DLRM README is available at [DLRM README](https://github.com/facebookresearch/dlrm/blob/master/README.md).

> **Note**
>
> Make sure your machine has more than 370G of memory to run DLRM.
> IPEX version >= 1.11 is required.
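
The memory requirement in the note can be checked before starting a long run. The following is a minimal sketch for Linux that reads the physical memory size through `os.sysconf`. It assumes "370G" means 370 GiB, and the helper names `total_memory_bytes` and `meets_memory_requirement` are hypothetical.

```python
import os


def total_memory_bytes():
    """Physical memory in bytes, read via sysconf (available on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def meets_memory_requirement(total_bytes, required_gib=370):
    """True if total_bytes is strictly more than required_gib GiB.

    Reading the note's "370G" as GiB is an assumption of this sketch.
    """
    return total_bytes > required_gib * 1024**3


if __name__ == "__main__":
    total = total_memory_bytes()
    print(f"{total / 1024**3:.1f} GiB installed, "
          f"enough for DLRM: {meets_memory_requirement(total)}")
```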

# Prerequisite

### 1. Environment

PyTorch 1.11 or higher is required, with the pytorch_fx backend.

```shell
# Install dependency
cd examples/3.x_api/pytorch/recommendation/dlrm/static_quant/ipex
pip install -r requirements.txt
```
> Note: See the validated PyTorch [versions](/docs/source/installation_guide.md#validated-software-environment).

### 2. Prepare Dataset

The code supports the [Criteo Terabyte Dataset](https://labs.criteo.com/2013/12/download-terabyte-click-logs/).

1. Download the raw data files day_0.gz, ..., day_23.gz and unzip them.
2. Specify the location of the unzipped text files day_0, ..., day_23 using --raw-data-file=<path/day> (the day number is appended automatically). See the "Run" commands for details.
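
Before launching a run, it can help to confirm that all 24 unzipped files are in place. The sketch below assumes the prefix passed as `<path/day>` ends in `day` and that the files are named `day_0` through `day_23`, as listed above. The helper names are hypothetical and are not part of the DLRM scripts.

```python
from pathlib import Path


def expected_day_files(raw_data_prefix, days=24):
    """Paths formed by appending _0 .. _(days-1) to the prefix.

    Assumes a prefix such as /data/criteo/day, giving day_0 .. day_23.
    """
    return [Path(f"{raw_data_prefix}_{day}") for day in range(days)]


def missing_day_files(raw_data_prefix, days=24):
    """Expected files that do not exist on disk."""
    return [p for p in expected_day_files(raw_data_prefix, days) if not p.is_file()]


if __name__ == "__main__":
    # "/path/of/dataset/day" is a placeholder, not a real location.
    missing = missing_day_files("/path/of/dataset/day")
    print(f"{len(missing)} of 24 day files are missing")
```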

### 3. Prepare pretrained model

Download the DLRM PyTorch weights (`tb00_40M.pt`, 90GB) from the
[MLPerf repo](https://github.com/mlcommons/inference/tree/master/recommendation/dlrm/pytorch#more-information-about-the-model-weights)

# Run
### Tune with INC
```shell
cd examples/3.x_api/pytorch/recommendation/dlrm/static_quant/ipex
bash run_quant.sh --input_model="/path/of/pretrained/model" --dataset_location="/path/of/dataset"
```

### Benchmark
```shell
bash run_benchmark.sh --input_model="/path/of/pretrained/model" --dataset_location="/path/of/dataset" --mode=accuracy --int8=true
```


Examples of enabling Intel® Neural Compressor
=========================

This tutorial shows how to enable the DLRM model with Intel® Neural Compressor.


### Code update

Update dlrm_s_pytorch.py as shown below:

```python
# evaluation
def eval_func(model):
    args.int8 = model.is_quantized
    with torch.no_grad():
        return inference(
            args,
            model,
            best_acc_test,
            best_auc_test,
            test_ld,
            trace=args.int8
        )

# calibration
def calib_fn(model):
    calib_number = 0  # counts batches, not individual samples
    for X_test, lS_o_test, lS_i_test, T in train_ld:
        if calib_number < 102400:
            model(X_test, lS_o_test, lS_i_test)
            calib_number += 1

from neural_compressor.torch.quantization import SmoothQuantConfig, autotune, TuningConfig
tune_config = TuningConfig(config_set=SmoothQuantConfig.get_config_set_for_tuning())
dlrm = autotune(
    dlrm,
    tune_config=tune_config,
    eval_fn=eval_func,
    run_fn=calib_fn,
)
dlrm.save("saved_results")
```
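
The calibration function above keeps iterating over the loader after the batch limit is reached, although it no longer calls the model. As a hedged alternative, the sketch below caps the loop with `itertools.islice` so that iteration stops at the limit. This is not the code from this PR: `run_calibration` is a hypothetical helper, and the demo uses a stand-in callable and plain tuples instead of the DLRM model and `train_ld`.

```python
from itertools import islice


def run_calibration(forward, batches, max_batches=102400):
    """Call forward on at most max_batches batches; return how many ran.

    Each batch is expected to be (dense, offsets, indices, target), matching
    the tuple unpacked in calib_fn above; the target is not passed on.
    """
    used = 0
    for dense, offsets, indices, _target in islice(batches, max_batches):
        forward(dense, offsets, indices)
        used += 1
    return used


if __name__ == "__main__":
    # Stand-in data and model for illustration only.
    fake_batches = [(i, i + 1, i + 2, 0) for i in range(10)]
    seen = []
    count = run_calibration(lambda x, o, i: seen.append(x), fake_batches, max_batches=4)
    print(count, seen)
```

Inside `calib_fn`, the same idea would amount to wrapping `train_ld` in `islice(train_ld, 102400)`, assuming the loader is an ordinary iterable.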