3.x SQ supports calib_func for auto-tune #1812

Closed · wants to merge 4 commits
Conversation

violetch24 (Contributor)
Type of Change

SQ supports `calib_func` for auto-tune, so a dataloader is no longer required (a usage sketch follows the description below).

Description

Layer-wise & block-wise enable
Add ut check auto-tune
Check llm examples
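
A minimal sketch of what a user-supplied `calib_func` might look like; the input shape, vocabulary size, and step count are illustrative assumptions, not code from this PR:

```python
import torch

def calib_func(model: torch.nn.Module) -> None:
    """Drive a few representative batches through the model so the
    quantizer's observers can record activation statistics."""
    model.eval()
    with torch.no_grad():
        for _ in range(8):  # a handful of calibration steps (illustrative)
            token_ids = torch.randint(0, 32000, (1, 32))  # dummy LLM input
            model(token_ids)
```

With such a function, auto-tune can calibrate SmoothQuant by simply calling `calib_func(model)`, with no user-built dataloader involved.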

Expected Behavior & Potential Risk

How has this PR been tested?

Dependency Change?

@violetch24 violetch24 requested review from yintong-lu and xin3he May 23, 2024 12:47
@violetch24 violetch24 marked this pull request as ready for review May 23, 2024 12:47

🌩️ Required checks status: Pending 🟡

Groups summary

🟡 Code Scan Tests workflow
    Code-Scan: no_status
    Code-Scan (Bandit Code Scan): no_status
    Code-Scan (DocStyle Code Scan): no_status
    Code-Scan (Pylint Code Scan): no_status

These checks are required after the changes to neural_compressor/torch/algorithms/smooth_quant/utility.py.

🟡 Model Tests 3x workflow
    Model-Test-3x: no_status
    Model-Test-3x (Generate Report): no_status
    Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4): no_status
    Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb): no_status
    Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml): no_status

These checks are required after the changes to neural_compressor/torch/algorithms/smooth_quant/utility.py.

🟡 Unit Tests 3x-PyTorch workflow
    UT-3x-Torch: no_status
    UT-3x-Torch (Coverage Compare CollectDatafiles): no_status
    UT-3x-Torch (Unit Test 3x Torch): no_status
    UT-3x-Torch (Unit Test 3x Torch baseline): no_status

These checks are required after the changes to neural_compressor/torch/algorithms/smooth_quant/utility.py and test/3x/torch/quantization/test_smooth_quant.py.

Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

xin3he (Contributor) left a comment:

Please refer to #1810: capture input data from run_fn and rebuild CapturedDataloader for SQ calibration.
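
A minimal sketch of that capture-and-replay idea; this `CapturedDataloader` is a stand-in written under assumed semantics, not the actual class referenced in #1810:

```python
import torch

class CapturedDataloader:
    """Record every (args, kwargs) pair the model sees while run_fn
    executes, then replay the pairs as a dataloader-like iterable."""

    def __init__(self, model: torch.nn.Module, run_fn) -> None:
        self.samples = []
        # with_kwargs=True makes the pre-hook receive keyword arguments too
        handle = model.register_forward_pre_hook(self._capture, with_kwargs=True)
        try:
            run_fn(model)  # the user's calibration function drives the model
        finally:
            handle.remove()

    def _capture(self, module, args, kwargs):
        self.samples.append((args, kwargs))  # returning None leaves inputs unchanged

    def __iter__(self):
        return iter(self.samples)  # yields (args, kwargs) pairs for SQ calibration
```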

@@ -465,6 +473,9 @@ def forward_wrapper(model, input, device=torch.device("cpu")):  # pragma: no cov
             output = model(*input)
         except:
             output = model(input)
+    elif isinstance(input, zip):
+        for args, kwargs in input:
+            output = model(*args, **kwargs)
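
For context, a toy illustration (an assumption for illustration, not code from the PR) of an input that exercises the new `zip` branch above:

```python
import torch

model = torch.nn.Linear(4, 4)

args_list = [(torch.randn(1, 4),), (torch.randn(1, 4),)]  # positional args per sample
kwargs_list = [{}, {}]                                    # keyword args per sample
calib_input = zip(args_list, kwargs_list)

# Equivalent to what the elif branch does:
for args, kwargs in calib_input:
    output = model(*args, **kwargs)
```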
Contributor reply: good point


block_modules = {}
for key in self.block_names:
    block_modules[key] = get_module(self.model, key)
self._add_blockwise_observer(block_modules)

forward_wrapper(self.model, input, self.device)  # disable quant and get fp32 output
# get input args and kwargs for the first block, then do forward
total_block_args, total_block_kwargs = get_hidden_states(self.model, calib_sample_num, calib_func)
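
For reference, a minimal stand-in for the first-block capture that get_hidden_states performs here; the hook-based approach and the early-abort note are assumptions about the technique, not neural_compressor's actual implementation:

```python
import torch

def capture_first_block_inputs(model, first_block, calib_func):
    """Record every (args, kwargs) pair that reaches first_block while
    calib_func drives the model."""
    captured = []

    def pre_hook(module, args, kwargs):
        captured.append((args, kwargs))
        # A production version could raise an exception here to abort the
        # forward pass early, since only the first block's inputs are needed.

    handle = first_block.register_forward_pre_hook(pre_hook, with_kwargs=True)
    try:
        calib_func(model)
    finally:
        handle.remove()
    return captured  # per-sample (args, kwargs), like total_block_args/kwargs
```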
Contributor comment: It seems we fused all outputs into one instead of keeping them per input. Will this impact the final result? @yintong-lu

@violetch24 violetch24 closed this May 28, 2024
violetch24 (Contributor, Author) commented May 28, 2024:

See new design in #1821

@violetch24 violetch24 deleted the zixuan/3.x_sq_auto branch May 28, 2024 05:51