HPU only release binary #302
Signed-off-by: yiliu30 <[email protected]>
test/test_auto_round_hpu_only.py
@@ -0,0 +1,3 @@
def test_import():
    from auto_round import AutoRound
    from auto_round.export.export_to_itrex.export import export_to_itrex
@WeiweiZhang1 The import is very confusing; better to rename one of them.
Signed-off-by: Sun, Xuehao <[email protected]>
auto_round/export/__init__.py
from .export_to_autoround.export import save_quantized_as_autoround
from .export_to_awq.export import save_quantized_as_autoawq
from auto_round.utils import LazyImport
save_quantized_as_autogptq = LazyImport("auto_round.export.export_to_autogptq.export.save_quantized_as_autogptq")
This looks a little ugly to others reading the code. How about first exporting them in auto_round.export, so that we can just call auto_round.export.save_xxx?
Updated by registering all export formats in __init__.py and only invoking the save function when necessary.
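A minimal sketch of the register-then-resolve idea described above, not the code from this PR. The registry name `EXPORT_FORMATS` and the helpers `resolve` and `save_quantized` are hypothetical, and standard-library functions stand in for the real export modules so the sketch runs anywhere:

```python
import importlib

# Hypothetical registry: format name -> dotted path "package.module.function".
# Stdlib targets stand in for the real export functions (an assumption made
# only so that this sketch is runnable without auto_round installed).
EXPORT_FORMATS = {
    "json_text": "json.dumps",
    "b64_bytes": "base64.b64encode",
}


def resolve(dotted_path):
    """Import the module part of `dotted_path` and return the named attribute."""
    module_name, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def save_quantized(fmt, *args, **kwargs):
    """Look up `fmt` in the registry and call its save function.

    The target module is imported here, at call time, so registering a
    format never imports its (possibly heavy or unavailable) dependencies.
    """
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"Unknown export format: {fmt!r}")
    return resolve(EXPORT_FORMATS[fmt])(*args, **kwargs)
```

With this layout, an HPU-only install that lacks one backend would presumably fail only when that specific format is requested, not when the package is imported.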
auto_round/export/__init__.py
from auto_round.utils import LazyImport
save_quantized_as_autogptq = LazyImport("auto_round.export.export_to_autogptq.export.save_quantized_as_autogptq")
save_quantized_as_itrex = LazyImport("auto_round.export.export_to_itrex.export.save_quantized_as_itrex")
QuantConfig = LazyImport("auto_round.export.export_to_itrex.config.QuantConfig")
Better to rename QuantConfig to itrex_quantconfig.
Since it's specific to the export format, remove it from __init__.py and use it with the full path: from auto_round.export.export_to_itrex import QuantConfig.
auto_round/auto_quantizer.py
from auto_round.utils import LazyImport
qlinear_qbits = LazyImport("auto_round_extension.qbits.qlinear_qbits")
qlinear_qbits_gptq = LazyImport("auto_round_extension.qbits.qlinear_qbits_gptq")
qlinear_ipex_gptq = LazyImport("auto_round_extension.ipex.qlinear_ipex_gptq")
We have no unit test for generation, for several reasons. Could you help add a test to make sure the change in this file is OK?
Updated to import them only when needed.
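For illustration, "import only when needed" can be sketched as a function-local import that runs when a backend is actually selected. This is an assumption about the general pattern, not the code in auto_quantizer.py. The helper name `load_backend_module` and the mapping are hypothetical, and stdlib modules stand in for the optional extension modules:

```python
import importlib

# Hypothetical mapping from a backend key to the module that implements it.
# Stdlib modules are placeholders for optional extensions (an assumption).
_BACKEND_MODULES = {
    "qbits": "fractions",
    "ipex": "decimal",
}


def load_backend_module(backend):
    """Import the module for `backend` on first use, not at package import time."""
    try:
        module_name = _BACKEND_MODULES[backend]
    except KeyError:
        raise ValueError(f"Unsupported backend: {backend!r}") from None
    try:
        # The import happens here, so a missing optional dependency only
        # affects callers that actually request this backend.
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"Backend {backend!r} requires the module {module_name!r}"
        ) from err
```

The trade-off is that a missing dependency surfaces later, at first use, so a test that exercises each backend path is still useful.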
Local test results:
python main.py --format auto_round:gptq --task wikitext
python main.py --format auto_gptq --task wikitext
python main.py --format auto_round --task wikitext
| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|--------|------:|------|-----:|---------------|---|------:|---|------|
|wikitext| 2|none | 0|bits_per_byte |↓ | 0.9433|± |N/A |
| | |none | 0|byte_perplexity|↓ | 1.9229|± |N/A |
| | |none | 0|word_perplexity|↓ |32.9958|± |N/A |
| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|--------|------:|------|-----:|---------------|---|------:|---|------|
|wikitext| 2|none | 0|bits_per_byte |↓ | 0.9433|± |N/A |
| | |none | 0|byte_perplexity|↓ | 1.9229|± |N/A |
| | |none | 0|word_perplexity|↓ |32.9958|± |N/A |
| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|--------|------:|------|-----:|---------------|---|------:|---|------|
|wikitext| 2|none | 0|bits_per_byte |↓ | 0.9433|± |N/A |
| | |none | 0|byte_perplexity|↓ | 1.9229|± |N/A |
| | |none | 0|word_perplexity|↓ |32.9958|± |N/A |
Signed-off-by: Sun, Xuehao <[email protected]>
/azp run Unit-Test-HPU-AutoRound
Azure Pipelines successfully started running 1 pipeline(s).
The expected behavior is to build a binary for HPU-only, as we did for INC 3x intel/neural-compressor#1336.
Usage
pip install auto-round[hpu] # It will install auto-round with `requirements-hpu.txt`
python setup.py install hpu # Within the Gaudi docker, plain `python setup.py install` installs the HPU-only version by default
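One way a setup script could honor an extra `hpu` argument is to strip it from the argument list before setuptools parses it. The sketch below assumes that approach and is not taken from this PR. The helpers `pop_flag` and `select_requirements` are hypothetical, and `requirements.txt` as the default file name is an assumption (only `requirements-hpu.txt` is named above):

```python
def pop_flag(argv, flag):
    """Remove every occurrence of `flag` from argv.

    Returns (found, remaining_args). setuptools would reject an unknown
    positional such as "hpu", so it has to be removed before setup() runs.
    """
    remaining = [arg for arg in argv if arg != flag]
    return len(remaining) != len(argv), remaining


def select_requirements(hpu_only):
    """Pick the requirements file for the selected build flavor."""
    # "requirements.txt" as the default is an assumption for illustration.
    return "requirements-hpu.txt" if hpu_only else "requirements.txt"
```

In a real setup.py, the result of `pop_flag(sys.argv, "hpu")` would be assigned back to `sys.argv` before calling `setup()`.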
cc @thuang6
Test scope:
To ensure that new PRs don't break this version, we need to define a set of smoke tests for it. Additionally, we need a runner that installs the HPU-only version and runs these tests.
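A possible shape for one such smoke test, sketched under assumptions: run the import in a fresh interpreter and report which optional modules it pulled in. The helper name `modules_loaded_by` is hypothetical, and the list of modules to watch is left to the maintainers:

```python
import subprocess
import sys


def modules_loaded_by(import_stmt, candidates):
    """Run `import_stmt` in a fresh interpreter and return the subset of
    `candidates` that ended up in sys.modules afterwards."""
    code = (
        "import sys\n"
        f"{import_stmt}\n"
        f"print(','.join(m for m in {list(candidates)!r} if m in sys.modules))"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # a failing import fails the smoke test
    )
    output = result.stdout.strip()
    return set(output.split(",")) if output else set()
```

On the HPU-only runner, a test could call this with `"from auto_round import AutoRound"` and the names of backends that must stay unloaded, then assert that the returned set is empty.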