Unify the woq config weight_dtype for int4 and fp4 on different devices #1594

PenghuiCheng · 2024-06-05T06:19:43Z

Type of Change

bug fix
No API changed

Description

Unify the weight_dtype setting in the weight-only quantization (WOQ) config for int4 and fp4 across different devices.

How has this PR been tested?

Tested locally.

Signed-off-by: Cheng, Penghui <[email protected]>

github-actions · 2024-06-05T06:20:11Z

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Format Scan Tests workflow

Check ID	Status
format-scan (pylint)	success	✅
format-scan (bandit)	success	✅
format-scan (cloc)	success	✅
format-scan (cpplint)	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py.

🟢 Optimize Unit Test workflow

Check ID	Status
optimize-unit-test-baseline	success	✅
optimize-unit-test-PR-test	success	✅
Genreate-OptimizeUT-Report	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py, tests/CI/test_weight_only.py, tests/CI/test_weight_only_gpu.py.

🟢 NeuralChat Unit Test

Check ID	Status
neuralchat-unit-test-baseline	success	✅
neuralchat-unit-test-PR-test	success	✅
Generate-NeuralChat-Report	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py.

🟢 Engine Unit Test workflow

Check ID	Status
engine-unit-test-baseline	success	✅
engine-unit-test-PR-test	success	✅
Genreate-Engine-Report	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py.

🟢 Chat Bot Test workflow

Check ID	Status	Error details
call-inference-llama-2-7b-chat-hf / inference test	success		✅
call-inference-mpt-7b-chat / inference test	success		✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py.

Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact VincyZhang or XuehaoSun for help.

Unify the woq config weight_dtype for int4 and fp4 on different devices

43d2e34

Signed-off-by: Cheng, Penghui <[email protected]>

PenghuiCheng requested a review from VincyZhang as a code owner June 5, 2024 06:19

PenghuiCheng requested review from changwangss and XuehaoSun June 5, 2024 06:19

PenghuiCheng mentioned this pull request Jun 5, 2024

Cannot finish FP4 quantization: RuntimeError: Qbits: only support Integer WOQ in PACKQ #1577

Closed

changwangss approved these changes Jun 5, 2024

kevinintel approved these changes Jun 7, 2024

kevinintel merged commit 8722443 into main Jun 7, 2024
21 checks passed

kevinintel deleted the penghuic/woq_config_unify branch June 7, 2024 08:42
