Refactor Quantizer for reusing in QAT #9276
Conversation
nemo/export/quantize/quantizer.py (Outdated)
model_cfg.activations_checkpoint_method = None
model_cfg.activations_checkpoint_granularity = None
Can we remove this disabling of activation checkpointing as well? For the SFT model there is no accuracy regression even if we enable activation checkpointing, plus it reduces the GPU memory requirement.
I'm not sure about this; see also the related examples in #9276 (comment). Maybe you could ask internally?
So in the end it turns out that it's OK to remove setting these to None?

model_cfg.activations_checkpoint_method = None
model_cfg.activations_checkpoint_granularity = None
I hope it will not blow up in some large tests
Users always have the option to disable it in the yaml if they want to, but we should not restrict them from using it, as it reduces the GPU memory requirement.
I will set them to None in the yaml file so there is no difference from the previous version.
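Moving the defaults into yaml (rather than hardcoding them in quantizer.py) might look something like the sketch below. This is a hypothetical excerpt, not the actual config file; only the two activation-checkpointing keys come from the discussion above, the surrounding structure is assumed:

```yaml
# Hypothetical excerpt from the quantization yaml config.
# The two activation-checkpointing keys default to null here instead of
# being forced to None in quantizer.py, so users can override them
# (e.g. to re-enable checkpointing and reduce GPU memory usage).
model:
  activations_checkpoint_method: null
  activations_checkpoint_granularity: null
```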
Could you demonstrate/link how you want to use that refactored
Note that changes will need to be upstreamed to another project as well, here: https://github.com/NVIDIA/NeMo-Framework-Launcher/blob/main/launcher_scripts/conf/ptq/model/quantization.yaml. I think we'll need to chop this slightly
It's used in #9276
We'll need to do follow-up MRs to:
Signed-off-by: Keval Morabia <[email protected]>
@janekl should I rename
OK, good idea (the same applies to the Launcher, of course).
examples/nlp/language_modeling/conf/megatron_gpt_quantization.yaml
* Refactor Quantizer for reusing in QAT
* Address more reviewer comments
* update yaml config

Signed-off-by: Keval Morabia <[email protected]>
What does this PR do ?
This PR refactors the Quantizer class used for PTQ so it can be reused for QAT as well (follow-up PR).
Note that functionally it is the same as before.
Collection: N/A
Changelog
Usage
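To illustrate the kind of refactor this PR describes (shared quantization steps factored out of a PTQ-only entry point so a QAT flow can reuse them), here is a minimal standalone sketch. All class, method, and config names below are hypothetical stand-ins, not NeMo's actual API:

```python
# Sketch of the refactor pattern: a Quantizer whose shared setup step can
# be reused by both a PTQ pipeline and a QAT pipeline. Names are invented
# for illustration; the real NeMo Quantizer differs.
from dataclasses import dataclass


@dataclass
class QuantizationConfig:
    algorithm: str = "fp8"
    calib_samples: int = 512


class Quantizer:
    """Holds the config plus the steps shared by PTQ and QAT."""

    def __init__(self, cfg: QuantizationConfig):
        self.cfg = cfg

    def quantize(self, model: dict) -> dict:
        # Shared step: attach quantizers / calibrate (stubbed here by
        # tagging the model with the chosen algorithm).
        model = dict(model)
        model["quantized_with"] = self.cfg.algorithm
        return model

    def export(self, model: dict) -> dict:
        # Export step: produce a deployable artifact from the model.
        return {"engine": f"{model['quantized_with']}-engine"}


def run_ptq(model: dict, quantizer: Quantizer) -> dict:
    # PTQ: quantize, then export directly.
    return quantizer.export(quantizer.quantize(model))


def run_qat(model: dict, quantizer: Quantizer, train_fn) -> dict:
    # QAT reuses the same quantize() step, fine-tunes, then exports.
    model = quantizer.quantize(model)
    model = train_fn(model)
    return quantizer.export(model)
```

The point of the split is that `quantize()` carries no PTQ-specific assumptions, so the follow-up QAT PR can call it and insert a training step before export.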
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs to various areas.
Additional Information