
Refactor Quantizer for reusing in QAT #9276

Merged
4 commits merged into NVIDIA:main from kmorabia/update-quantizer-for-qat on Jun 14, 2024

Conversation

kevalmorabia97 (Collaborator) commented May 22, 2024

What does this PR do?

This PR refactors the Quantizer class used for PTQ (post-training quantization) so that it can also be reused for QAT (quantization-aware training) in a follow-up PR.

Note that it is functionally the same as before.
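
For illustration only, here is a minimal sketch of the kind of reuse this refactor enables: one quantizer object that drives PTQ (quantize, then export) and can also be kept around for QAT (quantize, then continue training before export). The names below (SimpleQuantizer, fake_quantize) are assumptions made for this sketch and are not the actual NeMo/modelopt API.

```python
# Illustrative sketch only: SimpleQuantizer and fake_quantize are hypothetical
# names, not the actual NeMo/modelopt API.
import torch
import torch.nn as nn


def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    # Symmetric per-tensor fake quantization: snap values to an int grid and back.
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale


class SimpleQuantizer:
    """Holds quantization settings so one object can serve both PTQ and QAT."""

    def __init__(self, num_bits: int = 8):
        self.num_bits = num_bits

    def quantize(self, model: nn.Module) -> nn.Module:
        # Replace linear weights with their fake-quantized values.
        for module in model.modules():
            if isinstance(module, nn.Linear):
                module.weight.data = fake_quantize(module.weight.data, self.num_bits)
        return model


# PTQ: quantize once and export the result.
# QAT (follow-up PR): quantize, then fine-tune the returned model so the weights
# adapt to the quantized grid, then export.
model = SimpleQuantizer(num_bits=8).quantize(nn.Sequential(nn.Linear(16, 16)))
```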

Collection: N/A

Changelog

  • N/A

Usage

  • N/A

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and re-add the label.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you have read and followed the Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g. Numba, Pynini, Apex, etc.)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation
  • Refactoring existing feature code

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs for various areas.

Additional Information

  • Related to # (issue)

@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/update-quantizer-for-qat branch 2 times, most recently from bcbfb74 to 6d01179 on May 22, 2024 10:56
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/update-quantizer-for-qat branch 2 times, most recently from 8742cc7 to 16ceb91 on May 22, 2024 11:37
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/update-quantizer-for-qat branch from 16ceb91 to 901c88f on May 22, 2024 11:55
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/update-quantizer-for-qat branch 2 times, most recently from 96fde3b to deba9b9 on May 22, 2024 12:50
@ericharper ericharper requested a review from janekl May 29, 2024 18:47
Comment on lines 110 to 111
model_cfg.activations_checkpoint_method = None
model_cfg.activations_checkpoint_granularity = None
Collaborator Author:

Can we remove this disabling of activation checkpointing as well? For the SFT model there is no accuracy regression even when activation checkpointing is enabled, and it reduces the GPU memory requirement.
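
(For background: activation checkpointing trades compute for memory by recomputing intermediate activations during the backward pass instead of storing them. Below is a tiny plain-PyTorch sketch, illustrative only and not the Megatron/NeMo mechanism that these config keys control.)

```python
# Illustrative only: plain PyTorch activation checkpointing, not the
# Megatron/NeMo mechanism behind activations_checkpoint_method.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

block = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))
x = torch.randn(8, 1024, requires_grad=True)

# Activations inside `block` are recomputed during backward instead of being
# kept, lowering peak GPU memory at the cost of extra forward compute.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```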

Collaborator:

I'm not sure about this; see also the related examples in #9276 (comment). Maybe you could ask internally?

Collaborator:

So in the end, it turns out it's OK to remove setting these to None?

model_cfg.activations_checkpoint_method = None
model_cfg.activations_checkpoint_granularity = None

I hope it will not blow up in some of the large tests.

Collaborator Author:

Users always have the option to disable it in the YAML if they want to, but we should not prevent them from using it, since it reduces the GPU memory requirement.

Collaborator Author:

I will set them to None in the YAML file so there is no difference from the previous version.
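
(For illustration, a minimal OmegaConf sketch of the resulting behaviour: the activation-checkpointing keys default to null in the YAML, and users can override them instead of being blocked by hard-coded Nones. The keys mirror the snippet above; the override value is just an example, not the exact NeMo config loading code.)

```python
# Illustrative only: shows the config-default pattern being discussed,
# not the exact NeMo config loading code.
from omegaconf import OmegaConf

# Defaults as they might appear in the YAML (activation checkpointing off).
model_cfg = OmegaConf.create(
    {
        "activations_checkpoint_method": None,
        "activations_checkpoint_granularity": None,
    }
)

# A user who wants activation checkpointing back (to save GPU memory) can
# override the YAML instead of being overruled in code.
user_override = OmegaConf.create({"activations_checkpoint_method": "uniform"})
model_cfg = OmegaConf.merge(model_cfg, user_override)
print(OmegaConf.to_yaml(model_cfg))
```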


janekl commented Jun 3, 2024

Could you demonstrate/link how you want to use the refactored Quantizer?


janekl commented Jun 3, 2024

Note that the changes will need to be upstreamed to another project as well, here: https://github.com/NVIDIA/NeMo-Framework-Launcher/blob/main/launcher_scripts/conf/ptq/model/quantization.yaml. I think we'll need to trim this slightly.

kevalmorabia97 (Collaborator Author):

Could you demonstrate/link how you want to use the refactored Quantizer?

It's used in #9276.

@kevalmorabia97 kevalmorabia97 requested a review from janekl June 5, 2024 07:03

janekl commented Jun 12, 2024

We'll need to do follow-up MRs to:

@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/update-quantizer-for-qat branch from 138732e to 7ed3d26 on June 12, 2024 19:38
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/update-quantizer-for-qat branch from 7ed3d26 to b33daad on June 12, 2024 19:42
kevalmorabia97 (Collaborator Author):

@janekl should I rename alpha to sq_alpha so it is clear that alpha is used only for SmoothQuant (SQ)?


janekl commented Jun 13, 2024

@janekl should I rename alpha to sq_alpha so it is clear that alpha is used only for SmoothQuant (SQ)?

OK, good idea (the same applies to the Launcher, of course).
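
(For illustration, a hedged sketch of what the rename might look like in a quantization config; the keys and values here are assumptions, not the exact NeMo or NeMo-Framework-Launcher quantization.yaml schema.)

```python
# Illustrative only: hypothetical quantization config showing the rename.
from omegaconf import OmegaConf

quant_cfg = OmegaConf.create(
    {
        "quantization": {
            "algorithm": "int8_sq",  # SmoothQuant
            # Renamed from the generic `alpha` so it is clear the value is
            # consumed only by the SmoothQuant (SQ) algorithm.
            "sq_alpha": 0.5,
        }
    }
)
print(OmegaConf.to_yaml(quant_cfg))
```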

Signed-off-by: Keval Morabia <[email protected]>
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/update-quantizer-for-qat branch from 7f9d4f2 to 28caca5 on June 13, 2024 07:10
@janekl janekl added Run CICD and removed Run CICD labels Jun 13, 2024
@janekl janekl merged commit 3f7e828 into NVIDIA:main Jun 14, 2024
109 checks passed
JesusPaz pushed a commit to JesusPaz/NeMo that referenced this pull request Jun 18, 2024
* Refactor Quantizer for reusing in QAT

Signed-off-by: Keval Morabia <[email protected]>

* Address more reviewer comments

Signed-off-by: Keval Morabia <[email protected]>

* update yaml config

Signed-off-by: Keval Morabia <[email protected]>

---------

Signed-off-by: Keval Morabia <[email protected]>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
@kevalmorabia97 kevalmorabia97 deleted the kmorabia/update-quantizer-for-qat branch July 15, 2024 22:10
@ko3n1g ko3n1g mentioned this pull request Jul 18, 2024
XuesongYang pushed a commit to paarthneekhara/NeMo that referenced this pull request Jan 18, 2025