Transformer Compression using SliceGPT #1052

Merged
merged 1 commit into from
Apr 9, 2024

Conversation

@shaahji shaahji commented Apr 4, 2024

Transformer Compression using SliceGPT

Adds a new pass that applies the SliceGPT compression technique to improve performance and reduce memory footprint.
Updates the phi2 example with a new workflow that uses the new pass.

Release Note: New SliceGPT pass that compresses transformer models to improve performance and reduce memory footprint.

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

@shaahji shaahji marked this pull request as draft April 4, 2024 16:47
@shaahji shaahji force-pushed the shaahji/slicegpt branch from 1b1a556 to 450a963 Compare April 4, 2024 21:21
@shaahji shaahji changed the title Introducing SliceGPT pass Transformer Compression using SliceGPT Apr 4, 2024
@shaahji shaahji marked this pull request as ready for review April 4, 2024 21:23
@shaahji shaahji force-pushed the shaahji/slicegpt branch from 450a963 to 28b0060 Compare April 4, 2024 21:59
docs/source/features/passes/pytorch.md (outdated, resolved)
examples/phi2/phi2_slicegpt.json (outdated, resolved)
@shaahji shaahji force-pushed the shaahji/slicegpt branch from 28b0060 to b3f02de Compare April 5, 2024 07:31
examples/phi2/phi2.py (fixed)
@shaahji shaahji force-pushed the shaahji/slicegpt branch 2 times, most recently from d069691 to 147545d Compare April 5, 2024 08:11
@shaahji shaahji force-pushed the shaahji/slicegpt branch 6 times, most recently from 31cd97a to 88556c5 Compare April 9, 2024 07:04
devang-ml previously approved these changes Apr 9, 2024
examples/phi2/phi2.py (outdated, resolved)
examples/phi2/phi2.py (outdated, resolved)
@staticmethod
def _default_config(accelerator_spec: AcceleratorSpec) -> Dict[str, PassConfigParam]:
    return {
        "calibration_data_config": PassConfigParam(
@jambayk jambayk Apr 9, 2024

Only data_config.name is used from this param. As we discussed offline previously, wouldn't it be simpler to have a string param called calibration_dataset_name or something similar?

The actual data config is not used at all. Having it here makes the config unnecessarily complicated. It also makes it appear that any data config is supported, when the tool supports only three data names.
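
For illustration, the string-parameter alternative suggested in this comment might look roughly like the sketch below. This is not the code in the PR: ParamSketch is a hypothetical stand-in for PassConfigParam, the function names are made up for the example, and the dataset names are placeholders, since the thread does not list the three supported names.

```python
from dataclasses import dataclass
from typing import Any, Dict


# Hypothetical stand-in for PassConfigParam, reduced to the fields this
# sketch needs. It is NOT the real Olive class.
@dataclass
class ParamSketch:
    type_: type
    required: bool = False
    default_value: Any = None
    description: str = ""


# Placeholder names only: the review thread says three dataset names are
# supported but does not say which ones.
SUPPORTED_CALIBRATION_DATASETS = ("dataset_a", "dataset_b", "dataset_c")


def default_config_sketch() -> Dict[str, ParamSketch]:
    """Expose a plain string parameter instead of a full data config."""
    return {
        "calibration_dataset_name": ParamSketch(
            type_=str,
            required=True,
            description=(
                "Name of the calibration dataset. Supported values: "
                + ", ".join(SUPPORTED_CALIBRATION_DATASETS)
            ),
        ),
    }


def validate_calibration_dataset_name(name: str) -> str:
    """Reject names the tool does not support, so the limit is explicit."""
    if name not in SUPPORTED_CALIBRATION_DATASETS:
        raise ValueError(
            f"Unsupported calibration dataset {name!r}; "
            f"expected one of {SUPPORTED_CALIBRATION_DATASETS}"
        )
    return name
```

A string parameter with explicit validation makes the supported set visible in the config itself, which is the concern raised above. The trade-off, as the reply below notes, is that a full data config leaves more room for later extension.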

Contributor Author

I will follow up on this once I have discussed it with Devang. I think a full data_config keeps the option open for future extension, but it can always be changed later.

@shaahji shaahji merged commit ffd1d8f into main Apr 9, 2024
33 checks passed
@shaahji shaahji deleted the shaahji/slicegpt branch April 9, 2024 22:56