remove torch_dtype override #25894

Merged on Aug 31, 2023 (3 commits)

SunMarc (Member) commented Aug 31, 2023

What does this PR do?

Fixes #25888. This PR removes the override of torch_dtype when GPTQ quantization is used. This gives users more flexibility when their model doesn't work in fp16.
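To illustrate the change described above, here is a minimal sketch of the dtype-resolution logic before and after. It is not the code from src/transformers/modeling_utils.py: the function names `resolve_dtype_before` and `resolve_dtype_after` and the constant `DEFAULT_GPTQ_DTYPE` are hypothetical, dtypes are represented as plain strings so the sketch runs without torch, and it assumes that fp16 remains the fallback when the user passes nothing (consistent with the review comment that the default behaviour is preserved).

```python
from typing import Optional

# Assumption: fp16 stays the default for GPTQ models when no dtype is requested.
DEFAULT_GPTQ_DTYPE = "float16"


def resolve_dtype_before(requested: Optional[str]) -> str:
    """Hypothetical model of the old behaviour: the requested dtype is overridden."""
    return DEFAULT_GPTQ_DTYPE


def resolve_dtype_after(requested: Optional[str]) -> str:
    """Hypothetical model of the new behaviour: fall back to fp16 only if nothing was requested."""
    if requested is None:
        return DEFAULT_GPTQ_DTYPE
    return requested


if __name__ == "__main__":
    # Compare both behaviours for an unset dtype and an explicit fp32 request.
    for requested in (None, "float32"):
        print(requested, resolve_dtype_before(requested), resolve_dtype_after(requested))
```

In practice, the user-facing effect would be that a `torch_dtype` argument passed to `from_pretrained` for a GPTQ-quantized model is respected rather than replaced with fp16.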

younesbelkada (Contributor) left a comment

Makes sense thanks! This should preserve the default behaviour so no need to update the slow tests!

ArthurZucker (Collaborator) left a comment

A lot better thanks!

Resolved review comment on src/transformers/modeling_utils.py (outdated)

HuggingFaceDocBuilderDev commented Aug 31, 2023

The documentation is not available anymore as the PR was closed or merged.

@SunMarc SunMarc merged commit ef10dbc into huggingface:main Aug 31, 2023
@SunMarc SunMarc deleted the torch_dtype_gptq branch August 31, 2023 21:38
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
* remove torch_dtype override

* style

* Update src/transformers/modeling_utils.py

Co-authored-by: Arthur <[email protected]>

---------

Co-authored-by: Arthur <[email protected]>
Linked issue: GPTQ Quantization via from_pretrained: why enforcing fp16?