Fix regression loading dtype #34409

SunMarc · 2024-10-25T10:35:53Z

What does this PR do?

This PR fixes the regression that @BenjaminBossan found. It was caused by this PR. Basically, old_param was always None with the current logic, so the cast param = param.to(old_param.dtype) was never applied and the parameter did not get the correct dtype. This caused a dtype mismatch with torchao.

[...]
    return torch.nn.functional.linear(input_tensor, weight_tensor, bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: self and mat2 must have the same dtype, but got Float and Half
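
The mechanism described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual transformers loading code: FakeParam, find_old_param, and load_param are hypothetical names, and the dtype is a plain string instead of a torch dtype. It shows how walking a dotted parameter name with getattr(..., None) either finds the existing parameter (so the incoming one is cast to its dtype) or falls back to None (so the cast is skipped).

```python
from dataclasses import dataclass, replace
from types import SimpleNamespace
from typing import Any, Optional


@dataclass(frozen=True)
class FakeParam:
    """Hypothetical stand-in for a tensor: it only tracks a dtype label."""
    dtype: str

    def to(self, dtype: str) -> "FakeParam":
        # Mimics the shape of tensor.to(dtype): returns a copy with the new dtype.
        return replace(self, dtype=dtype)


def find_old_param(model: Any, param_name: str) -> Optional[Any]:
    """Walk a dotted name one attribute at a time; None if any step is missing."""
    old_param = model
    for split in param_name.split("."):
        old_param = getattr(old_param, split, None)
        if old_param is None:
            break
    return old_param


def load_param(model: Any, param_name: str, param: FakeParam) -> FakeParam:
    """Cast the incoming param to the existing param's dtype when one is found."""
    old_param = find_old_param(model, param_name)
    if old_param is not None:
        param = param.to(old_param.dtype)
    return param


# A toy "model" whose weight is stored in half precision (assumed for illustration).
model = SimpleNamespace(
    decoder=SimpleNamespace(fc1=SimpleNamespace(weight=FakeParam("float16")))
)
checkpoint_param = FakeParam("float32")

cast = load_param(model, "decoder.fc1.weight", checkpoint_param)
missing = load_param(model, "decoder.fc1.missing_key", checkpoint_param)
print(cast.dtype, missing.dtype)
```

When the lookup succeeds, the incoming parameter ends up with the existing dtype; when it returns None, the parameter keeps whatever dtype it arrived with. If the lookup returns None for every key, as in the regression, no parameter is ever cast, which is consistent with the Float and Half mismatch in the traceback.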

To reproduce

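import torch
from transformers import AutoModelForCausalLM, TorchAoConfig

# Quantize with torchao (int8 dynamic activations, int8 weights) at load time.
quantization_config = TorchAoConfig(quant_type="int8_dynamic_activation_int8_weight")
model_id = "facebook/opt-125m"
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quantization_config)
# Dummy token ids of shape (10, 1); without the fix, this forward pass raises the RuntimeError above.
inputs = torch.arange(10).view(-1, 1)
model(inputs)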

I verified that the hqq serialization tests and the new test I added both pass.

BenjaminBossan

I don't have enough background to judge if the fix is good, but the test should cover it, as the generate call itself would raise an error without the fix.

HuggingFaceDocBuilderDev · 2024-10-25T11:02:31Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

Thanks, noting this as it will go into a patch release.

ArthurZucker · 2024-10-28T10:34:37Z

src/transformers/modeling_utils.py

+            # We shouldn't hit the default value unless for quant methods like hqq that modifies expected_keys.
+            old_param = getattr(old_param, split, None)


😅 I knew it!
It was a bit too specific

* fix regression * add test for torchao * expected output * better fix

SunMarc and others added 5 commits October 25, 2024 12:05

fix regression

0b89517

add test for torchao

faf8205

expected output

9d59f16

better fix

0ffbf4c

Merge branch 'main' into fix-regression-loading

282cfd8

SunMarc requested review from ArthurZucker and BenjaminBossan October 25, 2024 10:36

BenjaminBossan approved these changes Oct 25, 2024

Merge branch 'main' into fix-regression-loading

46678f9

ArthurZucker approved these changes Oct 28, 2024

ArthurZucker merged commit 004530a into main Oct 29, 2024
23 of 27 checks passed

ArthurZucker deleted the fix-regression-loading branch October 29, 2024 10:41

ArthurZucker pushed a commit that referenced this pull request Oct 29, 2024

Fix regression loading dtype (#34409)

94ed13c

* fix regression * add test for torchao * expected output * better fix

maxjeblick mentioned this pull request Oct 29, 2024

Example code is not working illuin-tech/colpali#107

Closed

ManuelFay mentioned this pull request Oct 29, 2024

Fix dtype illuin-tech/colpali#118

Merged

2015aroras pushed a commit to 2015aroras/transformers that referenced this pull request Nov 15, 2024

Fix regression loading dtype (huggingface#34409)

615b94d

* fix regression * add test for torchao * expected output * better fix
