Fix llama sin_cached/cos_cached backward compatibility #29299
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@@ -100,6 +100,21 @@ def __init__(self, dim, max_position_embeddings=2048, base=10000, device=None):
        inv_freq = 1.0 / (self.base ** (torch.arange(0, self.dim, 2, dtype=torch.int64).float().to(device) / self.dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

        # TODO: Remove in 4.40.
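For readers following along, here is a minimal sketch of the kind of deprecation shim this hunk is adding: the old attribute names stay reachable as properties that warn before returning the underlying cache, so the TODO marks when those properties would be dropped. The mixin name, warning text, and use of `warnings` (rather than transformers' own logger) are illustrative assumptions, not the PR's literal code.

```python
import warnings


class _RotaryCacheDeprecationMixin:
    """Hypothetical mixin; assumes `self._sin_cached` / `self._cos_cached` exist on the module."""

    @property
    def sin_cached(self):
        # Warn on access, then fall back to the private cache so old call sites keep working.
        warnings.warn(
            "`sin_cached` is deprecated and will be removed; use the tensors returned by `forward` instead.",
            FutureWarning,
        )
        return self._sin_cached

    @property
    def cos_cached(self):
        warnings.warn(
            "`cos_cached` is deprecated and will be removed; use the tensors returned by `forward` instead.",
            FutureWarning,
        )
        return self._cos_cached
```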
Why 4.40 here? This kind of version-dependent removal would be for deprecating a feature, but AFAICT the PR doesn't include an implemented fix which replaces this.
@amyeroberts I just followed 7d312ad: the sin_cached attribute will be removed in 4.40.
cc @gante
Huh, OK. Won't this mean things still break, though?
I don't think we can remove them, no 💔
@fxmarty The extent of the fix may depend on the following question: are the downstream libraries broken because a) the tensors are missing, or b) the tensors are missing AND their values are needed? The PR as it stands would fix a), but it probably wouldn't fix b). Full story of how this came to be:
Note to ourselves: non-permanent buffers can't be treated as common variables for deprecation purposes 😬
#29198 will add them at init time.
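For illustration, a hedged sketch of what "adding them at init time" could look like: pre-compute the cos/sin caches from `inv_freq` in `__init__` and register them as non-persistent buffers, so both the attributes and their values exist (case b) above). The class name, dtype handling, and defaults are assumptions, not the actual code from #29198.

```python
import torch
from torch import nn


class RotaryEmbeddingInitCacheSketch(nn.Module):
    """Hypothetical module mirroring the pre-4.38 behaviour of caching cos/sin at construction."""

    def __init__(self, dim, max_position_embeddings=2048, base=10000, device=None):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.int64).float().to(device) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)
        # Compute the caches up front so the attributes exist *and* hold the expected values.
        t = torch.arange(max_position_embeddings, device=device, dtype=torch.float32)
        freqs = torch.outer(t, inv_freq)          # (max_position_embeddings, dim // 2)
        emb = torch.cat((freqs, freqs), dim=-1)   # (max_position_embeddings, dim)
        self.register_buffer("_cos_cached", emb.cos(), persistent=False)
        self.register_buffer("_sin_cached", emb.sin(), persistent=False)
```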
Let's not duplicate the work
Was not aware #29198 was a fix for that, nice! Note that with
Feel free to comment over there
The _sin_cached & _cos_cached are never set in the init (compare to https://github.com/huggingface/transformers/blob/v4.37.2/src/transformers/models/llama/modeling_llama.py#L134-L136), which yields errors in external packages as backward compatibility is broken (e.g. in https://github.com/AutoGPTQ/AutoGPTQ/blob/6b55300dd83326504ee6e02b730fa4451adfa479/auto_gptq/modeling/_utils.py#L95-L96). IMO this should be in a patch release.
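To make the breakage concrete, here is a hypothetical helper in the style of the linked AutoGPTQ utility: it reads the cached tensors directly off the rotary-embedding module, so when the attributes stop being set the access raises an AttributeError. The function name and the exact attribute names read downstream are illustrative assumptions.

```python
def read_rope_caches(rotary_emb):
    """Hypothetical downstream-style access: pull the cached cos/sin tensors off the module."""
    cos = getattr(rotary_emb, "cos_cached", None)
    sin = getattr(rotary_emb, "sin_cached", None)
    if cos is None or sin is None:
        # With the attributes gone, downstream code has no cached tensors to reuse.
        raise AttributeError(
            "This rotary embedding no longer exposes `cos_cached`/`sin_cached`; "
            "recompute them from `inv_freq` or take them from `forward` instead."
        )
    return cos, sin
```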