- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
When loading a finetuned model's safetensors with device_map="auto", I get a warning that the tied embeddings are not initialized:

```
Some weights of MBartForConditionalGeneration were not initialized from the model checkpoint and are newly initialized: ['model.decoder.embed_tokens.weight', 'model.encoder.embed_tokens.weight']
```
I finetuned an MBART model with the Trainer, with use_safetensors set to True. The model's vocabulary (and hence the embedding size) was extended, which may be relevant.
```python
from transformers import AutoModelForSeq2SeqLM

# Works:
model = AutoModelForSeq2SeqLM.from_pretrained("BramVanroy/mbart_test")

# Does not work (tied embeddings not loaded correctly - triggers a warning):
model = AutoModelForSeq2SeqLM.from_pretrained("BramVanroy/mbart_test", device_map="auto")
```
It is not just the warning: the weights actually appear to be loaded incorrectly, and the model produces random output.
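For intuition: weight tying means several modules reference one underlying tensor, so a load path that misses the tie leaves the dependent embeddings at their fresh random initialization. A minimal pure-Python sketch of the idea (hypothetical names, not transformers internals):

```python
# Sketch of weight tying: several "parameters" alias one storage,
# so loading the shared weight updates every tied view of it.
class Param:
    def __init__(self, data):
        self.data = data

shared = Param([0.0, 0.0])   # freshly initialized ("random")
encoder_embed = shared       # tied: same underlying object
decoder_embed = shared       # tied: same underlying object

# Loading the checkpoint writes into the shared storage...
shared.data[:] = [1.5, -2.0]

# ...so every tied parameter sees the loaded values.
print(encoder_embed.data, decoder_embed.data)  # → [1.5, -2.0] [1.5, -2.0]
```

If the tie is not reconstructed before loading, encoder_embed and decoder_embed would be separate objects and would keep their random values, which matches the behavior reported here.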
Expected behavior
Correctly loaded safetensors
I've traced this to the definition of the find_tied_parameters function from accelerate.
In transformers, we have control flow to identify tied parameters that does not work for meta tensors; when we encounter a meta tensor, we instead rely on the find_tied_parameters function from accelerate. There seems to be a discrepancy in the number of layers returned by these two methods, depending on whether we're using a device_map or not.
System Info
Current master
Who can help?
No response