Incorrect padding_side Setting as 'left' in Llama Family Model #25022
Comments
Hey! Indeed, as written in the documentation, a padding token is required. It seems that by default the padding side is set to 'left'.
Great, it would be nice to update the default padding_side of those models.
There does not seem to be any documentation regarding what the correct padding_side should be for the CodeLlama family. Is there a way to find this out? @ArthurZucker I also opened a related issue here.
CodeLlama is part of the Llama family, so the padding side is the same. I answered on your issue 🤗
System Info

transformers version: 4.30.2

Who can help?

text models: @ArthurZucker and @younesbelkada
generate: @gante
Information

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
When using a Llama-family model for batch generation, an issue arises due to the lack of a padding token. To clarify, the original model uses pad_id = -1, implying the absence of a padding token; this does not work for our batch-generation scenario.
Here is our proposed solution:
First, a padding token should be added with tokenizer.add_special_tokens({"pad_token": "<pad>"}), after which the token embeddings must be resized accordingly. It is also essential to set model.config.pad_token_id. The model's embed_tokens layer is initialized as self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.config.padding_idx), which ensures that encoding the padding token outputs zeros; passing padding_idx during initialization is therefore recommended. A sketch of these steps is shown below.
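A minimal sketch of those steps, assuming a Llama-2 checkpoint name and the "<pad>" token string purely for illustration (the issue does not prescribe either):

```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# 1. Add a dedicated padding token, then resize the embedding matrix so the
#    new token id has a row in embed_tokens.
tokenizer.add_special_tokens({"pad_token": "<pad>"})
model.resize_token_embeddings(len(tokenizer))

# 2. Propagate the id so generate() and loss masking know which token is padding.
model.config.pad_token_id = tokenizer.pad_token_id

# For reference, the zero-output behavior quoted above comes from
# nn.Embedding's padding_idx semantics: that row is zero-initialized and
# receives no gradient updates.
emb = nn.Embedding(10, 4, padding_idx=3)
print(emb(torch.tensor([3])))  # tensor of zeros
```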
Expected behavior
Another important aspect is setting padding_side to 'right'; this is crucial for the padding to be applied in the correct direction. A short sketch follows below.
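A short sketch of that setting, continuing from the tokenizer in the snippet above (the prompt strings are invented for illustration):

```python
# Pad on the right, as proposed in this issue.
tokenizer.padding_side = "right"

batch = tokenizer(
    ["def fib(n):", "a much longer example prompt for the batch"],
    padding=True,
    return_tensors="pt",
)
# The shorter sequence is now padded with pad_token_id at the end, and the
# attention_mask marks those trailing positions with 0.
print(batch["input_ids"])
print(batch["attention_mask"])
```

Whichever side is chosen, the attention_mask produced by the tokenizer is what keeps the padded positions from influencing the real tokens.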