
Fix issue of canine forward requiring input_ids anyway #26290

Merged · 9 commits · Oct 2, 2023
4 changes: 3 additions & 1 deletion src/transformers/models/canine/modeling_canine.py
@@ -1169,7 +1169,9 @@ def forward(
# Contextualize character embeddings using shallow Transformer.
# We use a 3D attention mask for the local attention.
# `input_char_encoding`: shape (batch_size, char_seq_len, char_dim)
- char_attention_mask = self._create_3d_attention_mask_from_input_mask(input_ids, attention_mask)
+ char_attention_mask = self._create_3d_attention_mask_from_input_mask(
+     input_ids if input_ids is not None else inputs_embeds, attention_mask
+ )
Comment on lines +1172 to +1174
@ArthurZucker (Collaborator) commented on Sep 22, 2023:

We should use the `input_shape` defined above; since the function is only used once, there's no problem changing it.
I'm pretty sure we have tests in our CI that make sure generation works with `use_cache`, and in that case you need this to work. Good for me as is.
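
For context, the alternative the reviewer describes could look roughly like the sketch below. The standalone function name and body are illustrative assumptions rather than code from the repository; the point is that building the 3D mask only needs the query-side sequence shape, not the token values, so the `input_shape` already computed in `forward()` would be enough.

```python
import torch

# Illustrative sketch of the shape-based variant suggested in the review
# (not the merged code).
def create_3d_attention_mask_from_input_shape(input_shape, to_mask):
    """Broadcast a 2D padding mask to the 3D mask used by local attention.

    input_shape: (batch_size, from_seq_length) of the character sequence.
    to_mask:     (batch_size, to_seq_length) padding mask of 0s and 1s.
    """
    batch_size, from_seq_length = input_shape[0], input_shape[1]
    to_seq_length = to_mask.shape[1]

    # (batch, 1, to_seq) * (batch, from_seq, 1) -> (batch, from_seq, to_seq)
    to_mask = to_mask.reshape(batch_size, 1, to_seq_length).float()
    broadcast_ones = torch.ones(
        (batch_size, from_seq_length, 1), dtype=to_mask.dtype, device=to_mask.device
    )
    return broadcast_ones * to_mask


# The call site in forward() would then read, roughly:
# char_attention_mask = self._create_3d_attention_mask_from_input_mask(input_shape, attention_mask)
```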

init_chars_encoder_outputs = self.initial_char_encoder(
input_char_embeddings,
attention_mask=char_attention_mask,
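
For reference, the call pattern this fix enables is running the model from precomputed character embeddings without `input_ids`. The snippet below is a minimal usage sketch, not part of the PR; the `google/canine-s` checkpoint and the shapes are assumptions.

```python
import torch
from transformers import CanineModel

# Hypothetical usage sketch: forward() with inputs_embeds only. Before this fix,
# the 3D local-attention mask was always built from input_ids, so this call
# failed when input_ids was None.
model = CanineModel.from_pretrained("google/canine-s")

batch_size, char_seq_len = 2, 16
inputs_embeds = torch.randn(batch_size, char_seq_len, model.config.hidden_size)
attention_mask = torch.ones(batch_size, char_seq_len, dtype=torch.long)

with torch.no_grad():
    outputs = model(inputs_embeds=inputs_embeds, attention_mask=attention_mask)

print(outputs.last_hidden_state.shape)  # (batch_size, char_seq_len, hidden_size)
```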