Generation: get special tokens from model config #30899
Conversation
Are you sure this is the only argument that we need to get from the model's generation config for BC?
I see what went wrong in my previous PR 😭 Thank you for taking this issue @zucchini-nlp!
For full BC, we need further changes. The removed `_get_decoder_start_token_id` function used `bos_token_id` as a fallback, and `bos_token_id` could be loaded from the config (like `decoder_start_token_id`). Furthermore, this logic was only used when `self.config.is_encoder_decoder is True`.

So the needed changes for BC are:

1. First check whether `self.config.is_encoder_decoder is True`.
2. If 1. is true, get BOTH `decoder_start_token_id` and `bos_token_id` with a fallback to `self.generation_config`.
3. If 1. is false, don't use a fallback to `self.generation_config`.
4. The line `decoder_start_token_id = decoder_start_token_id if decoder_start_token_id is not None else bos_token_id` can stay as it is :)
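The fallback chain described above can be sketched roughly as follows (a hypothetical standalone sketch for illustration only, not the actual transformers implementation; the function name, the plain-dict `generation_config`, and the `SimpleNamespace` config are assumptions):

```python
from types import SimpleNamespace

def get_decoder_start_token_id(config, generation_config,
                               decoder_start_token_id=None, bos_token_id=None):
    # Step 1: the generation-config fallback only applies to encoder-decoder models.
    if getattr(config, "is_encoder_decoder", False):
        # Step 2: fetch BOTH ids, falling back to the generation config.
        if decoder_start_token_id is None:
            decoder_start_token_id = generation_config.get("decoder_start_token_id")
        if bos_token_id is None:
            bos_token_id = generation_config.get("bos_token_id")
    # Step 4: final fallback from decoder_start_token_id to bos_token_id.
    return decoder_start_token_id if decoder_start_token_id is not None else bos_token_id
```

For an encoder-decoder config with no explicit `decoder_start_token_id`, this falls back to `bos_token_id`; for a decoder-only config, the generation-config fallback is skipped entirely.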
@ArthurZucker see my comment above :D
@ArthurZucker @gante done, I brought back the
Thank you for iterating on the fix 💛
(for a perfect PR, only a test is missing, to ensure we don't repeat the mistake)
Approving and merging to patch transformers today
```python
elif bos_token_id is not None:
    return bos_token_id
else:
    return
```
```diff
-    return
+    return None
```
More explicit, but I guess it's the same.
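For context, a bare `return` and `return None` evaluate to the same value in Python, so the suggestion above is purely about explicitness:

```python
def implicit():
    # A bare return yields None implicitly.
    return

def explicit():
    # Spelling out None makes the intent obvious to readers.
    return None
```

Both functions return `None`, so the two spellings behave identically at runtime.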
* fix
* let's do this way?
* codestyle
* update
* add tests
What does this PR do?

Fixes #30892. The error described in the linked issue was caused by passing a `GenerationConfig` object instead of a kwarg. If a user passes a `GenerationConfig`, we do not update it with the model's generation config, and thus cannot get the special tokens used by the model. Two options were:

1. update the user's `GenerationConfig` with the model's in `_prepare_generation_config`;
2. get the special tokens from the model's config directly.

I think the first option can mess up what the user wants when generating, for example by adding `do_sample=True` from the model's generation config. So I went for option 2.
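The trade-off between the two options can be illustrated with a simplified standalone sketch (not actual transformers code; `MODEL_DEFAULTS`, `prepare_generation_config`, and `get_eos_token_id` are hypothetical stand-ins for the model's generation config, `_prepare_generation_config`, and the option-2 lookup):

```python
# Hypothetical model-level generation defaults, including special tokens.
MODEL_DEFAULTS = {"bos_token_id": 0, "eos_token_id": 2, "do_sample": True}

def prepare_generation_config(user_config=None, **kwargs):
    """Sketch of the failure mode: kwargs are merged onto the model's
    defaults, but a user-supplied config object is taken verbatim, so
    the model's special tokens never reach it."""
    if user_config is not None:
        return dict(user_config)  # option 1 would merge MODEL_DEFAULTS here
    merged = dict(MODEL_DEFAULTS)
    merged.update(kwargs)
    return merged

def get_eos_token_id(generation_config):
    # Option 2: read a missing special token from the model config directly,
    # without touching sampling parameters such as do_sample.
    eos = generation_config.get("eos_token_id")
    return eos if eos is not None else MODEL_DEFAULTS["eos_token_id"]
```

With kwargs, `prepare_generation_config(max_new_tokens=5)` keeps `eos_token_id`; with a config object, `prepare_generation_config({"max_new_tokens": 5})` loses it. Merging per option 1 would also drag `do_sample=True` into the user's config against their intent, whereas option 2 recovers only the special token.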