-
Notifications
You must be signed in to change notification settings - Fork 27.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gemma: update activation warning #29995
Conversation
This is a nit PR, but I was confused. I got the warning even after I had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I was using the "legacy" `gelu_pytorch_tanh`. Another option is to keep the warning but change the message to say something like "`hidden_act` is ignored, please use `hidden_activation` instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the goal is to have everyone set hidden_activation
and not the hidden_act
Yeah, I get it, but the message I got was:
|
Sounds alright to me no? Feel free to make it clearer! |
But
I'm not using a legacy function, and the warning says it's changing it when it isn't. So my suggestion was to either hide the warning (only in this case, when the activation matches what we want), or if we want to encourage the use of |
You are right, let's encourage the use of if config.hidden_activation is None:
logger.warning_once(
"Gemma's activation function should be approximate GeLU and not exact GeLU.\n"
"Changing the activation function to `gelu_pytorch_tanh`."
f"if you want to use the legacy `{config.hidden_act}`, "
f"edit the `model.config` to set `hidden_activation={config.hidden_act}` "
" instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details."
)
hidden_activation = "gelu_pytorch_tanh"
else:
hidden_activation = config.hidden_activation
self.act_fn = ACT2FN[hidden_activation] the |
Yes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
"Gemma's activation function should be approximate GeLU and not exact GeLU.\n" | ||
"Changing the activation function to `gelu_pytorch_tanh`." | ||
f"if you want to use the legacy `{config.hidden_act}`, " | ||
f"edit the `model.config` to set `hidden_activation={config.hidden_act}` " | ||
" instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details." | ||
"`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.\n" | ||
"Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use\n" | ||
"`config.hidden_activation` if you want to override this behaviour.\n" | ||
"See https://github.com/huggingface/transformers/pull/29402 for more details." | ||
) | ||
hidden_activation = "gelu_pytorch_tanh" | ||
else: | ||
hidden_activation = config.hidden_activation | ||
config.hidden_activation = "gelu_pytorch_tanh" | ||
hidden_activation = config.hidden_activation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the config.hidden_activation
is set to None
, it has to be set to gelu_pytorch
. That is what the warning is saying. We force it.
If the user wants to use another activation, he ought to use config.hidden_activation
not config.act_fn. I agree with potentially doing something like
config.hidden_activation = "gelu_pytorch_tanh"` but that should be done in the if, not in the else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow :) This is what the block currently looks like without the diff markup (there's no else
any more):
if config.hidden_activation is None:
logger.warning_once(
"`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.\n"
"Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use\n"
"`config.hidden_activation` if you want to override this behaviour.\n"
"See https://github.com/huggingface/transformers/pull/29402 for more details."
)
config.hidden_activation = "gelu_pytorch_tanh"
hidden_activation = config.hidden_activation
Is that not what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! diff looked wrong!
Forgot about this, merging now. |
* Gemma: only display act. warning when necessary This is a nit PR, but I was confused. I got the warning even after I had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I was using the "legacy" `gelu_pytorch_tanh`. Another option is to keep the warning but change the message to say something like "`hidden_act` is ignored, please use `hidden_activation` instead. Setting Gemma's activation function to `gelu_pytorch_tanh`". * Change message, and set `config.hidden_activation`
What does this PR do?
This is a nit PR, but I was confused. I got the warning even after I had changed
hidden_act
togelu_pytorch_tanh
, telling me that I was using the "legacy"gelu_pytorch_tanh
.Another option is to keep the warning but change the message to say something like "
hidden_act
is ignored, please usehidden_activation
instead. Setting Gemma's activation function togelu_pytorch_tanh
".Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.