[operator] Add Mish Activation Function #20320
Hey @Adnios, thanks for submitting the PR.
CI supported jobs: [centos-gpu, unix-cpu, windows-gpu, miscellaneous, clang, website, windows-cpu, sanity, edge, centos-cpu, unix-gpu]
Signed-off-by: Adnios <[email protected]>
3a11899 to 47a52f3
Thanks for the contribution! Going forward in 2.0, as we will mainly use np/npx instead of sym, could you also include a test for it with npx.activation?
Sure. Thanks for the advice.
@mxnet-bot run ci [centos-gpu, unix-gpu]
Jenkins CI successfully triggered: [centos-gpu, unix-gpu]
@szha Please help review.
@Adnios merged. Thank you for the contribution!
template <typename DType>
__device__ inline DType mish(const DType val) {
  if (type_util::has_double_or_integral<DType>::value) {
    return val * ::tanh(::log(1 + ::exp(val)));
One thing that could be improved here (I did not notice this PR earlier, sorry for the late feedback) is the numerical stability of the softrelu part: see the implementation of softrelu, which switches to softrelu(x) = x for large values of x to avoid overflow. @Adnios, could you open another PR changing, e.g., this function to
return val * op::tanh(op::softrelu(val));
(double vs. float is handled in op::tanh and op::softrelu anyway, so this one will also be simpler as a result), and do the same for backward?
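For readers following the thread, here is a minimal host-side C++ sketch of the idea in this suggestion. It is not the MXNet code: `softrelu_stable`, `mish_naive`, and `mish_stable` are hypothetical names, and the cutoff of 20 is an assumption made for illustration. The sketch shows that `exp(val)` overflows to infinity in single precision for large inputs, while the thresholded variant never computes that intermediate.

```cpp
#include <cmath>

// Hypothetical stand-in for a thresholded softrelu: log(1 + exp(x)),
// returning x itself above a cutoff. The cutoff of 20 is an assumption.
template <typename T>
inline T softrelu_stable(T x) {
  const T kThreshold = static_cast<T>(20);
  if (x > kThreshold) {
    return x;  // log(1 + exp(x)) is x to within rounding here
  }
  return std::log1p(std::exp(x));
}

// Direct transcription of the formula in the reviewed line.
// exp(x) overflows to +inf for large x (around x > 88 in float).
template <typename T>
inline T mish_naive(T x) {
  return x * std::tanh(std::log(static_cast<T>(1) + std::exp(x)));
}

// Variant in the spirit of the suggestion: reuse the thresholded softrelu.
template <typename T>
inline T mish_stable(T x) {
  return x * std::tanh(softrelu_stable(x));
}
```

Calling `mish_stable(100.0f)` returns 100 without ever producing an infinite intermediate, and for moderate inputs the two variants agree to within rounding.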
Yes, agreed. Softplus implementations usually use a threshold of 20, above which they return x directly.
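The same thresholding idea carries over to the backward pass mentioned above. The sketch below is an illustration under stated assumptions, not the code that was merged: `softplus_thresholded`, `sigmoid_stable`, and `mish_backward` are hypothetical names, and the cutoff of 20 is taken from this comment. It uses the identity d/dx mish(x) = tanh(sp(x)) + x * sigmoid(x) * (1 - tanh(sp(x))^2), where sp is softplus and sigmoid is its derivative.

```cpp
#include <cmath>

// Hypothetical thresholded softplus; the cutoff of 20 is an assumption.
template <typename T>
inline T softplus_thresholded(T x) {
  if (x > static_cast<T>(20)) {
    return x;
  }
  return std::log1p(std::exp(x));
}

// Sigmoid written so that exp() is only ever called on a non-positive
// argument, which avoids the inf / inf = NaN case of exp(x) / (1 + exp(x)).
template <typename T>
inline T sigmoid_stable(T x) {
  if (x >= static_cast<T>(0)) {
    return static_cast<T>(1) / (static_cast<T>(1) + std::exp(-x));
  }
  const T e = std::exp(x);
  return e / (static_cast<T>(1) + e);
}

// Derivative of mish(x) = x * tanh(softplus(x)) with respect to x.
template <typename T>
inline T mish_backward(T x) {
  const T t = std::tanh(softplus_thresholded(x));
  return t + x * sigmoid_stable(x) * (static_cast<T>(1) - t * t);
}
```

For large positive inputs the gradient tends to 1 and for large negative inputs it tends to 0, and neither branch produces a NaN in single precision.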
Yes. Thanks for your advice.
Description
Add Mish Activation Function.
Related issue: #16841
The earlier PR (#17696) seems to be dead.
Checklist
Essentials
Changes
Comments