add elu and selu activations #263
base: master
This should be ready for merge after review.
Shouldn't elu and selu be primitives if we add them to Knet?
You mean Autograd primitives? It depends. If they are defined generically
using other primitives, no need. If they are type specific and/or there are
performance concerns then yes.
…On Wed, Feb 14, 2018 at 12:03 cangumeli ***@***.***> wrote:
Shouldn't elu and selu be primitives if we add them to Knet?
If |
src/unary.jl
Outdated
    p = relu(x)
    m = -relu(-x)
    return scale*(p + alpha*(exp(m) - 1))
end
How about using elu here?
function selu(x)
    alpha = Float32(1.6732632)
    scale = Float32(1.0507009)
    return scale * elu(x, alpha)
end
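A minimal sketch of the suggestion above, assuming scalar inputs and the standard ELU definition. The names `elu_sketch`, `selu_sketch`, `relu_sketch`, and `selu_relu_form` are hypothetical and used only for illustration; they are not Knet's API. The check at the end compares the elu-based form with the relu-based form from the outdated diff.

```julia
# Hypothetical scalar helpers for illustration only (not Knet's API).
elu_sketch(x::Real, alpha::Real=1) = x > 0 ? x : alpha * (exp(x) - 1)

# selu written in terms of elu, as suggested in the review.
function selu_sketch(x::Real)
    alpha = 1.6732632f0
    scale = 1.0507009f0
    return scale * elu_sketch(x, alpha)
end

# The relu-based form from the outdated diff, for comparison.
relu_sketch(x::Real) = max(x, zero(x))

function selu_relu_form(x::Real)
    alpha = 1.6732632f0
    scale = 1.0507009f0
    p = relu_sketch(x)      # positive part of x
    m = -relu_sketch(-x)    # negative part of x
    return scale * (p + alpha * (exp(m) - 1))
end

# Both forms should agree on negative, zero, and positive inputs.
xs = Float32[-2, -0.5, 0, 0.5, 2]
@assert all(isapprox.(selu_sketch.(xs), selu_relu_form.(xs); atol=1f-6))
```

Whether a composed definition like this is fast enough on the GPU is a separate question from whether it is correct.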
Elu can be included in |
ab3a189 to a791dc1
@@ -32,6 +32,7 @@ broadcast_ops = [
 # "fdim",
 ("invxback","invxback","(-xi*yi*yi)"),
 ("reluback","reluback","(yi>0?xi:0)"),
+ ("eluback", "eluback", "ifelse(yi>0,dyi,yi+1)"),
This string is turned into a CUDA code snippet, so I think it must be replaced with the following:
("eluback", "eluback", "(yi>0?xi:yi+1)")
I don't understand the comment. ifelse(yi>0,dyi,yi+1) is valid Julia code and should be the right derivative.
When I tried to build the package, it threw the following errors:
cuda01.cu(395): error: identifier "dyi" is undefined
cuda01.cu(395): error: identifier "ifelse" is undefined
cuda01.cu(408): error: identifier "dyi" is undefined
cuda01.cu(408): error: identifier "ifelse" is undefined
I am able to build the package with the ("eluback", "eluback", "(yi>0?xi:yi+1)") code snippet.
Those snippets are not Julia code; they are CUDA.
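To illustrate why a Julia expression fails here, the following is a rough sketch of how an expression string might be pasted into generated CUDA source. The function `kernel_source_sketch` and the kernel template are assumptions made for illustration, not Knet's actual generator; the only details taken from this thread are that `xi` and `yi` are in scope while `dyi` and `ifelse` are not.

```julia
# Hypothetical generator: interpolates a C expression into a CUDA kernel body.
# The template is an assumption; it only mirrors the identifiers (xi, yi)
# that the existing broadcast_ops entries use.
function kernel_source_sketch(name::String, expr::String; T::String="float")
    return """
    __global__ void _$(name)(int n, $T *x, $T *y, $T *z) {
      int i = threadIdx.x + blockIdx.x * blockDim.x;
      while (i < n) {
        $T xi = x[i];
        $T yi = y[i];
        z[i] = $expr;
        i += blockDim.x * gridDim.x;
      }
    }
    """
end

# The existing reluback entry is a valid C expression over xi and yi.
src = kernel_source_sketch("reluback", "(yi>0?xi:0)")
print(src)
```

Under this assumption the string is compiled by the CUDA compiler, never evaluated by Julia, so it has to use the C ternary operator and only the identifiers the template declares.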
…On Tue, Apr 17, 2018, 16:04 Ozan Arkan Can ***@***.***> wrote:
I am able to build the package with the ("eluback", "eluback",
"(yi>0?xi:yi+1)") code snippet.
Manually merged elu after fixing the faulty gradient: for negative values the derivative should be dyi*(yi+1).
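The corrected gradient can be sanity-checked numerically. This is a minimal sketch assuming ELU with alpha = 1, where y = exp(x) - 1 for x <= 0 and therefore dy/dx = exp(x) = y + 1. The helpers `elu_fwd`, `elu_bwd`, and `fd` are hypothetical names, not Knet functions.

```julia
# Scalar ELU with alpha = 1 (hypothetical helper, not Knet's API).
elu_fwd(x) = x > 0 ? x : exp(x) - 1

# Backward pass in terms of the output y and the incoming gradient dy.
# For x <= 0 the local derivative is exp(x) = y + 1, so the result is dy*(y+1).
elu_bwd(y, dy) = y > 0 ? dy : dy * (y + 1)

# Central finite difference used as a reference derivative.
fd(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)

for x in (-2.0, -0.3, 0.7, 3.0)
    y = elu_fwd(x)
    dy = 2.5  # arbitrary incoming gradient
    @assert isapprox(elu_bwd(y, dy), dy * fd(elu_fwd, x); atol=1e-6)
end
```

The earlier expression returned yi+1 for negative values without multiplying by the incoming gradient, so it would only match when that gradient equals 1.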
Added selu as a CUDA kernel for efficiency.
@CarloLucibello can you explain how this gives the intended result in alpha_dropout: