Added AlphaDropout which is used in SNNs. #656

Merged: 10 commits, Mar 7, 2019
docs/src/models/layers.md (1 addition, 0 deletions)
@@ -50,5 +50,6 @@ These layers don't affect the structure of the network but may improve training
Flux.testmode!
BatchNorm
Dropout
AlphaDropout
LayerNorm
```
src/Flux.jl (1 addition, 1 deletion)
@@ -7,7 +7,7 @@ using MacroTools, Juno, Requires, Reexport, Statistics, Random
using MacroTools: @forward

export Chain, Dense, RNN, LSTM, GRU, Conv, ConvTranspose, MaxPool, MeanPool,
- DepthwiseConv, Dropout, LayerNorm, BatchNorm,
+ DepthwiseConv, Dropout, AlphaDropout, LayerNorm, BatchNorm,
params, mapleaves, cpu, gpu, f32, f64

@reexport using NNlib
src/layers/normalise.jl (29 additions, 0 deletions)
@@ -43,6 +43,35 @@ end

_testmode!(a::Dropout, test) = (a.active = !test)

"""
    AlphaDropout(p)

A dropout layer for Self-Normalizing Neural Networks (SNNs), as described in
https://papers.nips.cc/paper/6698-self-normalizing-neural-networks.pdf.
The AlphaDropout layer ensures that the mean and variance of the activations remain the same as those of the input.
"""
mutable struct AlphaDropout{F}
  p::F
  active::Bool
end

function AlphaDropout(p)
  @assert 0 ≤ p ≤ 1
  AlphaDropout{typeof(p)}(p, true)

Contributor:

You shouldn't need {typeof(p)} here; it should be able to determine this from the p parameter:

julia> struct Foo{F}
       p::F
       active::Bool
       end

julia> Foo(1.0, true)
Foo{Float64}(1.0, true)

end

function (a::AlphaDropout)(x)
  a.active || return x
  α = eltype(x)(-1.75813631)

Contributor:

You should use the same α parameter as given by the selu() function in NNlib, see the definition here: https://github.com/FluxML/NNlib.jl/blob/d07ac0bfd3c71c3a29bc9c22becbba19227bbeb5/src/activation.jl#L100-L104
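
For reference, the hard-coded constant above is approximately -λ*α for the SELU constants from the SNN paper (the same values used in the NNlib selu() definition linked above). A small sketch, with illustrative names rather than NNlib's actual internal bindings:

# Sketch only: the canonical SELU constants.
const selu_λ = 1.0507009873554805   # SELU scale λ
const selu_α = 1.6732632423543772   # SELU α
alpha_prime = -selu_λ * selu_α      # ≈ -1.7581, the value dropped activations are set to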

  noise = randn(eltype(x), size(x))
  x = @. x*(noise .> (1 - a.p)) + α .* (noise .<= (1 - a.p))

Contributor:

I think if you use @. you don't need the .> and .* and .<= operators; either use @. to get them all, or manually write them all as dotted operators.
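
Concretely, the masking line above could be written in either of these equivalent forms (a sketch of the reviewer's suggestion, reusing the x, noise, α and a.p already in scope):

# Let @. supply every dot:
x = @. x * (noise > (1 - a.p)) + α * (noise <= (1 - a.p))

# Or drop the macro and dot each operator by hand:
x = x .* (noise .> (1 - a.p)) .+ α .* (noise .<= (1 - a.p))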

  A = (a.p + a.p * (1 - a.p) * α ^ 2)^0.5
  B = -A * α * (1 - a.p)
  x = @. A .* x .+ B

Contributor:

Same thing here; this @. is not doing anything, because you have already dotted your operators.
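
That is, the rescaling line could simply read (sketch):

x = A .* x .+ B    # or, equivalently: x = @. A * x + B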

  return x
end

_testmode!(a::AlphaDropout, test) = (a.active = !test)

"""
    LayerNorm(h::Integer)

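For context, a minimal usage sketch of the new AlphaDropout layer in an SNN-style model (the layer sizes and dropout probability are arbitrary illustrations, not part of this diff):

using Flux

# selu comes from NNlib, which Flux reexports; AlphaDropout is the layer added in this PR.
m = Chain(Dense(784, 256, selu),
          AlphaDropout(0.1),
          Dense(256, 10),
          softmax)

Flux.testmode!(m)   # switch off the dropout noise for evaluation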