
Added AlphaDropout, which is used in SNNs. #656

Merged: 10 commits merged into FluxML:master on Mar 7, 2019

Conversation

@thebhatman (Contributor)

No description provided.

@MikeInnes (Member)

Seems generally reasonable, but the implementation is currently dropping gradients. You'll need to avoid in-place broadcast to get this working without param and collect.

@thebhatman (Contributor Author)

Thank you @MikeInnes. I have made the necessary changes so that gradients are not lost; I am no longer using collect and param.

In src/layers/normalise.jl:
a.active || return x
α = -1.75813631
noise = randn(Float64, size(x.data))
x.data .= x.data .* (noise .> (1 - a.p)) + α .* (noise .<= (1 - a.p))
Member:

Do not assign to x.data, otherwise you will lose the gradients. Just do

 x = x .* (noise .> (1 - a.p)) .+ α .* (noise .<= (1 - a.p))

or even better

d = rand(size(x)...) .> a.p
x = @. x * d + α * !d

function (a::AlphaDropout)(x)
a.active || return x
α = -1.75813631
noise = randn(Float64, size(x.data))
Member:

randn gives normally distributed numbers, maybe you want rand here

Contributor Author:

I think randn is more suitable here, as we need the noise to follow a standard normal distribution.
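(For reference, the distinction under discussion; these two lines are illustrative, not part of the diff:)

rand(3)    # uniform samples in [0, 1)
randn(3)   # samples from the standard normal distribution, mean 0 and variance 1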


function (a::AlphaDropout)(x)
a.active || return x
α = -1.75813631
Member:

This is going to change the return value to Float64 irrespective of x; α should be initialized with the same element type as x.

@thebhatman (Contributor Author)

Thank you @CarloLucibello. I have made the changes you suggested. Just curious, I wanted to know what would have been the problem if we assigned to x.data.

@MikeInnes requested a review from @staticfloat on March 6, 2019, 16:41
@MikeInnes (Member)

This looks generally reasonable to me. I'd like a quick look over from @staticfloat to confirm it makes sense.

Just curious, I wanted to know what would have been the problem if we assigned to x.data.

Any time you work with .data you drop gradients; the reason we wrap the data in a TrackedArray is exactly so that we can record what happens to it and get gradients.
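A minimal sketch of what that means with the Tracker-based AD Flux used at the time (an illustration, not code from this PR; it assumes the param/back! API from Flux.Tracker):

using Flux.Tracker: param, back!

x = param([1.0, 2.0, 3.0])   # wraps the data in a TrackedArray so operations on it are recorded
y = sum(2 .* x)              # this broadcast is recorded on the tape
back!(y)                     # backpropagate through the recorded operations
x.grad                       # => [2.0, 2.0, 2.0]

# Writing through x.data mutates the underlying array directly, bypassing the tape,
# so that change never shows up in any gradient.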

@MikeInnes (Member)

Also this needs some numerical tests, e.g. comparing hand-coded output from another implementation, or similar.

@staticfloat (Contributor) left a comment:

Please also introduce some numerical tests, preferably calculated by hand or in PyTorch/TF so that we can make sure we never break the numerics here.
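A minimal scaffold for such a test might look like the sketch below; this is only a shape/behaviour check and an assumption about how the test could be structured, not the hand-computed comparison being requested:

using Flux, Test

x = randn(Float32, 1_000)
m = AlphaDropout(0.5)
y = m(x)

@test size(y) == size(x)   # the layer must not change the shape of its input
@test y != x               # with the layer active, the output should differ from the input
# A full numerical test would additionally fix the RNG seed and compare m(x) against
# values computed by hand or by another framework, as requested above.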

a.active || return x
α = eltype(x)(-1.75813631)
noise = randn(eltype(x), size(x))
x = @. x*(noise .> (1 - a.p)) + α .* (noise .<= (1 - a.p))
Contributor:

I think if you use @. you don't need the .> and .* and .<= operators; either use @. to get them all, or manually write them all as dotted operators.
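For example, the line above could be written either way (same semantics, shown here only to illustrate the point):

x = @. x * (noise > (1 - a.p)) + α * (noise <= (1 - a.p))      # let @. insert every dot
x = x .* (noise .> (1 - a.p)) .+ α .* (noise .<= (1 - a.p))    # or dot every operator by hand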


function (a::AlphaDropout)(x)
a.active || return x
α = eltype(x)(-1.75813631)
Contributor:

You should use the same α parameter as given by the selu() function in NNlib, see the definition here: https://github.com/FluxML/NNlib.jl/blob/d07ac0bfd3c71c3a29bc9c22becbba19227bbeb5/src/activation.jl#L100-L104
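For reference, the standard SELU uses roughly these constants (a sketch of the definition, not NNlib's verbatim code; the exact values are in the linked source):

λ = 1.0507009873554804934193349852946
α = 1.6732632423543772848170429916717
selu(x) = λ * ifelse(x > 0, x, α * (exp(x) - 1))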


function AlphaDropout(p)
@assert 0 ≤ p ≤ 1
AlphaDropout{typeof(p)}(p,true)
Contributor:

You shouldn't need {typeof(p)} here; it should be able to determine this from the p parameter:

julia> struct Foo{F}
       p::F
       active::Bool
       end

julia> Foo(1.0, true)
Foo{Float64}(1.0, true)
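Applied to this PR, a sketch of the simplified constructor (assuming the struct keeps its type parameter on p):

function AlphaDropout(p)
  @assert 0 ≤ p ≤ 1
  AlphaDropout(p, true)   # the type parameter is inferred from p by the default constructor
end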

x = @. x*(noise .> (1 - a.p)) + α .* (noise .<= (1 - a.p))
A = (a.p + a.p * (1 - a.p) * α ^ 2)^0.5
B = -A * α * (1 - a.p)
x = @. A .* x .+ B
Contributor:

Same thing here; this @. is not doing anything, because you have already dotted your operators.
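In other words, that last line can simply be one of:

x = A .* x .+ B
x = @. A * x + B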

@thebhatman (Contributor Author)

I realized randn(), which I am using to initialize noise, is not giving a standard normal distribution. Because of this, the mean and variance are slightly different from the mean and variance of the input. This could be solved by using Normal() from the Distributions package, but Flux doesn't have a dependency on Distributions. What would be the best way to get a standard normal distribution? Shall I add Distributions to Flux's dependencies?

@thebhatman (Contributor Author)

Thank you @staticfloat. I have used the values of lambda and alpha as defined in NNlib.jl and removed the redundant dot operations. I will be adding test cases, but before that I wanted to solve the problem of randn() not giving a standard normal distribution. Can I use the Distributions package and Normal() to get a distribution with mean = 0 and variance = 1?

@staticfloat (Contributor)

I realized randn(), which I am using to initialize noise, is not giving a standard normal distribution.

What do you mean by this? Why do you think that randn() is not a standard normal distribution? Note that the Normal distribution from Distributions.jl uses randn() to generate its random numbers internally.
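A quick way to see the two agree (a sketch; it assumes the Distributions package is installed, which is not a Flux dependency):

using Distributions

rand(Normal(0, 1), 5)   # samples from Normal(0, 1), which is built on randn internally
randn(5)                # the same distribution, sampled directly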

@thebhatman (Contributor Author)

julia> x = randn(2,2)
2×2 Array{Float64,2}:
 0.332405  -0.813627
 0.446137  -0.644151

julia> using Statistics

julia> mean(x)
-0.16980890659280456

julia> std(x)
0.650925294719868

@thebhatman (Contributor Author)

I was just testing randn() and it doesn't seem to be giving a standard normal distribution.

@MikeInnes (Member)

That's just sampling error; the sample mean only approaches 0 in the limit of a large number of samples.

@staticfloat (Contributor)

This is completely expected; the mean and standard deviation you compute on normally-distributed random numbers will not always be exactly zero and one. That's not the way randomness works. When you take mean(x), you are calculating an estimate of the true mean of the population that your four numbers were taken from.

Let's imagine we are measuring the height of people in a city. Let's further assume that across the entire city, the heights form a normal distribution with a certain mean and standard deviation. If I just take 4 random people from the city, the mean of their heights will probably not be the "true" mean; it will just be an estimate of it. I may be unlucky and grab four people that are all on the very tall side of things. The "error" of my mean estimate may be very high.

You can read up on calculating the error bounds of your statistical estimates, but to convince yourself, try making a matrix that is much larger than 2x2, say 10000x10000, and see what the statistics of that look like. I guarantee they will be much closer to a mean of 0 and a variance of 1, but they will also not be exactly 0 and 1.
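For example:

julia> using Statistics

julia> x = randn(10_000, 10_000);

julia> mean(x), std(x)   # both land very close to 0 and 1, but never exactly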

@thebhatman (Contributor Author)

Oh! That was a nice example. Thank you for clearing my doubt.

function (a::AlphaDropout)(x)
a.active || return x
λ = 1.0507009873554804934193349852946
α = 1.6732632423543772848170429916717
Contributor:

I think it would be better to have λ and α converted to eltype(x) so that we don't have to convert lower down; that might help us generate code to run on platforms where, for instance, Float64 literals are not supported (that's not the case on CPUs, but for things like TPUs it's helpful to stay in the proper datatype as much as possible).
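For example, mirroring the eltype(x)(...) pattern already used for α earlier in the diff (a sketch, not the final code):

λ = eltype(x)(1.0507009873554804934193349852946)
α = eltype(x)(1.6732632423543772848170429916717)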

Contributor Author:

Done.

@thebhatman (Contributor Author)

The Travis CI build failed after I resolved the merge conflict, which had been caused by InstanceNorm.

@staticfloat (Contributor)

Thanks @thebhatman, this looks pretty good!

@staticfloat merged commit bc12a4d into FluxML:master on Mar 7, 2019
@MikeInnes (Member)

Awesome, thanks a lot @thebhatman and @staticfloat for the review!

@thebhatman can you also add an entry to NEWS.md about this?
