noise shape for dropout #563
Conversation
This looks like a good idea, but it also seems like it's equivalent to a
@MikeInnes Do you mean like using a
Yes exactly. Though it might be better for
But if the
@MikeInnes I made a first version:

```julia
julia> Dropout(0.5)(randn(5,4), 1)
5×4 Array{Float64,2}:
 -0.230536  -0.0   1.02677   -0.903341
 -0.605143   0.0   0.748388   0.732854
  2.56266   -0.0  -2.79108   -1.59313
 -0.613482  -0.0   0.468957  -1.96
 -0.87279   -0.0   4.01647    0.647282

julia> Dropout(0.5)(randn(5,4,2), (1,3))
5×4×2 Array{Float64,3}:
[:, :, 1] =
 -0.0  -0.0  -1.66134    1.97335
 -0.0   0.0  -0.310311   2.57003
  0.0  -0.0   1.24803   -3.60845
 -0.0   0.0  -1.4593    -0.755723
  0.0   0.0   0.8056     4.04177

[:, :, 2] =
 -0.0   0.0  -0.532319   -0.836303
  0.0   0.0   0.867975   -0.309224
 -0.0  -0.0  -2.63861     1.14548
 -0.0   0.0  -0.0331286   2.39778
  0.0   0.0  -2.47692    -0.358082
```
bump
src/layers/normalise.jl (Outdated)

```diff
 _dropout_kernel(y::T, p, q) where {T} = y > p ? T(1 / q) : T(0)

-function (a::Dropout)(x)
+function (a::Dropout)(x, dims=0)
```
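For reference, the `_dropout_kernel` line in the diff above turns a uniform random draw into a mask value: samples above `p` are kept and rescaled by `1/q` (with `q = 1 - p`), the rest are zeroed. A minimal self-contained illustration:

```julia
# _dropout_kernel as shown in the diff: maps a uniform sample y to either
# 1/q (keep and rescale) or 0 (drop), preserving the element type T.
_dropout_kernel(y::T, p, q) where {T} = y > p ? T(1 / q) : T(0)

# With p = 0.5 and q = 1 - p = 0.5:
_dropout_kernel(0.9, 0.5, 0.5)  # 2.0 -- kept, rescaled by 1/q
_dropout_kernel(0.3, 0.5, 0.5)  # 0.0 -- dropped
```

Dividing kept entries by `q` keeps the expected activation unchanged, so no extra rescaling is needed at inference time.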
It would be nicer to use `dims = :` for all dimensions, like the reduction functions do.
Got it. What about the `dims` question discussed above? I just thought it might be more convenient to use `dims` as the broadcasted dims, but maybe it's not, and `dims` as the unbroadcasted dims is more intuitive?
Yes, it's more intuitive if it aligns with how `dims` is used everywhere else. For example, if you wanted to sum across each image you'd likewise do `sum(x, dims = (1, 2, 3))`.

It should be a keyword argument, too.
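Under the unbroadcasted reading proposed here, the mask has full extent along the dimensions in `dims` and extent 1 elsewhere, so it broadcasts across the remaining dimensions, matching the `sum(x, dims = (1, 2, 3))` analogy. A rough standalone sketch of that behavior (`dropout_sketch` is a hypothetical name, not the actual Flux implementation):

```julia
# Sketch of dims-aware dropout. The mask varies along `dims` and has size 1
# along every other dimension, so it is shared (broadcast) across those.
function dropout_sketch(x, p; dims = :)
    q = 1 - p
    masksize = dims === Colon() ? size(x) :
        ntuple(i -> i in dims ? size(x, i) : 1, ndims(x))
    mask = rand(Float64, masksize) .> p  # true with probability q
    return x .* mask ./ q                # rescale surviving entries
end

# With dims = 1 on a 5x4 input the mask is 5x1, so each row is either
# kept whole or dropped whole:
dropout_sketch(ones(5, 4), 0.5; dims = 1)
```

With `dims = :` (the default) every element gets an independent mask value, recovering ordinary dropout.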
Ok, I changed `dims` to be the unbroadcasted dims and also made it a keyword argument.
Ok, one last thing, I think the
We also need to update the
Do you mean that we should make
Yes.
@MikeInnes where should I add the docs? I can't find the old one in the
Actually, dropout is part of the docs already so that's fine. Just NEWS.md needs updating.
Co-Authored-By: Mike J Innes <[email protected]>
bors r+
563: noise shape for dropout r=MikeInnes a=chengchingwen I add the noise shape for dropout, similar to the `noise_shape` argument in [`tf.nn.dropout`](https://www.tensorflow.org/api_docs/python/tf/nn/dropout) Co-authored-by: chengchingwen <[email protected]> Co-authored-by: Peter <[email protected]>
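For comparison: TensorFlow's `noise_shape` is an explicit shape whose entries are either the input's extent or 1, while the `dims` argument here names the dimensions along which the mask varies. A hypothetical helper (`noise_shape` below is an illustrative name, not part of Flux) mapping one convention to the other:

```julia
# Hypothetical converter: given the input size and the `dims` along which
# the dropout mask varies, produce the TF-style noise shape (full extent
# along `dims`, extent 1 elsewhere).
noise_shape(sz::Dims, dims) = ntuple(i -> i in dims ? sz[i] : 1, length(sz))

noise_shape((5, 4, 2), (1, 3))  # (5, 1, 2)
```

So `dims = (1, 3)` on a `5×4×2` input corresponds to `noise_shape = [5, 1, 2]` in `tf.nn.dropout`.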
Build succeeded