
noise shape for dropout #563

Merged · 8 commits into FluxML:master on May 13, 2019

Conversation

chengchingwen
Member

I added noise shape support for dropout, similar to the `noise_shape` argument in `tf.nn.dropout`.

@MikeInnes
Member

This looks like a good idea, but it also seems like it's equivalent to a dims keyword argument, which would be more standard in Julia. Any thoughts on doing that API instead?

@chengchingwen
Member Author

chengchingwen commented Mar 11, 2019

@MikeInnes Do you mean using a dims keyword argument to indicate which dimensions should be broadcast?

@MikeInnes
Member

Yes exactly. Though it might be better for dims to say which dims are not broadcasted, so that e.g. dims=1 does dropout along columns and dims=2 along rows.

@chengchingwen
Member Author

But if dims specifies the dimensions that are not broadcast, what about higher-dimensional input? For example, if the input has shape (C, W, H, B) and I want each sample to have the same dropout mask, would I need to specify dims=(1,2,3) for the unbroadcast C, W, H?

@chengchingwen
Member Author

@MikeInnes I made a first version of the dims API; currently dims is the broadcast dims. For example:

julia> Dropout(0.5)(randn(5,4), 1)
5×4 Array{Float64,2}:
 -0.230536  -0.0   1.02677   -0.903341
 -0.605143   0.0   0.748388   0.732854
  2.56266   -0.0  -2.79108   -1.59313 
 -0.613482  -0.0   0.468957  -1.96    
 -0.87279   -0.0   4.01647    0.647282

1 indicates that each row uses the same dropout result; dims can also be a tuple covering multiple dimensions.

julia> Dropout(0.5)(randn(5,4,2), (1,3))
5×4×2 Array{Float64,3}:
[:, :, 1] =
 -0.0  -0.0  -1.66134    1.97335 
 -0.0   0.0  -0.310311   2.57003 
  0.0  -0.0   1.24803   -3.60845 
 -0.0   0.0  -1.4593    -0.755723
  0.0   0.0   0.8056     4.04177 

[:, :, 2] =
 -0.0   0.0  -0.532319   -0.836303
  0.0   0.0   0.867975   -0.309224
 -0.0  -0.0  -2.63861     1.14548 
 -0.0   0.0  -0.0331286   2.39778 
  0.0   0.0  -2.47692    -0.358082
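For illustration, a minimal sketch of what this first version does (hypothetical function name, not the PR's actual code): the mask is generated with size 1 along each broadcast dim, and Julia's broadcasting expands it over the rest.

```julia
# Sketch only: `dims` here are the *broadcast* dims, as in this first
# version of the API, so every slice along `dims` shares one mask.
function dropout_bcast(x, p; dims)
    # size 1 along each broadcast dim, full size elsewhere
    msize = ntuple(i -> i in dims ? 1 : size(x, i), ndims(x))
    mask = rand(eltype(x), msize) .> p   # keep each entry with prob. 1 - p
    return x .* mask ./ (1 - p)          # inverted-dropout rescaling
end
```

With `dims = 1` on a 5×4 matrix the mask has size (1, 4), so whole columns are kept or dropped together, matching the output above.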

@chengchingwen
Member Author

bump

@chengchingwen
Member Author

@MikeInnes

_dropout_kernel(y::T, p, q) where {T} = y > p ? T(1 / q) : T(0)

function (a::Dropout)(x)
function (a::Dropout)(x, dims=0)
Member

It would be nicer to use dims = : for all dimensions, like the reduction functions do.

Member Author

Got it. What about the dims question discussed above? I just thought it might be more convenient to use dims as the broadcast dims, but maybe it's not, and dims as the unbroadcast dims is more intuitive?

Member

Yes, it's more intuitive if it aligns with how dims is used everywhere else. For example if you wanted to sum across each image you'd likewise do sum(x, dims = (1, 2, 3)).

It should be a keyword argument, too.

Member Author

Ok, I changed dims to be the unbroadcast dims and also made it a keyword argument.
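Under the settled semantics (a dims keyword naming the unbroadcast dims, with dims = : meaning an independent mask per element), a sketch built around the `_dropout_kernel` helper from the diff above could look like this. The `_dropout_shape` helper name is an assumption, not necessarily the merged code.

```julia
using Random  # for rand!

# Sketch: `dims` are the dims the mask varies along; it is broadcast
# over all the others. `dims = :` masks every element independently.
_dropout_shape(x, ::Colon) = size(x)
_dropout_shape(x, dims) = ntuple(i -> i in dims ? size(x, i) : 1, ndims(x))

_dropout_kernel(y::T, p, q) where {T} = y > p ? T(1 / q) : T(0)

function dropout(x, p; dims = :)
    y = rand!(similar(x, _dropout_shape(x, dims)))
    return x .* _dropout_kernel.(y, p, 1 - p)
end
```

On a (C, W, H, B) array, `dropout(x, 0.5; dims = (1, 2, 3))` varies the mask over C, W, H and broadcasts it over the batch dim, so every sample shares the same mask, as discussed above.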

@MikeInnes
Member

Ok, one last thing, I think the dims argument should be an argument to Dropout(x), not an argument when you call that layer. Actually, it'd be good if you could split out a dropout function here and use that in the Dropout layer.

We also need to update the Dropout docs and add a news item; but once that's done this will be good to go!
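The suggested split might look roughly like this (a self-contained sketch with assumed field names; the merged code may differ): a standalone dropout function, plus a layer that stores dims and delegates to it.

```julia
# Sketch: standalone function, with `dims` as the unbroadcast dims.
function dropout(x, p; dims = :)
    msize = dims isa Colon ? size(x) :
            ntuple(i -> i in dims ? size(x, i) : 1, ndims(x))
    mask = rand(eltype(x), msize) .> p
    return x .* mask ./ (1 - p)
end

# The layer stores `dims` as a field, per the suggestion above.
struct Dropout{F, D}
    p::F
    dims::D
end
Dropout(p; dims = :) = Dropout(p, dims)

(a::Dropout)(x) = dropout(x, a.p; dims = a.dims)
```

Usage would then be e.g. `Dropout(0.5; dims = 1)(x)` rather than passing dims at call time.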

@chengchingwen
Member Author

Do you mean that we should make dims a field of the Dropout layer?

@MikeInnes
Member

Yes.

@FluxML FluxML deleted a comment from chengchingwen May 10, 2019
@FluxML FluxML deleted a comment from chengchingwen May 10, 2019
@chengchingwen
Member Author

@MikeInnes where should I add the docs? I can't find the old one in the docs folder

@MikeInnes
Member

Actually, dropout is part of the docs already so that's fine. Just NEWS.md needs updating.

NEWS.md (review comment resolved)
Co-Authored-By: Mike J Innes <[email protected]>
@MikeInnes
Member

bors r+

bors bot added a commit that referenced this pull request May 13, 2019
563: noise shape for dropout r=MikeInnes a=chengchingwen

I add the noise shape for dropout, similar to the `noise_shape` argument in [`tf.nn.dropout`](https://www.tensorflow.org/api_docs/python/tf/nn/dropout)

Co-authored-by: chengchingwen <[email protected]>
Co-authored-by: Peter <[email protected]>
@bors
Contributor

bors bot commented May 13, 2019

Build succeeded

@bors bors bot merged commit 9c1bb93 into FluxML:master May 13, 2019