Make Dropout docs a little more user friendly #2014

Merged 1 commit on Jul 1, 2022.

src/layers/normalise.jl: 15 changes (9 additions, 6 deletions)

@@ -16,8 +16,6 @@ The dropout function. If `active` is `true`,
 for each input, either sets that input to `0` (with probability
 `p`) or scales it by `1 / (1 - p)`. `dims` specifies the unbroadcasted dimensions,
 e.g. `dims=1` applies dropout along columns and `dims=2` along rows.
-This is used as a regularisation, i.e. it reduces overfitting during training.
-
 If `active` is `false`, it just returns the input `x`.
 
 Specify `rng` for custom RNGs instead of the default RNG.
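
As a quick illustration of the behaviour this docstring describes (my own sketch, not part of the diff; it assumes the `Flux.dropout(x, p; dims=...)` method that the docstring documents):

```julia
using Flux

x = ones(Float32, 4, 6)

# Unrestricted dropout: each entry is zeroed with probability p = 0.5,
# and the survivors are scaled by 1 / (1 - 0.5) = 2, so each entry of
# the output is 1 in expectation.
y = Flux.dropout(x, 0.5)

# dims = 1: the random mask has size (4, 1) and is broadcast across the
# columns, so every column shares the same zero pattern.
ycols = Flux.dropout(x, 0.5; dims = 1)
```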
@@ -55,11 +53,16 @@ ChainRulesCore.@non_differentiable dropout_mask(::Any, ::Any, ::Any)
 """
     Dropout(p; dims=:, rng = rng_from_array())
 
-Dropout layer. In the forward pass, applies the [`Flux.dropout`](@ref) function on the input.
+Dropout layer.
+
+While training, for each input, this layer either sets that input to `0` (with probability
+`p`) or scales it by `1 / (1 - p)`. To apply dropout along certain dimension(s), specify the
+`dims` keyword. e.g. `Dropout(p; dims = 3)` will randomly zero out entire channels on WHCN input
+(also called 2D dropout). This is used as a regularisation, i.e. it reduces overfitting during
+training.
 
-To apply dropout along certain dimension(s), specify the `dims` keyword.
-e.g. `Dropout(p; dims = 3)` will randomly zero out entire channels on WHCN input
-(also called 2D dropout).
+In the forward pass, this layer applies the [`Flux.dropout`](@ref) function. See that for more
+details.
 
 Specify `rng` to use a custom RNG instead of the default.
 Custom RNGs are only supported on the CPU.
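
And a sketch of the layer itself (again mine, not from the PR), using Flux's standard `trainmode!`/`testmode!` switches to toggle the `active` state described above:

```julia
using Flux

m = Dropout(0.5; dims = 3)         # "2D dropout": drops whole channels

x = ones(Float32, 8, 8, 16, 2)     # WHCN input: width, height, channels, batch

Flux.trainmode!(m)                 # force active = true outside of training
y = m(x)                           # each of the 16 channels is either all
                                   # zeros or scaled by 1 / (1 - 0.5) = 2

Flux.testmode!(m)                  # force active = false
@assert m(x) == x                  # the layer is the identity at test time
```

Note that with `dims = 3` the mask is, in this implementation, also shared across the batch dimension; this channel-wise variant is roughly what other frameworks call `Dropout2d` or spatial dropout.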