documentation

FluxML · Oct 7, 2022 · 8cf6d0e
1 parent 537b011
commit 8cf6d0e
Showing 1 changed file with 44 additions and 27 deletions.
diff --git a/docs/src/outputsize.md b/docs/src/outputsize.md
@@ -1,47 +1,64 @@
 # Shape Inference
 
-To help you generate models in an automated fashion, [`Flux.outputsize`](@ref) lets you 
-calculate the size returned produced by layers for a given size input.
-This is especially useful for layers like [`Conv`](@ref).
+Flux has some tools to help generate models in an automated fashion, by inferring the size
+of arrays that layers will receive, without doing any computation.
+This is especially useful for convolutional models, where the same [`Conv`](@ref) layer
+accepts any size of image, but the next layer may not. 
 
-It works by passing a "dummy" array into the model that preserves size information without running any computation.
-`outputsize(f, inputsize)` works for all layers (including custom layers) out of the box.
-By default, `inputsize` expects the batch dimension,
-but you can exclude the batch size with `outputsize(f, inputsize; padbatch=true)` (assuming it to be one).
+The higher-level one is a macro [`@autosize`](@ref) which acts on the code defining the layers,
+and replaces each appearance of `_` with the relevant size. A simple example might be:
 
-Using this utility function lets you automate model building for various inputs like so:
 ```julia
-"""
-    make_model(width, height, inchannels, nclasses;
-               layer_config = [16, 16, 32, 32, 64, 64])
+@autosize (28, 28, 1, 32) Chain(Conv((3, 3), _ => 5, relu, stride=2), Flux.flatten, Dense(_ => 10))
+```
+
+The size may be provided at runtime, like `@autosize (sz..., 1, 32) Chain(Conv(`..., but the
+layer constructors must be explicitly written out -- the macro sees the code as written.
+
+This relies on a lower-level function [`outputsize`](@ref Flux.outputsize), which you can also use directly:
+
+```julia
+c = Conv((3, 3), 1 => 5, relu, stride=2)
+Flux.outputsize(c, (28, 28, 1, 32))  # returns (13, 13, 5, 32)
+```
+
+The function `outputsize` works by passing a "dummy" array into the model, which propagates through very cheaply.
+It should work for all layers, including custom layers, out of the box.
 
-Create a CNN for a given set of configuration parameters.
+An example of how to automate model building is this:
+```julia
+"""
+    make_model(width, height, [inchannels, nclasses; layer_config])
 
-# Arguments
+Create a CNN for a given set of configuration parameters. Arguments:
 - `width`: the input image width
 - `height`: the input image height
-- `inchannels`: the number of channels in the input image
-- `nclasses`: the number of output classes
-- `layer_config`: a vector of the number of filters per each conv layer
+- `inchannels`: the number of channels in the input image, default 1
+- `nclasses`: the number of output classes, default 10
+- `layer_config`: a vector of the number of filters per layer, default `[16, 16, 32, 64]`
 """
-function make_model(width, height, inchannels, nclasses;
-                    layer_config = [16, 16, 32, 32, 64, 64])
-  # construct a vector of conv layers programmatically
-  conv_layers = [Conv((3, 3), inchannels => layer_config[1])]
+function make_model(width, height, inchannels = 1, nclasses = 10;
+                    layer_config = [16, 16, 32, 64])
+  # construct a vector of conv layers:
+  conv_layers = Any[Conv((3, 3), inchannels => layer_config[1], relu)]
   for (infilters, outfilters) in zip(layer_config, layer_config[2:end])
-    push!(conv_layers, Conv((3, 3), infilters => outfilters))
+    push!(conv_layers, Conv((3, 3), infilters => outfilters, relu, stride=2))
   end
 
-  # compute the output dimensions for the conv layers
-  # use padbatch=true to set the batch dimension to 1
-  conv_outsize = Flux.outputsize(conv_layers, (width, height, nchannels); padbatch=true)
+  # compute the output dimensions after these conv layers:
+  conv_outsize = Flux.outputsize(conv_layers, (width, height, inchannels); padbatch=true)
 
-  # the input dimension to Dense is programatically calculated from
-  #  width, height, and nchannels
-  return Chain(conv_layers..., Dense(prod(conv_outsize) => nclasses))
+  # use this to define an appropriately sized Dense layer:
+  last_layer = Dense(prod(conv_outsize) => nclasses)
+  return Chain(conv_layers..., last_layer)
 end
+
+make_model(28, 28, 3)
 ```
 
+### Listing
+
 ```@docs
+Flux.@autosize
 Flux.outputsize
 ```
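
The sizes quoted in the new text can be checked by hand. The following is a minimal sketch, not part of the commit and not how `Flux.outputsize` is implemented: it assumes the standard convolution size formula with no dilation and equal padding on both sides. The names `conv_outdim` and `conv_stack_outdim` are hypothetical helpers introduced only for this illustration.

```julia
# Output length of one spatial dimension after a convolution.
# Assumption: no dilation, and the same padding `pad` on both sides.
conv_outdim(n, k; stride=1, pad=0) = fld(n + 2pad - k, stride) + 1

# The single layer from the docs example: 3×3 kernel, stride 2, 28×28 input.
single = conv_outdim(28, 3; stride=2)

# Hypothetical helper: apply the formula once per conv layer.
function conv_stack_outdim(n, strides; k=3)
  for s in strides
    n = conv_outdim(n, k; stride=s)
  end
  return n
end

# `make_model(28, 28, 3)` builds one stride-1 conv followed by three stride-2 convs,
# so each spatial dimension goes 28 → 26 → 12 → 5 → 2.
side = conv_stack_outdim(28, (1, 2, 2, 2))

# The last conv layer has 64 filters, so the Dense layer would see 2 * 2 * 64 inputs.
dense_in = side * side * 64
```

Here `single` matches the `13` in `(13, 13, 5, 32)` from the docs example, and `dense_in` is the value that `prod(conv_outsize)` would be expected to give for the default `layer_config`. `Flux.outputsize` avoids this arithmetic altogether by running a dummy array through the layers, which is why it also covers custom layers.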