
Fix docs
avik-pal committed Apr 18, 2023
1 parent 60833a7 commit 29cba88
Showing 9 changed files with 24 additions and 27 deletions.
2 changes: 0 additions & 2 deletions docs/Project.toml
@@ -3,8 +3,6 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterMarkdown = "997ab1e6-3595-5248-9280-8efb232c3433"
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
-LuxCore = "bb33d45b-7691-41d6-9220-0943567d0623"
Lux = "b2108857-7c20-44ae-9111-449ecde12c47"
-LuxLib = "82251201-b29d-42c6-8e01-566dec8acb11"
Optimisers = "3bd65402-5787-11e9-1adc-39752487f4e2"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
4 changes: 2 additions & 2 deletions docs/make.jl
@@ -1,12 +1,12 @@
-using Documenter, DocumenterMarkdown, LuxCore, Lux, LuxLib, Pkg
+using Documenter, DocumenterMarkdown, Lux, Pkg

import Flux # Load weak dependencies

deployconfig = Documenter.auto_detect_deploy_system()
Documenter.post_status(deployconfig; type="pending", repo="github.com/avik-pal/Lux.jl.git")

makedocs(; sitename="Lux", authors="Avik Pal et al.", clean=true, doctest=true,
-modules=[Lux, LuxLib, LuxCore],
+modules=[Lux],
strict=[:doctest, :linkcheck, :parse_error, :example_block, :missing_docs],
checkdocs=:all, format=Markdown(), draft=false, build=joinpath(@__DIR__, "docs"))

6 changes: 3 additions & 3 deletions docs/src/manual/interface.md
@@ -24,7 +24,7 @@ First let's set the expectations straight.
### Singular Layer

If the layer doesn't contain any other Lux layer, then it is a `Singular Layer`. This means
-it should optionally subtype [`Lux.AbstractExplicitLayer`](@ref) but mandatorily define
+it should optionally subtype `Lux.AbstractExplicitLayer` but mandatorily define
all the necessary functions mentioned in the docstrings. Consider a simplified version of
[`Dense`](@ref) called `Linear`.

@@ -70,8 +70,8 @@ end
Lux.initialstates(::AbstractRNG, ::Linear) = NamedTuple()
```

-You could also implement [`Lux.parameterlength`](@ref) and [`Lux.statelength`](@ref) to
-prevent wasteful reconstruction of the parameters and states.
+You could also implement `Lux.parameterlength` and `Lux.statelength` to prevent wasteful
+reconstruction of the parameters and states.

```@example layer_interface
# This works
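# A hedged sketch of such definitions (assuming the `Linear` layer above stores
# `in_dims` and `out_dims` fields for its weight matrix and bias vector); they
# let Lux report sizes without materialising the parameter arrays.
Lux.parameterlength(l::Linear) = l.out_dims * l.in_dims + l.out_dims
Lux.statelength(::Linear) = 0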
13 changes: 6 additions & 7 deletions docs/src/manual/migrate_from_flux.md
@@ -48,12 +48,11 @@ should be implemented. A summary of the differences would be:
* Flux stores everything in a single struct and relies on `Functors.@functor` and
`Flux.trainable` to distinguish between trainable and non-trainable parameters.

-* Lux relies on the user to define [`Lux.initialparameters`](@ref) and
-[`Lux.initialstates`](@ref) to distinguish between trainable parameters (called
-"parameters") and non-trainable parameters (called "states"). Additionally Lux layers
-define the model architecture, hence device transfer utilities like [`gpu`](@ref),
-[`cpu`](@ref), etc. cannot be applied on Lux layers, instead they need to be applied on
-the parameters and states.
+* Lux relies on the user to define `Lux.initialparameters` and `Lux.initialstates` to
+distinguish between trainable parameters (called "parameters") and non-trainable
+parameters (called "states"). Additionally Lux layers define the model architecture, hence
+device transfer utilities like [`gpu`](@ref), [`cpu`](@ref), etc. cannot be applied on Lux
+layers, instead they need to be applied on the parameters and states.
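A hedged sketch of that last point (layer and sizes made up for illustration; assumes a functional GPU backend is loaded, otherwise simply keep everything on the CPU): `gpu`/`cpu` act on the parameter and state containers, while the layer object itself stays an immutable description of the architecture.

```julia
using Lux, Random

model = Dense(2 => 3)                           # the layer only describes the architecture
ps, st = Lux.setup(Random.default_rng(), model)

ps_gpu, st_gpu = gpu(ps), gpu(st)               # transfer the data, not the layer
y, st_gpu = model(gpu(rand(Float32, 2, 4)), ps_gpu, st_gpu)
ps_cpu, st_cpu = cpu(ps_gpu), cpu(st_gpu)       # and back
```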

Let's work through a concrete example to demonstrate this. We will implement a very simple
layer that computes ``A \times B \times x`` where ``A`` is not trainable and ``B`` is
@@ -156,7 +155,7 @@ Flux supports a mode called `:auto` which automatically decides if the user is t
model or running inference. This is the default mode for `Flux.BatchNorm`, `Flux.GroupNorm`,
`Flux.Dropout`, etc. Lux doesn't support this mode (specifically to keep code simple and
do exactly what the user wants), hence our default mode is `training`. This can be changed
-using [`Lux.testmode`](@ref).
+using `Lux.testmode`.
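A hedged sketch of that switch (model and sizes made up for illustration): `Lux.testmode` rewrites the `training` entries in the state NamedTuple, and the updated state is what gets passed for inference.

```julia
using Lux, Random

rng = Random.default_rng()
model = Chain(Dense(4 => 4), BatchNorm(4))
ps, st = Lux.setup(rng, model)       # states start in training mode by default

st_inference = Lux.testmode(st)      # flip every layer's state to inference mode
y, _ = model(randn(rng, Float32, 4, 8), ps, st_inference)
```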

## Can't access functions like `relu`, `sigmoid`, etc?

2 changes: 1 addition & 1 deletion examples/SimpleRNN/main.jl
@@ -47,7 +47,7 @@ end

# We pass the fieldnames `lstm_cell` and `classifier` to the type to ensure that the
# parameters and states are automatically populated and we don't have to define
-# [`Lux.initialparameters`](@ref) and [`Lux.initialstates`](@ref).
+# `Lux.initialparameters` and `Lux.initialstates`.

# To understand more about container layers, please look at
# [Container Layer](http://lux.csail.mit.edu/stable/manual/interface/#container-layer).
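# As a hedged sketch (the struct name `SpiralClassifier` is assumed here for
# illustration), such a container declaration looks like the following; listing
# the fieldnames in the `AbstractExplicitContainerLayer` supertype is what
# drives the automatic population of parameters and states.

struct SpiralClassifier{L, C} <:
       Lux.AbstractExplicitContainerLayer{(:lstm_cell, :classifier)}
    lstm_cell::L
    classifier::C
end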
2 changes: 1 addition & 1 deletion src/layers/containers.jl
@@ -253,7 +253,7 @@ x1 → layer1 → y1 ↘
- `connection`: Takes 2 inputs and combines them
-- `layers`: [`AbstractExplicitLayer`](@ref)s. Layers can be specified in two formats:
+- `layers`: `AbstractExplicitLayer`s. Layers can be specified in two formats:
+ A list of `N` Lux layers
+ Specified as `N` keyword arguments.
2 changes: 1 addition & 1 deletion src/layers/conv.jl
@@ -505,7 +505,7 @@ end
Pixel shuffling layer with upscale factor `r`. Usually used for generating higher
resolution images while upscaling them.
-See [`NNlib.pixel_shuffle`](@ref) for more details.
+See `NNlib.pixel_shuffle` for more details.
PixelShuffle is not a Layer, rather it returns a [`WrappedFunction`](@ref) with the
function set to `Base.Fix2(pixel_shuffle, r)`
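A hedged usage sketch (shapes chosen only for illustration): with `r = 2`, an input carrying `C*r^2` channels is rearranged into spatial dimensions that are `r` times larger with `C` channels.

```julia
using Lux, Random

rng = Random.default_rng()
layer = PixelShuffle(2)
ps, st = Lux.setup(rng, layer)          # a WrappedFunction has no parameters to speak of

x = randn(rng, Float32, 3, 3, 4, 1)     # (W, H, C*r^2, N) with C = 1, r = 2
y, _ = layer(x, ps, st)                 # size(y) == (6, 6, 1, 1)
```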
6 changes: 3 additions & 3 deletions src/layers/normalize.jl
@@ -67,7 +67,7 @@ slice and normalises the input accordingly.
+ `running_var`: nothing
- `training`: Used to check if training/inference mode
-Use [`Lux.testmode`](@ref) during inference.
+Use `Lux.testmode` during inference.
## Example
@@ -209,7 +209,7 @@ end
+ `running_var`: nothing
- `training`: Used to check if training/inference mode
-Use [`Lux.testmode`](@ref) during inference.
+Use `Lux.testmode` during inference.
## Example
@@ -358,7 +358,7 @@ accordingly.
- `training`: Used to check if training/inference mode
-Use [`Lux.testmode`](@ref) during inference.
+Use `Lux.testmode` during inference.
## Example
14 changes: 7 additions & 7 deletions src/layers/recurrent.jl
@@ -68,7 +68,7 @@ end
_generate_init_recurrence(out, carry, st) = (typeof(out)[out], carry, st)
∇_generate_init_recurrence((Δout, Δcarry, Δst)) = (first(Δout), Δcarry, Δst)

-function (r::Recurrence{true})(x::Union{AbstractVector, NTuple}, ps, st::NamedTuple)
+@views function (r::Recurrence{true})(x::Union{AbstractVector, NTuple}, ps, st::NamedTuple)
(out_, carry), st = Lux.apply(r.cell, first(x), ps, st)

init = _generate_init_recurrence(out_, carry, st)
@@ -524,15 +524,15 @@ Gated Recurrent Unit (GRU) Cell
## Parameters
- `weight_i`: Concatenated Weights to map from input space
-``\\left\\{ W_{ir}, W_{iz}, W_{in} \\right\\}``.
+``\\left\\\{ W_{ir}, W_{iz}, W_{in} \\right\\\}``.
- `weight_h`: Concatenated Weights to map from hidden space
-``\\left\\{ W_{hr}, W_{hz}, W_{hn} \\right\\}``
-- `bias_i`: Bias vector (``b_{in}``; not present if `use_bias=false`)
+``\\left\\\{ W_{hr}, W_{hz}, W_{hn} \\right\\\}``.
+- `bias_i`: Bias vector (``b_{in}``; not present if `use_bias=false`).
- `bias_h`: Concatenated Bias vector for the hidden space
-``\\left\\{ b_{hr}, b_{hz}, b_{hn} \\right\\}`` (not present if
-`use_bias=false`)
+``\\left\\\{ b_{hr}, b_{hz}, b_{hn} \\right\\\}`` (not present if
+`use_bias=false`).
- `hidden_state`: Initial hidden state vector (not present if `train_state=false`)
-``\\left\\{ b_{hr}, b_{hz}, b_{hn} \\right\\}``
+``\\left\\\{ b_{hr}, b_{hz}, b_{hn} \\right\\\}``.
## States
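As a hedged illustration of the `GRUCell` parameter layout described above (dimensions made up; only `weight_i` and `weight_h` are inspected, since those are the shapes the docstring spells out):

```julia
using Lux, Random

rng = Random.default_rng()
cell = GRUCell(3 => 5)
ps, st = Lux.setup(rng, cell)

size(ps.weight_i)   # (15, 3): W_ir, W_iz, W_in stacked along the first dimension
size(ps.weight_h)   # (15, 5): W_hr, W_hz, W_hn stacked along the first dimension
```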
