Merge pull request #48 from avik-pal/ap/relax
Formatting updates and relax parameter type
avik-pal authored Jun 11, 2022
2 parents 95d27d0 + 195041b commit 72a39e7
Showing 36 changed files with 673 additions and 545 deletions.
7 changes: 7 additions & 0 deletions .JuliaFormatter.toml
@@ -1,2 +1,9 @@
style = "sciml"
whitespace_in_kwargs = false
always_use_return = true
margin = 92
indent = 4
format_docstrings = true
join_lines_based_on_source = true
separate_kwargs_with_semicolon = true
always_for_in = true
18 changes: 11 additions & 7 deletions CHANGELOG.md
@@ -1,20 +1,24 @@
# v0.4

## v0.4.5

- Allow Arbitrary Parameter Types

## v0.4.4

* Updated to support julia v1.6 (test time dependency issues)
- Updated to support julia v1.6 (test time dependency issues)

## v0.4.3

* Extending Scale to allow for multiple dimension inputs (https://github.com/avik-pal/Lux.jl/pull/40)
- Extending Scale to allow for multiple dimension inputs (https://github.com/avik-pal/Lux.jl/pull/40)

## v0.4.2

* `SelectDim` is no longer type unstable -- Internal storage for the Layer has been changed
* `Dropout` & `VariationalDropout` return `NoOpLayer` if the probability of dropout is `0`
* Code Formatting -- SciMLStyle (https://github.com/avik-pal/Lux.jl/pull/31)
- `SelectDim` is no longer type unstable -- Internal storage for the Layer has been changed
- `Dropout` & `VariationalDropout` return `NoOpLayer` if the probability of dropout is `0`
- Code Formatting -- SciMLStyle (https://github.com/avik-pal/Lux.jl/pull/31)

## v0.4.1

* Fix math rendering in docs
* Add Setfield compat for v1.0
- Fix math rendering in docs
- Add Setfield compat for v1.0
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "Lux"
uuid = "b2108857-7c20-44ae-9111-449ecde12c47"
authors = ["Avik Pal <[email protected]> and contributors"]
version = "0.4.4"
version = "0.4.5"

[deps]
Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
17 changes: 4 additions & 13 deletions README.md
@@ -1,6 +1,6 @@
# Lux 🔥

[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) [![Latest Docs](https://img.shields.io/badge/docs-latest-blue.svg)](http://lux.csail.mit.edu/dev/) [![Stable Docs](https://img.shields.io/badge/docs-stable-blue.svg)](http://lux.csail.mit.edu/stable/) [![CI](https://github.com/avik-pal/Lux.jl/actions/workflows/CI.yml/badge.svg)](https://github.com/avik-pal/Lux.jl/actions/workflows/CI.yml) [![codecov](https://codecov.io/gh/avik-pal/Lux.jl/branch/main/graph/badge.svg?token=IMqBM1e3hz)](https://codecov.io/gh/avik-pal/Lux.jl) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) [![Latest Docs](https://img.shields.io/badge/docs-latest-blue.svg)](http://lux.csail.mit.edu/dev/) [![Stable Docs](https://img.shields.io/badge/docs-stable-blue.svg)](http://lux.csail.mit.edu/stable/) [![CI](https://github.com/avik-pal/Lux.jl/actions/workflows/CI.yml/badge.svg)](https://github.com/avik-pal/Lux.jl/actions/workflows/CI.yml) [![codecov](https://codecov.io/gh/avik-pal/Lux.jl/branch/main/graph/badge.svg?token=IMqBM1e3hz)](https://codecov.io/gh/avik-pal/Lux.jl) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac) [![SciML Code Style](https://img.shields.io/static/v1?label=code%20style&message=SciML&color=9558b2&labelColor=389826)](https://github.com/SciML/SciMLStyle)


The 🔥 Deep Learning Framework
@@ -21,15 +21,8 @@ rng = Random.default_rng()
Random.seed!(rng, 0)

# Construct the layer
model = Chain(
BatchNorm(128),
Dense(128, 256, tanh),
BatchNorm(256),
Chain(
Dense(256, 1, tanh),
Dense(1, 10)
)
)
model = Chain(BatchNorm(128), Dense(128, 256, tanh), BatchNorm(256),
              Chain(Dense(256, 1, tanh), Dense(1, 10)))

# Parameter and State Variables
ps, st = Lux.setup(rng, model) .|> gpu
@@ -54,9 +47,7 @@ Look in the [examples](/examples/) directory for self-contained usage examples.

## Ecosystem

### Prebuilt Deep Learning Models

See [Boltz](lib/Boltz/) for pre-built deep learning models with pretrained weights for popular datasets.
Check out our [Ecosystem](http://lux.csail.mit.edu/dev/introduction/ecosystem/) page for more details.

## Getting Help

5 changes: 2 additions & 3 deletions docs/make.jl
@@ -72,9 +72,8 @@ makedocs(;
"Utilities" => "api/utilities.md",
],
"Design Docs" => [
"Documentation" => "design/documentation.md",
"Recurrent Neural Networks" => "design/recurrent.md",
"Add new functionality to Lux" => "design/core.md",
"Contribution Guide" => "design/contributing.md",
"Layer Implementation" => "design/layer_implementation.md",
],
])

40 changes: 40 additions & 0 deletions docs/src/design/contributing.md
@@ -0,0 +1,40 @@
# Contribution Guidelines

## Adding New Functionality/Layers

For style, we try to follow [SciMLStyle](https://github.com/SciML/SciMLStyle). The only reason we don't have a badge yet is that we haven't yet updated the package to follow all the guidelines. Here we document some additional guidelines we enforce:

### Mutability

See [SciMLStyle](https://github.com/SciML/SciMLStyle#out-of-place-and-immutability-is-preferred-when-sufficient-performant) for reference. This is strictly enforced, i.e. all layers/functions provided as part of the external API must be pure functions, even if they come with a performance penalty.
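
For illustration, a minimal sketch of the pattern (the `PureScale` layer and the free-standing `initialparameters` method here are hypothetical, not part of Lux):

```julia
using Random

# Hypothetical layer used only for illustration.
struct PureScale end

# Parameters are created once and returned; they are never mutated afterwards.
initialparameters(rng::AbstractRNG, ::PureScale) = (scale=rand(rng, Float32),)

# Out-of-place forward pass: broadcasting allocates a new output, and the
# (possibly updated) state is returned as a value instead of being mutated.
function (l::PureScale)(x, ps, st)
    y = ps.scale .* x
    return y, st
end

rng = Random.default_rng()
ps = initialparameters(rng, PureScale())
y, st = PureScale()(ones(Float32, 3), ps, (;))
```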

### Branching -- Generated Functions

Zygote doesn't like branches in code. Like it or not, we are stuck with it for the near future. Even if julia is able to optimize branches away, Zygote will most certainly throw away those optimizations (these can be tested via `Zygote.@code_ir`).

#### Writing efficient non-branching code to make Zygote happy

* Rely on `@generated` functions to remove **most** runtime branching. Some examples:
    * Layers behaving differently during training and inference -- we know at compile time whether a layer is being run in training/inference mode via `istraining(st)`.
    * Composite layers relying on a variable number of internal layers -- again, we know the number of internal layers at compile time, hence we can manually unroll the loops. See [`Parallel`](@ref), [`Chain`](@ref), etc.
* Pass around `Val` in state. `Flux.jl` sets `training` to be `(:auto, true, false)`. Hence, which branch will be evaluated has to be determined at runtime (*bad*). Instead, if we pass `Val(true)`, we can specialize functions directly on `true`, `false`, etc., ensuring there is no runtime cost for these operations. See [`BatchNorm`](@ref), [`Dropout`](@ref), etc. A minimal sketch is given below.
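
For instance, a standalone sketch (not the Lux implementation) of how dispatching on `Val` removes the runtime branch:

```julia
using Random

# Training branch: apply a dropout mask. Dispatch on Val{true} selects this
# method at compile time, so no `if training` branch survives in the final code.
dropout_branch(rng, x, p, ::Val{true}) = x .* (rand(rng, size(x)...) .> p) ./ (1 - p)

# Inference branch: identity, no mask and no branch.
dropout_branch(rng, x, p, ::Val{false}) = x

rng = Random.default_rng()
x = rand(rng, Float32, 4, 2)

y_train = dropout_branch(rng, x, 0.5, Val(true))   # specializes on Val{true}
y_infer = dropout_branch(rng, x, 0.5, Val(false))  # specializes on Val{false}
```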


## Guide to Documentation for Lux.jl

### Documentation for Layers

The first line must be indented by 4 spaces and should contain the possible ways to construct the layer. This should be followed by a description of what the layer does. If mathematical equations are needed to explain what the layer does, go for it. Oftentimes we fuse parameters to make computation faster; this should be reflected in the equations being used, i.e. the equations and the internal code must be consistent. (See [`LSTMCell`](@ref), [`GRUCell`](@ref) for some examples.)

!!! note
There is no need to document how the layers are being called since they **must** adhere to `layer(x, ps, st)`. Any deviation from that and the PR will not be accepted.

Next, we will have certain subsections (though not all of them are necessary for every layer). A skeleton docstring following this structure is sketched after the list below.

* **Arguments**: This section should be present unless the layer is constructed without any arguments (See [`NoOpLayer`](@ref)). All the arguments and their explicit constraints must be explained.
    * It is recommended to separate out the Keyword Arguments into their own section.
* **Inputs**: This section should always be present. List out the requirements `x` needs to satisfy. (Don't write about `ps` and `st` since those are expected by default.)
* **Returns**: What does the layer return? We know the second element is a state, but note whether that state is updated in any form.
* **Parameters**: What are the properties of the NamedTuple returned from `initialparameters`? Omit if the layer is parameterless.
* **States**: What are the properties of the NamedTuple returned from `initialstates`? Omit if the layer is stateless.
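
As an illustration, a skeleton docstring for a hypothetical `MyScale` layer (not an actual Lux layer) following this structure might look like:

```julia
"""
    MyScale(dims)

Scales the input elementwise by a learnable vector of length `dims`.

## Arguments

  - `dims`: length of the scaling vector

## Inputs

  - `x`: array whose first dimension has size `dims`

## Returns

  - Scaled array of the same size as `x`
  - Updated state `st` (unchanged for this layer)

## Parameters

  - `scale`: vector of length `dims`

## States

  - This layer is stateless; `NamedTuple()` is returned
"""
struct MyScale
    dims::Int
end
```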

19 changes: 0 additions & 19 deletions docs/src/design/core.md

This file was deleted.

17 changes: 0 additions & 17 deletions docs/src/design/documentation.md

This file was deleted.

@@ -1,42 +1,44 @@
# Recurrent Neural Networks
# Layer Implementation

## Cell Implementations
## Recurrent Neural Networks

### Explicit Management on End-User Side
### Cell Implementations

#### Explicit Management on End-User Side

!!! note
We currently use this implementation

The user is responsible for managing the memory and hidden states.

#### Pros
##### Pros

1. Simple Design and Implementation
2. Hard for the user to mess up, i.e. there is no explicit requirement to call things like `Flux.reset!`:
    * In the first call, the user passes the `input`.
    * In the subsequent calls, the user passes a tuple containing the `input`, `hidden_state` and `memory` (if needed). (A sketch of this calling convention is given at the end of this subsection.)

#### Cons
##### Cons

1. Requires more explicit management from the user, which might make it harder to use.
2. Currently the call order convention is not enforced, which could lead to sneaky errors. (Implementing a check is quite trivial if we store a call counter in the model `state`.)
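
To make the calling convention concrete, here is a minimal sketch (a hypothetical `MyRNNCell`, not the Lux implementation) of the two call forms:

```julia
using Random

# Hypothetical cell used only for illustration.
struct MyRNNCell
    in_dims::Int
    out_dims::Int
end

# First call: the user passes only the input, so the cell creates the hidden state.
function (cell::MyRNNCell)(x::AbstractMatrix, ps, st)
    h = zeros(Float32, cell.out_dims, size(x, 2))
    return cell((x, h), ps, st)
end

# Subsequent calls: the user passes a tuple of (input, hidden_state).
function (cell::MyRNNCell)((x, h)::Tuple, ps, st)
    h_new = tanh.(ps.Wx * x .+ ps.Wh * h)
    return (h_new, h_new), st   # (output, carry), state
end

rng = Random.default_rng()
cell = MyRNNCell(4, 8)
ps = (Wx=randn(rng, Float32, 8, 4), Wh=randn(rng, Float32, 8, 8))
st = (;)

x = randn(rng, Float32, 4, 16)
(y, carry), st = cell(x, ps, st)            # first timestep
(y, carry), st = cell((x, carry), ps, st)   # later timesteps
```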


### Store Hidden State and Memory in Model State
#### Store Hidden State and Memory in Model State

Storing the memory and hidden state in `st` would allow the user to just pass `x` without varying how calls are made at different time steps.

#### Pros
##### Pros

1. Easier for the end-user

#### Cons
##### Cons

1. `reset`ting the hidden state and memory is slightly tricky.
    1. One way would be to store an `initial_hidden_state` and `initial_memory` in the state alongside the `hidden_state` and `memory`.


## RNN Blocks
### RNN Blocks

!!! note
This is currently unimplemented
9 changes: 5 additions & 4 deletions docs/src/examples.md
@@ -1,10 +1,11 @@
!!! warning
These were not written as tutorials but as standalone scripts/packages for people to use

## Packages

* [Deep Equilibrium Models](https://github.com/SciML/FastDEQ.jl)

## Scripts

* [ImageNet Classification using Metalhead.jl Models](https://github.com/avik-pal/Lux.jl/tree/main/examples/ImageNet)


## Packages

See [Ecosystem](introduction/ecosystem.md) for more details
47 changes: 32 additions & 15 deletions docs/src/index.md
@@ -1,12 +1,16 @@
# Lux
# Introduction

Welcome to the documentation of Lux!

# What is Lux?

`Lux` is a julia deep learning framework which decouples models and parameterization using deeply nested named tuples.

- Functional Layer API -- Pure Functions and Deterministic Function Calls.
- No more implicit parameterization -- `Zygote.Params`. Everything is a `NamedTuple`.
- Functional Design -- Pure Functions and Deterministic Function Calls.
- No more implicit parameterization.
- Compiler and AD-friendly Neural Networks

# Installation
# Installation Guide

Install [julia v1.6 or above](https://julialang.org/downloads/).

@@ -15,7 +19,16 @@ using Pkg
Pkg.add("Lux")
```

# Quick Example
# Resources to Get Started

* Go through the [Quickstart Example](#quickstart).
* Read the introductory tutorials on [julia](https://jump.dev/JuMP.jl/stable/tutorials/getting_started/getting_started_with_julia/#Getting-started-with-Julia) and [Lux](introduction/overview.md)
* Go through the examples in the documentation, sorted by complexity

!!! tip
For usage-related questions, please use [Github Discussions](https://github.com/avik-pal/Lux.jl/discussions) or the [JuliaLang Discourse (machine learning domain)](https://discourse.julialang.org/c/domain/ml/), which allow questions and answers to be indexed. To report bugs, use [github issues](https://github.com/avik-pal/Lux.jl/issues) or, even better, send in a [pull request](https://github.com/avik-pal/Lux.jl/pulls).

# Quickstart

```julia
using Lux, Random, Optimisers, Zygote
@@ -33,15 +46,8 @@ Build the model

```julia
# Construct the layer
model = Chain(
BatchNorm(128),
Dense(128, 256, tanh),
BatchNorm(256),
Chain(
Dense(256, 1, tanh),
Dense(1, 10)
)
)
model = Chain(BatchNorm(128), Dense(128, 256, tanh), BatchNorm(256),
Chain(Dense(256, 1, tanh), Dense(1, 10)))
```

Models don't hold parameters and states so initialize them. From there on, we just use our standard AD and Optimisers API.
@@ -57,13 +63,24 @@ x = rand(rng, Float32, 128, 2) |> gpu
y, st = Lux.apply(model, x, ps, st)

# Gradients
gs = gradient(p -> sum(Lux.apply(model, x, p, st)[1]), ps)[1]
## Pullback API to capture change in state
(l, st_), pb = pullback(p -> Lux.apply(model, x, p, st), ps)
gs = pb((one.(l), nothing))

# Optimization
st_opt = Optimisers.setup(Optimisers.ADAM(0.0001), ps)
st_opt, ps = Optimisers.update(st_opt, ps, gs)
```

# How the documentation is structured

Having a high-level overview of how this documentation is structured will help you know where to look for certain things.

* `Introduction` -- Talks about why we wrote Lux and has pointers to frameworks in the extended julia ecosystem that might help users get started with deep learning.
* `Examples` -- Contain tutorials of varying complexity. These contain worked examples of solving problems with Lux. Start here if you are new to Lux, or you have a particular problem class you want to model.
* `API` -- Contains a complete list of the functions you can use in Lux. Look here if you want to know how to use a particular function.
* `Design Docs` -- Contains information for people contributing to Lux development or writing Lux extensions. Don't worry about this section if you are using Lux to formulate and solve problems as a user.

# Citation

If you found this library to be useful in academic work, then please cite: