From 1b7cf9db94530cdd475d50203731ff3eb1bcf5d2 Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Mon, 26 Sep 2022 14:37:52 -0400
Subject: [PATCH 1/7] add example to readme

---
 README.md | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 3359918830..074134b8f4 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,13 @@

-[![][action-img]][action-url] [![](https://img.shields.io/badge/docs-stable-blue.svg)](https://fluxml.github.io/Flux.jl/stable/) [![](https://img.shields.io/badge/chat-on%20slack-yellow.svg)](https://julialang.org/slack/) [![DOI](https://joss.theoj.org/papers/10.21105/joss.00602/status.svg)](https://doi.org/10.21105/joss.00602) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac) [![][codecov-img]][codecov-url]
+
+
+[![](https://img.shields.io/badge/docs-stable-blue.svg)](https://fluxml.github.io/Flux.jl/stable/) [![](https://img.shields.io/badge/chat-on%20slack-yellow.svg)](https://julialang.org/slack/) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac) [![DOI](https://joss.theoj.org/papers/10.21105/joss.00602/status.svg)](https://doi.org/10.21105/joss.00602)
+
+[![][action-img]][action-url] [![][codecov-img]][codecov-url]
+
+
 
 [action-img]: https://github.com/FluxML/Flux.jl/workflows/CI/badge.svg
 [action-url]: https://github.com/FluxML/Flux.jl/actions
@@ -12,7 +18,19 @@
 Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable.
 
 ```julia
-] add Flux
+using Flux
+
+x = hcat(digits.(0:3, base=2, pad=2)...) |> gpu
+y = Flux.onehotbatch(xor.(eachrow(x)...), 0:1) |> gpu
+data = ((Float32.(x), y) for _ in 1:100)
+
+model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2), softmax) |> gpu
+optim = Adam(0.1, (0.7, 0.95))
+loss(x, y) = Flux.crossentropy(model(x), y)
+
+Flux.train!(loss, Flux.params(model), data, optim)
+
+all((model(x) .> 0.5) .== y)
 ```
 
 See the [documentation](https://fluxml.github.io/Flux.jl/) or the [model zoo](https://github.com/FluxML/model-zoo/) for examples.

From b4900022e1491ed6e32e3e946fd8b8f74106cb17 Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Mon, 26 Sep 2022 15:15:58 -0400
Subject: [PATCH 2/7] add installation note, tweak

---
 README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 074134b8f4..3b0b7af872 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@
 [![](https://img.shields.io/badge/docs-stable-blue.svg)](https://fluxml.github.io/Flux.jl/stable/) [![](https://img.shields.io/badge/chat-on%20slack-yellow.svg)](https://julialang.org/slack/) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac) [![DOI](https://joss.theoj.org/papers/10.21105/joss.00602/status.svg)](https://doi.org/10.21105/joss.00602)
-
+
 [![][action-img]][action-url] [![][codecov-img]][codecov-url]
 
@@ -17,6 +17,7 @@
 Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable.
 
+Works best with [Julia 1.8](https://julialang.org/downloads/) or later. This will install everything (including CUDA) and solve the XOR problem:
 ```julia
 using Flux
@@ -33,6 +34,6 @@
 Flux.train!(loss, Flux.params(model), data, optim)
 
 all((model(x) .> 0.5) .== y)
 ```
 
-See the [documentation](https://fluxml.github.io/Flux.jl/) or the [model zoo](https://github.com/FluxML/model-zoo/) for examples.
+See the [documentation](https://fluxml.github.io/Flux.jl/) for details, the [website](https://fluxml.ai/tutorials.html) for tutorials, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples.
 
 If you use Flux in your research, please [cite](CITATION.bib) our work.

From a22a9044d53f0ef66b4fdb966db929f85a46bbf7 Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Tue, 27 Sep 2022 10:22:30 -0400
Subject: [PATCH 3/7] tweak wording, remove tutorials link

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 3b0b7af872..de42e0d140 100644
--- a/README.md
+++ b/README.md
@@ -17,7 +17,7 @@
 Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable.
 
-Works best with [Julia 1.8](https://julialang.org/downloads/) or later. This will install everything (including CUDA) and solve the XOR problem:
+Works best with [Julia 1.8](https://julialang.org/downloads/) or later. Pasting this in at the Julia prompt will install everything (including CUDA) and solve the XOR problem:
 ```julia
 using Flux
@@ -34,6 +34,6 @@
 Flux.train!(loss, Flux.params(model), data, optim)
 
 all((model(x) .> 0.5) .== y)
 ```
 
-See the [documentation](https://fluxml.github.io/Flux.jl/) for details, the [website](https://fluxml.ai/tutorials.html) for tutorials, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples.
+See the [documentation](https://fluxml.github.io/Flux.jl/) for details, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples.
 
 If you use Flux in your research, please [cite](CITATION.bib) our work.

From 37cdcca50d7001db14734fec7f6c6217d149cc8e Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Tue, 27 Sep 2022 10:30:31 -0400
Subject: [PATCH 4/7] try adding comments

---
 README.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index de42e0d140..7b6c5d3e96 100644
--- a/README.md
+++ b/README.md
@@ -17,21 +17,21 @@
 Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable.
 
-Works best with [Julia 1.8](https://julialang.org/downloads/) or later. Pasting this in at the Julia prompt will install everything (including CUDA) and solve the XOR problem:
+Works best with [Julia 1.8](https://julialang.org/downloads/) or later. Here's a simple example to try it out:
 ```julia
-using Flux
+using Flux # should install everything for you, including CUDA
 
-x = hcat(digits.(0:3, base=2, pad=2)...) |> gpu
+x = hcat(digits.(0:3, base=2, pad=2)...) |> gpu # let's solve the XOR problem!
 y = Flux.onehotbatch(xor.(eachrow(x)...), 0:1) |> gpu
-data = ((Float32.(x), y) for _ in 1:100)
+data = ((Float32.(x), y) for _ in 1:100) # an iterator making Tuples
 
 model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2), softmax) |> gpu
 optim = Adam(0.1, (0.7, 0.95))
 loss(x, y) = Flux.crossentropy(model(x), y)
 
-Flux.train!(loss, Flux.params(model), data, optim)
+Flux.train!(loss, Flux.params(model), data, optim) # updates model & optim
 
-all((model(x) .> 0.5) .== y)
+all((model(x) .> 0.5) .== y) # usually 100% accuracy.
 ```
 
 See the [documentation](https://fluxml.github.io/Flux.jl/) for details, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples.

From 66895131deac2361d531efcb9782849f5b2132cf Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Sun, 2 Oct 2022 07:17:53 -0400
Subject: [PATCH 5/7] mv slack to text, add downloads badge

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 7b6c5d3e96..2c3ec8f8ec 100644
--- a/README.md
+++ b/README.md
@@ -4,9 +4,9 @@
-[![](https://img.shields.io/badge/docs-stable-blue.svg)](https://fluxml.github.io/Flux.jl/stable/) [![](https://img.shields.io/badge/chat-on%20slack-yellow.svg)](https://julialang.org/slack/) [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac) [![DOI](https://joss.theoj.org/papers/10.21105/joss.00602/status.svg)](https://doi.org/10.21105/joss.00602)
+[![](https://img.shields.io/badge/Documentation-stable-blue.svg)](https://fluxml.github.io/Flux.jl/stable/) [![DOI](https://joss.theoj.org/papers/10.21105/joss.00602/status.svg)](https://doi.org/10.21105/joss.00602) [![Flux Downloads](https://shields.io/endpoint?url=https://pkgs.genieframework.com/api/v1/badge/Flux)](https://pkgs.genieframework.com?packages=Flux)
 
-[![][action-img]][action-url] [![][codecov-img]][codecov-url]
+[![][action-img]][action-url] [![][codecov-img]][codecov-url] [![ColPrac: Contributor's Guide on Collaborative Practices for Community Packages](https://img.shields.io/badge/ColPrac-Contributor's%20Guide-blueviolet)](https://github.com/SciML/ColPrac)
@@ -34,6 +34,6 @@
 Flux.train!(loss, Flux.params(model), data, optim) # updates model & optim
 
 all((model(x) .> 0.5) .== y) # usually 100% accuracy.
 ```
 
-See the [documentation](https://fluxml.github.io/Flux.jl/) for details, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples.
+See the [documentation](https://fluxml.github.io/Flux.jl/) for details, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples. Ask questions on the [Julia discourse](https://discourse.julialang.org/) or [slack](https://discourse.julialang.org/t/announcing-a-julia-slack/4866).
 
 If you use Flux in your research, please [cite](CITATION.bib) our work.

From 1a42167128ec01662131b9924a85a3e96593035b Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Thu, 6 Oct 2022 14:29:19 -0400
Subject: [PATCH 6/7] change to use logitcrossentropy

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 2c3ec8f8ec..284b7a1d6d 100644
--- a/README.md
+++ b/README.md
@@ -25,13 +25,13 @@
 x = hcat(digits.(0:3, base=2, pad=2)...) |> gpu # let's solve the XOR problem!
 y = Flux.onehotbatch(xor.(eachrow(x)...), 0:1) |> gpu
 data = ((Float32.(x), y) for _ in 1:100) # an iterator making Tuples
 
-model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2), softmax) |> gpu
+model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2)) |> gpu
 optim = Adam(0.1, (0.7, 0.95))
-loss(x, y) = Flux.crossentropy(model(x), y)
+loss(x, y) = Flux.logitcrossentropy(model(x), y)
 
 Flux.train!(loss, Flux.params(model), data, optim) # updates model & optim
 
-all((model(x) .> 0.5) .== y) # usually 100% accuracy.
+all((softmax(model(x)) .> 0.5) .== y) # usually 100% accuracy.
 ```
 
 See the [documentation](https://fluxml.github.io/Flux.jl/) for details, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples. Ask questions on the [Julia discourse](https://discourse.julialang.org/) or [slack](https://discourse.julialang.org/t/announcing-a-julia-slack/4866).
 
 If you use Flux in your research, please [cite](CITATION.bib) our work.

From e1ca78b61dd1103b1972b403e45b064fe834a4b4 Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Fri, 7 Oct 2022 09:58:29 -0400
Subject: [PATCH 7/7] mention that loss(x,y) closes over model

this is a slightly weird feature of our API... and since we also call
logitcrossentropy a loss function, perhaps we should emphasize that
loss(x,y) is a new thing just for this model, not a function for all
time.
---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 284b7a1d6d..439919f068 100644
--- a/README.md
+++ b/README.md
@@ -27,9 +27,9 @@
 data = ((Float32.(x), y) for _ in 1:100) # an iterator making Tuples
 
 model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2)) |> gpu
 optim = Adam(0.1, (0.7, 0.95))
-loss(x, y) = Flux.logitcrossentropy(model(x), y)
+mloss(x, y) = Flux.logitcrossentropy(model(x), y) # closes over model
 
-Flux.train!(loss, Flux.params(model), data, optim) # updates model & optim
+Flux.train!(mloss, Flux.params(model), data, optim) # updates model & optim
 
 all((softmax(model(x)) .> 0.5) .== y) # usually 100% accuracy.
 ```
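A note on [PATCH 6/7]: dropping `softmax` from the model and switching to `logitcrossentropy` is about numerical stability. The sketch below (plain Python for illustration; `softmax` and `logitcrossentropy` here are toy stand-ins, not Flux's implementations) shows the failure mode: for large logits the true-class softmax probability underflows to exactly zero, so `-log(p)` is not even computable, while computing the same loss from the logits via log-sum-exp stays exact.

```python
import math

def softmax(zs):
    m = max(zs)                      # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def logitcrossentropy(logits, target):
    # -log(softmax(logits)[target]), folded into logsumexp(logits) - logits[target]
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return lse - logits[target]

probs = logits_probs = softmax([1000.0, 0.0])
print(probs[1])                             # 0.0 — the true-class probability underflowed,
                                            # so -log(p) would need log(0)
print(logitcrossentropy([1000.0, 0.0], 1))  # 1000.0 — exact, no log(0) anywhere
```

This is the standard reason deep-learning libraries pair raw logits with a fused loss rather than composing `log` after `softmax`.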
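And on [PATCH 7/7]: the rename from `loss` to `mloss` underlines that this function closes over the particular `model` defined above it — it is a new thing built for that one model, not a general-purpose loss like `logitcrossentropy`. A minimal sketch of that closure behaviour (plain Python; the one-weight dict "model" is hypothetical, purely for illustration):

```python
def make_model():
    return {"w": 2.0}  # hypothetical single-weight "model"

model = make_model()

def mloss(x, y):
    # closes over `model` from the enclosing scope, so it is
    # specific to this model rather than a reusable loss function
    return (model["w"] * x - y) ** 2

print(mloss(3.0, 6.0))  # 0.0 — uses the current weight w = 2.0
model["w"] = 1.0        # training mutates the model in place...
print(mloss(3.0, 6.0))  # 9.0 — ...and mloss sees the update automatically
```

The same mechanism is why `Flux.train!` can update the model through `mloss` without the model being passed as an argument.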