From 098539985decdd566970ecbb340912d7281327a2 Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Wed, 12 Jun 2024 20:32:26 +0000 Subject: [PATCH] build based on 3427fcb --- dev/.documenter-siteinfo.json | 2 +- dev/about/index.html | 2 +- .../architecture_search/README/index.html | 2 +- .../architecture_search/notebook.ipynb | 10 +- .../architecture_search/notebook.jl | 2 +- .../notebook.unexecuted.ipynb | 2 +- .../architecture_search/notebook/index.html | 6 +- .../comparison/README/index.html | 2 +- .../comparison/notebook.ipynb | 8 +- dev/common_workflows/comparison/notebook.jl | 2 +- .../comparison/notebook.unexecuted.ipynb | 2 +- .../comparison/notebook/index.html | 4 +- .../composition/README/index.html | 2 +- .../composition/notebook.ipynb | 18 +- dev/common_workflows/composition/notebook.jl | 2 +- .../composition/notebook.unexecuted.ipynb | 4 +- .../composition/notebook/index.html | 6 +- .../early_stopping/README/index.html | 2 +- .../early_stopping/notebook.ipynb | 148 +++--- .../early_stopping/notebook.jl | 4 +- .../early_stopping/notebook.unexecuted.ipynb | 4 +- .../notebook/{f7d84d07.svg => f88f6e9e.svg} | 68 +-- .../early_stopping/notebook/index.html | 10 +- .../hyperparameter_tuning/README/index.html | 2 +- .../hyperparameter_tuning/notebook.ipynb | 444 ++++++++++++++++++ .../hyperparameter_tuning/notebook.jl | 2 +- .../notebook.unexecuted.ipynb | 2 +- .../notebook/{c6097af1.svg => 796cf2ee.svg} | 52 +- .../hyperparameter_tuning/notebook/index.html | 4 +- .../incremental_training/README/index.html | 2 +- .../incremental_training/notebook.ipynb | 18 +- .../incremental_training/notebook.jl | 2 +- .../notebook.unexecuted.ipynb | 14 +- .../incremental_training/notebook/index.html | 10 +- .../live_training/README/index.html | 2 +- .../live_training/notebook.jl | 2 +- .../live_training/notebook.unexecuted.ipynb | 4 +- .../live_training/notebook/index.html | 10 +- dev/contributing/index.html | 2 +- dev/extended_examples/Boston/index.html | 2 +- dev/extended_examples/MNIST/README/index.html | 2 +- .../notebook/{0555555c.svg => 1bde4ba3.svg} | 96 ++-- .../notebook/{c1bb4258.svg => a7a9f554.svg} | 72 +-- .../MNIST/notebook/index.html | 6 +- .../spam_detection/README/index.html | 2 +- .../spam_detection/notebook/index.html | 10 +- dev/index.html | 2 +- dev/interface/Builders/index.html | 4 +- dev/interface/Classification/index.html | 4 +- dev/interface/Custom Builders/index.html | 2 +- dev/interface/Image Classification/index.html | 2 +- .../Multitarget Regression/index.html | 2 +- dev/interface/Regression/index.html | 2 +- dev/interface/Summary/index.html | 4 +- dev/objects.inv | Bin 1784 -> 1786 bytes dev/search_index.js | 2 +- 56 files changed, 777 insertions(+), 321 deletions(-) rename dev/common_workflows/early_stopping/notebook/{f7d84d07.svg => f88f6e9e.svg} (86%) create mode 100644 dev/common_workflows/hyperparameter_tuning/notebook.ipynb rename dev/common_workflows/hyperparameter_tuning/notebook/{c6097af1.svg => 796cf2ee.svg} (85%) rename dev/extended_examples/MNIST/notebook/{0555555c.svg => 1bde4ba3.svg} (85%) rename dev/extended_examples/MNIST/notebook/{c1bb4258.svg => a7a9f554.svg} (85%) diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 51d17873..f52d4e09 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-12T06:26:36","documenter_version":"1.4.1"}} \ No newline at end of file 
+{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-12T20:32:22","documenter_version":"1.4.1"}} \ No newline at end of file diff --git a/dev/about/index.html b/dev/about/index.html index c4d102ae..74135d07 100644 --- a/dev/about/index.html +++ b/dev/about/index.html @@ -1,2 +1,2 @@ -- · MLJFlux
+- · MLJFlux
diff --git a/dev/common_workflows/architecture_search/README/index.html b/dev/common_workflows/architecture_search/README/index.html index 5560d70e..7e976562 100644 --- a/dev/common_workflows/architecture_search/README/index.html +++ b/dev/common_workflows/architecture_search/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.
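A minimal way to satisfy this requirement, mirroring the first cell of each notebook, is to activate and instantiate the bundled environment before running anything else (this assumes the code is run from inside the folder containing the two .toml files):

using Pkg
Pkg.activate(@__DIR__)   # activate the Project.toml / Manifest.toml shipped with this folder
Pkg.instantiate()        # install the exact pinned package versions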

diff --git a/dev/common_workflows/architecture_search/notebook.ipynb b/dev/common_workflows/architecture_search/notebook.ipynb index 958109de..286491d1 100644 --- a/dev/common_workflows/architecture_search/notebook.ipynb +++ b/dev/common_workflows/architecture_search/notebook.ipynb @@ -95,7 +95,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng = 123);\n", + "y, X = unpack(iris, ==(:Species), rng = 123);\n", "X = Float32.(X); # To be compatible with type of network network parameters\n", "first(X, 5)" ], @@ -130,7 +130,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (1, 1, 1), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = ComputationalResources.CPU1{Nothing}(nothing))" + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (1, 1, 1), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" }, "metadata": {}, "execution_count": 4 @@ -306,7 +306,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (21, 57, 25), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = ComputationalResources.CPU1{Nothing}(nothing))" + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (45, 49, 21), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" }, "metadata": {}, "execution_count": 8 @@ -341,9 +341,9 @@ { "output_type": "execute_result", "data": { - "text/plain": "\u001b[1m10×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m mlp \u001b[0m\u001b[1m measurement \u001b[0m\n │\u001b[90m MLP… \u001b[0m\u001b[90m Float64 \u001b[0m\n─────┼────────────────────────────────────────────\n 1 │ MLP(hidden = (21, 57, 25), …) 0.0867019\n 2 │ MLP(hidden = (45, 17, 13), …) 0.0929803\n 3 │ MLP(hidden = (33, 13, 49), …) 0.0973896\n 4 │ MLP(hidden = (21, 41, 61), …) 0.0981502\n 5 │ MLP(hidden = (57, 49, 61), …) 0.100331\n 6 │ MLP(hidden = (25, 25, 29), …) 0.101083\n 7 │ MLP(hidden = (29, 61, 21), …) 0.101466\n 8 │ MLP(hidden = (29, 61, 5), …) 0.107513\n 9 │ MLP(hidden = (21, 61, 17), …) 0.107874\n 10 │ MLP(hidden = (45, 49, 61), …) 0.111292", + "text/plain": "\u001b[1m10×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m mlp \u001b[0m\u001b[1m measurement \u001b[0m\n │\u001b[90m MLP… \u001b[0m\u001b[90m Float64 \u001b[0m\n─────┼────────────────────────────────────────────\n 1 │ MLP(hidden = (45, 49, 21), …) 0.0860875\n 2 │ MLP(hidden = (25, 45, 33), …) 0.0877367\n 3 │ MLP(hidden = (29, 17, 53), …) 
0.0970372\n 4 │ MLP(hidden = (61, 9, 29), …) 0.0970978\n 5 │ MLP(hidden = (49, 49, 9), …) 0.0971594\n 6 │ MLP(hidden = (21, 33, 61), …) 0.0984172\n 7 │ MLP(hidden = (57, 61, 61), …) 0.099232\n 8 │ MLP(hidden = (41, 13, 25), …) 0.101498\n 9 │ MLP(hidden = (53, 29, 21), …) 0.105323\n 10 │ MLP(hidden = (57, 33, 45), …) 0.110168", "text/html": [ - "
10×2 DataFrame
 Row │ mlp                             measurement
     │ MLP…                            Float64
─────┼─────────────────────────────────────────────
   1 │ MLP(hidden = (21, 57, 25), …)   0.0867019
   2 │ MLP(hidden = (45, 17, 13), …)   0.0929803
   3 │ MLP(hidden = (33, 13, 49), …)   0.0973896
   4 │ MLP(hidden = (21, 41, 61), …)   0.0981502
   5 │ MLP(hidden = (57, 49, 61), …)   0.100331
   6 │ MLP(hidden = (25, 25, 29), …)   0.101083
   7 │ MLP(hidden = (29, 61, 21), …)   0.101466
   8 │ MLP(hidden = (29, 61, 5), …)    0.107513
   9 │ MLP(hidden = (21, 61, 17), …)   0.107874
  10 │ MLP(hidden = (45, 49, 61), …)   0.111292
" + "
10×2 DataFrame
 Row │ mlp                             measurement
     │ MLP…                            Float64
─────┼─────────────────────────────────────────────
   1 │ MLP(hidden = (45, 49, 21), …)   0.0860875
   2 │ MLP(hidden = (25, 45, 33), …)   0.0877367
   3 │ MLP(hidden = (29, 17, 53), …)   0.0970372
   4 │ MLP(hidden = (61, 9, 29), …)    0.0970978
   5 │ MLP(hidden = (49, 49, 9), …)    0.0971594
   6 │ MLP(hidden = (21, 33, 61), …)   0.0984172
   7 │ MLP(hidden = (57, 61, 61), …)   0.099232
   8 │ MLP(hidden = (41, 13, 25), …)   0.101498
   9 │ MLP(hidden = (53, 29, 21), …)   0.105323
  10 │ MLP(hidden = (57, 33, 45), …)   0.110168
" ] }, "metadata": {}, diff --git a/dev/common_workflows/architecture_search/notebook.jl b/dev/common_workflows/architecture_search/notebook.jl index 61ba5d49..a5e4a15a 100644 --- a/dev/common_workflows/architecture_search/notebook.jl +++ b/dev/common_workflows/architecture_search/notebook.jl @@ -25,7 +25,7 @@ import Optimisers # native Flux.jl optimisers no longer supported # ### Loading and Splitting the Data iris = RDatasets.dataset("datasets", "iris"); -y, X = unpack(iris, ==(:Species), colname -> true, rng = 123); +y, X = unpack(iris, ==(:Species), rng = 123); X = Float32.(X); # To be compatible with type of network network parameters first(X, 5) diff --git a/dev/common_workflows/architecture_search/notebook.unexecuted.ipynb b/dev/common_workflows/architecture_search/notebook.unexecuted.ipynb index 85b68135..6093c80e 100644 --- a/dev/common_workflows/architecture_search/notebook.unexecuted.ipynb +++ b/dev/common_workflows/architecture_search/notebook.unexecuted.ipynb @@ -75,7 +75,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng = 123);\n", + "y, X = unpack(iris, ==(:Species), rng = 123);\n", "X = Float32.(X); # To be compatible with type of network network parameters\n", "first(X, 5)" ], diff --git a/dev/common_workflows/architecture_search/notebook/index.html b/dev/common_workflows/architecture_search/notebook/index.html index ce6cedb0..e2ae9c7f 100644 --- a/dev/common_workflows/architecture_search/notebook/index.html +++ b/dev/common_workflows/architecture_search/notebook/index.html @@ -4,7 +4,7 @@ using RDatasets: RDatasets # Dataset source using DataFrames # To view tuning results in a table import Optimisers # native Flux.jl optimisers no longer supported

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
-y, X = unpack(iris, ==(:Species), colname -> true, rng = 123);
+y, X = unpack(iris, ==(:Species), rng = 123);
 X = Float32.(X);      # To be compatible with type of network parameters
 first(X, 5)
5×4 DataFrame
 Row │ SepalLength  SepalWidth  PetalLength  PetalWidth
     │ Float32      Float32     Float32      Float32
─────┼───────────────────────────────────────────────────
   1 │         6.7         3.3          5.7         2.1
   2 │         5.7         2.8          4.1         1.3
   3 │         7.2         3.0          5.8         1.6
   4 │         4.4         2.9          1.4         0.2
   5 │         5.6         2.5          3.9         1.1

Instantiating the model

Now let's construct our model. This follows a similar setup to the one followed in the Quick Start.

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg = "MLJFlux"
 clf = NeuralNetworkClassifier(
@@ -85,7 +85,7 @@
 fit!(mach, verbosity = 0);
 fitted_params(mach).best_model
NeuralNetworkClassifier(
   builder = MLP(
-        hidden = (9, 57, 25), 
+        hidden = (41, 29, 21), 
         σ = NNlib.relu), 
   finaliser = NNlib.softmax, 
   optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), 
@@ -101,4 +101,4 @@
     mlp = [x[:model].builder for x in history],
     measurement = [x[:measurement][1] for x in history],
 )
-first(sort!(history_df, [order(:measurement)]), 10)
10×2 DataFrame
 Row │ mlp                             measurement
     │ MLP…                            Float64
─────┼─────────────────────────────────────────────
   1 │ MLP(hidden = (9, 57, 25), …)    0.0766256
   2 │ MLP(hidden = (25, 29, 53), …)   0.0867144
   3 │ MLP(hidden = (53, 21, 49), …)   0.0884685
   4 │ MLP(hidden = (49, 49, 33), …)   0.0950018
   5 │ MLP(hidden = (49, 49, 9), …)    0.0971594
   6 │ MLP(hidden = (61, 37, 49), …)   0.0990528
   7 │ MLP(hidden = (61, 45, 29), …)   0.100818
   8 │ MLP(hidden = (57, 41, 9), …)    0.102602
   9 │ MLP(hidden = (33, 13, 53), …)   0.103152
  10 │ MLP(hidden = (45, 49, 49), …)   0.103885

This page was generated using Literate.jl.

+first(sort!(history_df, [order(:measurement)]), 10)
10×2 DataFrame
 Row │ mlp                             measurement
     │ MLP…                            Float64
─────┼─────────────────────────────────────────────
   1 │ MLP(hidden = (41, 29, 21), …)   0.0722895
   2 │ MLP(hidden = (49, 49, 41), …)   0.0949757
   3 │ MLP(hidden = (53, 53, 45), …)   0.0954839
   4 │ MLP(hidden = (61, 9, 29), …)    0.0970978
   5 │ MLP(hidden = (33, 13, 49), …)   0.0973896
   6 │ MLP(hidden = (45, 41, 61), …)   0.0998508
   7 │ MLP(hidden = (33, 29, 29), …)   0.100823
   8 │ MLP(hidden = (17, 57, 5), …)    0.101959
   9 │ MLP(hidden = (37, 17, 49), …)   0.106953
  10 │ MLP(hidden = (33, 21, 45), …)   0.110329

This page was generated using Literate.jl.

diff --git a/dev/common_workflows/comparison/README/index.html b/dev/common_workflows/comparison/README/index.html index 6701101c..c5d08e44 100644 --- a/dev/common_workflows/comparison/README/index.html +++ b/dev/common_workflows/comparison/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/common_workflows/comparison/notebook.ipynb b/dev/common_workflows/comparison/notebook.ipynb index 8163b302..d968843e 100644 --- a/dev/common_workflows/comparison/notebook.ipynb +++ b/dev/common_workflows/comparison/notebook.ipynb @@ -81,7 +81,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);" + "y, X = unpack(iris, ==(:Species), rng=123);" ], "metadata": {}, "execution_count": 3 @@ -107,7 +107,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 50, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = ComputationalResources.CPU1{Nothing}(nothing))" + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 50, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" }, "metadata": {}, "execution_count": 4 @@ -271,9 +271,9 @@ { "output_type": "execute_result", "data": { - "text/plain": "\u001b[1m4×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m mlp \u001b[0m\u001b[1m measurement \u001b[0m\n │\u001b[90m Probabil… \u001b[0m\u001b[90m Float64 \u001b[0m\n─────┼────────────────────────────────────────────────\n 1 │ BayesianLDA(method = gevd, …) 0.0610826\n 2 │ NeuralNetworkClassifier(builder … 0.0857014\n 3 │ RandomForestClassifier(max_depth… 0.102881\n 4 │ ProbabilisticTunedModel(model = … 0.221056", + "text/plain": "\u001b[1m4×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m mlp \u001b[0m\u001b[1m measurement \u001b[0m\n │\u001b[90m Probabil… \u001b[0m\u001b[90m Float64 \u001b[0m\n─────┼────────────────────────────────────────────────\n 1 │ BayesianLDA(method = gevd, …) 0.0610826\n 2 │ NeuralNetworkClassifier(builder … 0.0857014\n 3 │ RandomForestClassifier(max_depth… 0.107885\n 4 │ ProbabilisticTunedModel(model = … 0.221056", "text/html": [ - "
4×2 DataFrame
 Row │ mlp                                                                  measurement
     │ Probabil…                                                            Float64
─────┼──────────────────────────────────────────────────────────────────────────────────
   1 │ BayesianLDA(method = gevd, …)                                        0.0610826
   2 │ NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)        0.0857014
   3 │ RandomForestClassifier(max_depth = -1, …)                            0.102881
   4 │ ProbabilisticTunedModel(model = XGBoostClassifier(test = 1, …), …)   0.221056
" + "
4×2 DataFrame
 Row │ mlp                                                                  measurement
     │ Probabil…                                                            Float64
─────┼──────────────────────────────────────────────────────────────────────────────────
   1 │ BayesianLDA(method = gevd, …)                                        0.0610826
   2 │ NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)        0.0857014
   3 │ RandomForestClassifier(max_depth = -1, …)                            0.107885
   4 │ ProbabilisticTunedModel(model = XGBoostClassifier(test = 1, …), …)   0.221056
" ] }, "metadata": {}, diff --git a/dev/common_workflows/comparison/notebook.jl b/dev/common_workflows/comparison/notebook.jl index 4d75c49d..6716ec52 100644 --- a/dev/common_workflows/comparison/notebook.jl +++ b/dev/common_workflows/comparison/notebook.jl @@ -23,7 +23,7 @@ import Optimisers # native Flux.jl optimisers no longer supported # ### Loading and Splitting the Data iris = RDatasets.dataset("datasets", "iris"); -y, X = unpack(iris, ==(:Species), colname -> true, rng=123); +y, X = unpack(iris, ==(:Species), rng=123); # ### Instantiating the models Now let's construct our model. This follows a similar setup diff --git a/dev/common_workflows/comparison/notebook.unexecuted.ipynb b/dev/common_workflows/comparison/notebook.unexecuted.ipynb index b8517a90..65e472ff 100644 --- a/dev/common_workflows/comparison/notebook.unexecuted.ipynb +++ b/dev/common_workflows/comparison/notebook.unexecuted.ipynb @@ -73,7 +73,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);" + "y, X = unpack(iris, ==(:Species), rng=123);" ], "metadata": {}, "execution_count": null diff --git a/dev/common_workflows/comparison/notebook/index.html b/dev/common_workflows/comparison/notebook/index.html index 6e9a243d..7c5ff317 100644 --- a/dev/common_workflows/comparison/notebook/index.html +++ b/dev/common_workflows/comparison/notebook/index.html @@ -4,7 +4,7 @@ import RDatasets # Dataset source using DataFrames # To visualize hyperparameter search results import Optimisers # native Flux.jl optimisers no longer supported

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
-y, X = unpack(iris, ==(:Species), colname -> true, rng=123);

Instantiating the models

Now let's construct our models. This follows a similar setup

to the one followed in the Quick Start.

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
+y, X = unpack(iris, ==(:Species), rng=123);

Instantiating the models

Now let's construct our models. This follows a similar setup

to the one followed in the Quick Start.

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
 
 clf1 = NeuralNetworkClassifier(
     builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),
@@ -57,4 +57,4 @@
     mlp = [x[:model] for x in history],
     measurement = [x[:measurement][1] for x in history],
 )
-sort!(history_df, [order(:measurement)])
4×2 DataFrame
 Row │ mlp                                                                  measurement
     │ Probabil…                                                            Float64
─────┼──────────────────────────────────────────────────────────────────────────────────
   1 │ BayesianLDA(method = gevd, …)                                        0.0610826
   2 │ NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)        0.0857014
   3 │ RandomForestClassifier(max_depth = -1, …)                            0.115788
   4 │ ProbabilisticTunedModel(model = XGBoostClassifier(test = 1, …), …)   0.221056

This is Occam's razor in practice: the simplest model, BayesianLDA, achieves the lowest estimated loss.


This page was generated using Literate.jl.

+sort!(history_df, [order(:measurement)])
4×2 DataFrame
 Row │ mlp                                                                  measurement
     │ Probabil…                                                            Float64
─────┼──────────────────────────────────────────────────────────────────────────────────
   1 │ BayesianLDA(method = gevd, …)                                        0.0610826
   2 │ NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)        0.0857014
   3 │ RandomForestClassifier(max_depth = -1, …)                            0.101267
   4 │ ProbabilisticTunedModel(model = XGBoostClassifier(test = 1, …), …)   0.221056

This is Occam's razor in practice: the simplest model, BayesianLDA, achieves the lowest estimated loss.


This page was generated using Literate.jl.

diff --git a/dev/common_workflows/composition/README/index.html b/dev/common_workflows/composition/README/index.html index 7c612781..4567c981 100644 --- a/dev/common_workflows/composition/README/index.html +++ b/dev/common_workflows/composition/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/common_workflows/composition/notebook.ipynb b/dev/common_workflows/composition/notebook.ipynb index ced33e3c..306a24c6 100644 --- a/dev/common_workflows/composition/notebook.ipynb +++ b/dev/common_workflows/composition/notebook.ipynb @@ -10,7 +10,7 @@ { "cell_type": "markdown", "source": [ - "This tutorial is available as a Jupyter notebook or julia script\n", + "This demonstration is available as a Jupyter notebook or julia script\n", "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/composition)." ], "metadata": {} @@ -83,7 +83,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X); # To be compatible with type of network network parameters" ], "metadata": {}, @@ -146,7 +146,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "MLJFlux.NeuralNetworkClassifier" + "text/plain": "NeuralNetworkClassifier" }, "metadata": {}, "execution_count": 5 @@ -173,7 +173,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 50, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = ComputationalResources.CPU1{Nothing}(nothing))" + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 50, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" }, "metadata": {}, "execution_count": 6 @@ -284,7 +284,7 @@ "\rProgress: 13%|███████▏ | ETA: 0:00:01\u001b[K\rProgress: 100%|█████████████████████████████████████████████████████| Time: 0:00:00\u001b[K\n", "\rProgress: 67%|███████████████████████████████████▍ | ETA: 0:00:01\u001b[K\r\n", " class: virginica\u001b[K\r\u001b[A[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 1, \"versicolor\" => 2).\n", - "\rOptimising neural net: 4%[> ] ETA: 0:05:10\u001b[K\rOptimising neural net: 6%[=> ] ETA: 0:03:22\u001b[K\rOptimising neural net: 8%[=> ] ETA: 0:02:29\u001b[K\rOptimising neural net: 10%[==> ] ETA: 0:01:56\u001b[K\rOptimising neural net: 12%[==> ] ETA: 0:01:35\u001b[K\rOptimising neural net: 14%[===> ] ETA: 0:01:20\u001b[K\rOptimising neural net: 16%[===> ] ETA: 0:01:08\u001b[K\rOptimising neural net: 18%[====> ] ETA: 0:00:59\u001b[K\rOptimising neural net: 20%[====> ] ETA: 0:00:52\u001b[K\rOptimising neural net: 22%[=====> ] ETA: 0:00:46\u001b[K\rOptimising neural net: 24%[=====> ] ETA: 0:00:41\u001b[K\rOptimising neural net: 25%[======> ] ETA: 0:00:37\u001b[K\rOptimising neural net: 27%[======> ] ETA: 0:00:33\u001b[K\rOptimising neural net: 29%[=======> ] ETA: 0:00:30\u001b[K\rOptimising neural net: 31%[=======> ] ETA: 0:00:28\u001b[K\rOptimising neural net: 33%[========> ] ETA: 0:00:25\u001b[K\rOptimising neural net: 35%[========> ] ETA: 0:00:23\u001b[K\rOptimising neural net: 37%[=========> ] ETA: 0:00:21\u001b[K\rOptimising neural net: 39%[=========> ] ETA: 0:00:20\u001b[K\rOptimising neural net: 41%[==========> 
] ETA: 0:00:18\u001b[K\rOptimising neural net: 43%[==========> ] ETA: 0:00:17\u001b[K\rOptimising neural net: 45%[===========> ] ETA: 0:00:15\u001b[K\rOptimising neural net: 47%[===========> ] ETA: 0:00:14\u001b[K\rOptimising neural net: 49%[============> ] ETA: 0:00:13\u001b[K\rOptimising neural net: 51%[============> ] ETA: 0:00:12\u001b[K\rOptimising neural net: 53%[=============> ] ETA: 0:00:11\u001b[K\rOptimising neural net: 55%[=============> ] ETA: 0:00:10\u001b[K\rOptimising neural net: 57%[==============> ] ETA: 0:00:10\u001b[K\rOptimising neural net: 59%[==============> ] ETA: 0:00:09\u001b[K\rOptimising neural net: 61%[===============> ] ETA: 0:00:08\u001b[K\rOptimising neural net: 63%[===============> ] ETA: 0:00:08\u001b[K\rOptimising neural net: 82%[====================> ] ETA: 0:00:03\u001b[K\rOptimising neural net: 84%[=====================> ] ETA: 0:00:02\u001b[K\rOptimising neural net: 86%[=====================> ] ETA: 0:00:02\u001b[K\rOptimising neural net: 88%[======================> ] ETA: 0:00:02\u001b[K\rOptimising neural net: 90%[======================> ] ETA: 0:00:01\u001b[K\rOptimising neural net: 92%[=======================> ] ETA: 0:00:01\u001b[K\rOptimising neural net: 94%[=======================> ] ETA: 0:00:01\u001b[K\rOptimising neural net: 96%[========================>] ETA: 0:00:01\u001b[K\rOptimising neural net: 98%[========================>] ETA: 0:00:00\u001b[K\rOptimising neural net: 100%[=========================] Time: 0:00:12\u001b[K\n", + "\rOptimising neural net: 4%[> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 6%[=> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 8%[=> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 10%[==> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 12%[==> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 14%[===> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 16%[===> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 18%[====> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 20%[====> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 22%[=====> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 24%[=====> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 25%[======> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 27%[======> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 29%[=======> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 31%[=======> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 33%[========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 35%[========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 37%[=========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 39%[=========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 41%[==========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 43%[==========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 45%[===========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 47%[===========> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 49%[============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 51%[============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 53%[=============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 55%[=============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 57%[==============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 59%[==============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 61%[===============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 63%[===============> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 65%[================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 67%[================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 
69%[=================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 71%[=================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 73%[==================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 75%[==================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 76%[===================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 78%[===================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 80%[====================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 82%[====================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 84%[=====================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 86%[=====================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 88%[======================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 90%[======================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 92%[=======================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 94%[=======================> ] ETA: 0:00:00\u001b[K\rOptimising neural net: 96%[========================>] ETA: 0:00:00\u001b[K\rOptimising neural net: 98%[========================>] ETA: 0:00:00\u001b[K\rOptimising neural net: 100%[=========================] Time: 0:00:00\u001b[K\n", "[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 3, \"versicolor\" => 1).\n", "[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 3, \"versicolor\" => 1).\n", "[ Info: After filtering, the mapping from each class to number of borderline points is (\"versicolor\" => 2).\n", @@ -298,18 +298,18 @@ "│ layer = Dense(4 => 5, relu) # 25 parameters\n", "│ summary(x) = \"4×8 Matrix{Float64}\"\n", "└ @ Flux ~/.julia/packages/Flux/Wz6D4/src/layers/stateless.jl:60\n", - "\rEvaluating over 5 folds: 40%[==========> ] ETA: 0:00:16\u001b[K[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 1, \"versicolor\" => 2).\n", + "\rEvaluating over 5 folds: 40%[==========> ] ETA: 0:00:10\u001b[K[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 1, \"versicolor\" => 2).\n", "[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 1, \"versicolor\" => 2).\n", - "\rEvaluating over 5 folds: 60%[===============> ] ETA: 0:00:07\u001b[K[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 1).\n", + "\rEvaluating over 5 folds: 60%[===============> ] ETA: 0:00:05\u001b[K[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 1).\n", "┌ Warning: Cannot oversample a class with no borderline points. Skipping.\n", "└ @ Imbalance ~/.julia/packages/Imbalance/knJL1/src/oversampling_methods/borderline_smote1/borderline_smote1.jl:67\n", "\rProgress: 67%|███████████████████████████████████▍ | ETA: 0:00:00\u001b[K\r\n", " class: virginica\u001b[K\r\u001b[A[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 1).\n", "┌ Warning: Cannot oversample a class with no borderline points. 
Skipping.\n", "└ @ Imbalance ~/.julia/packages/Imbalance/knJL1/src/oversampling_methods/borderline_smote1/borderline_smote1.jl:67\n", - "\rEvaluating over 5 folds: 80%[====================> ] ETA: 0:00:03\u001b[K[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 3, \"versicolor\" => 3).\n", + "\rEvaluating over 5 folds: 80%[====================> ] ETA: 0:00:02\u001b[K[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 3, \"versicolor\" => 3).\n", "[ Info: After filtering, the mapping from each class to number of borderline points is (\"virginica\" => 3, \"versicolor\" => 3).\n", - "\rEvaluating over 5 folds: 100%[=========================] Time: 0:00:11\u001b[K\n" + "\rEvaluating over 5 folds: 100%[=========================] Time: 0:00:07\u001b[K\n" ] }, { diff --git a/dev/common_workflows/composition/notebook.jl b/dev/common_workflows/composition/notebook.jl index 182021eb..b617a4b6 100644 --- a/dev/common_workflows/composition/notebook.jl +++ b/dev/common_workflows/composition/notebook.jl @@ -26,7 +26,7 @@ import Optimisers # native Flux.jl optimisers no longer supported # ### Loading and Splitting the Data iris = RDatasets.dataset("datasets", "iris"); -y, X = unpack(iris, ==(:Species), colname -> true, rng=123); +y, X = unpack(iris, ==(:Species), rng=123); X = Float32.(X); # To be compatible with type of network network parameters # To simulate an imbalanced dataset, we will take a random sample: diff --git a/dev/common_workflows/composition/notebook.unexecuted.ipynb b/dev/common_workflows/composition/notebook.unexecuted.ipynb index 54b2439a..ef75b9ab 100644 --- a/dev/common_workflows/composition/notebook.unexecuted.ipynb +++ b/dev/common_workflows/composition/notebook.unexecuted.ipynb @@ -10,7 +10,7 @@ { "cell_type": "markdown", "source": [ - "This tutorial is available as a Jupyter notebook or julia script\n", + "This demonstration is available as a Jupyter notebook or julia script\n", "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/composition)." ], "metadata": {} @@ -75,7 +75,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X); # To be compatible with type of network network parameters" ], "metadata": {}, diff --git a/dev/common_workflows/composition/notebook/index.html b/dev/common_workflows/composition/notebook/index.html index 808ca441..2e8a1172 100644 --- a/dev/common_workflows/composition/notebook/index.html +++ b/dev/common_workflows/composition/notebook/index.html @@ -1,11 +1,11 @@ -Model Composition · MLJFlux

Model Composition with MLJFlux

This tutorial is available as a Jupyter notebook or julia script here.

In this workflow example, we see how MLJFlux enables composing MLJ models with MLJFlux models. We will assume a class imbalance setting and wrap an oversampler with a deep learning model from MLJFlux.

Julia version is assumed to be 1.10.*

Basic Imports

using MLJ               # Has MLJFlux models
+Model Composition · MLJFlux

Model Composition with MLJFlux

This demonstration is available as a Jupyter notebook or julia script here.

In this workflow example, we see how MLJFlux enables composing MLJ models with MLJFlux models. We will assume a class imbalance setting and wrap an oversampler with a deep learning model from MLJFlux.

Julia version is assumed to be 1.10.*

Basic Imports

using MLJ               # Has MLJFlux models
 using Flux              # For more flexibility
 import RDatasets        # Dataset source
 import Random           # To create imbalance
 import Imbalance        # To solve the imbalance
 import Optimisers       # native Flux.jl optimisers no longer supported

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
-y, X = unpack(iris, ==(:Species), colname -> true, rng=123);
+y, X = unpack(iris, ==(:Species), rng=123);
 X = Float32.(X);      # To be compatible with type of network parameters

To simulate an imbalanced dataset, we will take a random sample:

Random.seed!(803429)
 subset_indices = rand(1:size(X, 1), 100)
 X, y = X[subset_indices, :], y[subset_indices]
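The composition step itself is elided from this excerpt. As a rough sketch of the kind of wrapper the tutorial goes on to build (this is not the page's own code: the BalancedModel constructor from MLJBalancing.jl, its keyword names, and the registry name BorderlineSMOTE1 are assumptions here), one would combine an oversampler with the MLJFlux classifier along these lines and then use the composite exactly like any other MLJ model:

import MLJBalancing                      # assumed to provide `BalancedModel`
BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance verbosity=0   # assumed registry name
oversampler = BorderlineSMOTE1(rng=42)

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
clf = NeuralNetworkClassifier(
    builder=MLJFlux.MLP(; hidden=(5, 4), σ=Flux.relu),
    optimiser=Optimisers.Adam(0.01),
    epochs=50,
    rng=42,
)

# Oversampling is applied to the training data only; the wrapped object
# behaves like a single probabilistic classifier.
balanced_clf = MLJBalancing.BalancedModel(model=clf, balancer1=oversampler)
mach = machine(balanced_clf, X, y)       # X, y as prepared above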
@@ -65,4 +65,4 @@
 ├────────────────────────────┼─────────┤
 │ [1.0, 1.0, 0.95, 1.0, 1.0] │ 0.0219  │
 └────────────────────────────┴─────────┘
-

This page was generated using Literate.jl.

+

This page was generated using Literate.jl.

diff --git a/dev/common_workflows/early_stopping/README/index.html b/dev/common_workflows/early_stopping/README/index.html index 459b303c..4d7a6a34 100644 --- a/dev/common_workflows/early_stopping/README/index.html +++ b/dev/common_workflows/early_stopping/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/common_workflows/early_stopping/notebook.ipynb b/dev/common_workflows/early_stopping/notebook.ipynb index bbdda628..9f136402 100644 --- a/dev/common_workflows/early_stopping/notebook.ipynb +++ b/dev/common_workflows/early_stopping/notebook.ipynb @@ -3,7 +3,7 @@ { "cell_type": "markdown", "source": [ - "# Early Stopping with MLJFlux" + "# Early Stopping with MLJ" ], "metadata": {} }, @@ -81,7 +81,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X); # To be compatible with type of network network parameters" ], "metadata": {}, @@ -108,7 +108,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 50, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = ComputationalResources.CPU1{Nothing}(nothing))" + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 50, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" }, "metadata": {}, "execution_count": 4 @@ -148,7 +148,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "5-element Vector{Any}:\n IterationControl.Step(1)\n EarlyStopping.NumberLimit(100)\n EarlyStopping.Patience(5)\n EarlyStopping.NumberSinceBest(9)\n EarlyStopping.TimeLimit(Dates.Millisecond(1800000))" + "text/plain": "5-element Vector{Any}:\n Step(1)\n NumberLimit(100)\n Patience(5)\n NumberSinceBest(9)\n TimeLimit(Dates.Millisecond(1800000))" }, "metadata": {}, "execution_count": 5 @@ -179,7 +179,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "1-element Vector{IterationControl.WithLossDo{Main.var\"##351\".var\"#3#4\"}}:\n IterationControl.WithLossDo{Main.var\"##351\".var\"#3#4\"}(Main.var\"##351\".var\"#3#4\"(), false, nothing)" + "text/plain": "1-element Vector{WithLossDo{Main.var\"##267\".var\"#1#2\"}}:\n WithLossDo{Main.var\"##267\".var\"#1#2\"}(Main.var\"##267\".var\"#1#2\"(), false, nothing)" }, "metadata": {}, "execution_count": 6 @@ -250,7 +250,7 @@ "[ Info: Training machine(ProbabilisticIteratedModel(model = NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …), …), …).\n", "[ Info: final loss: 0.05287897645527522\n", "[ Info: final training loss: 0.045833383\n", - "[ Info: Stop triggered by EarlyStopping.NumberLimit(100) stopping criterion. \n", + "[ Info: Stop triggered by NumberLimit(100) stopping criterion. \n", "[ Info: Total of 100 iterations. 
\n" ] } @@ -290,101 +290,101 @@ "\n", "\n", "\n", - " \n", + " \n", " \n", " \n", "\n", - "\n", + "\n", "\n", - " \n", + " \n", " \n", " \n", "\n", - "\n", + "\n", "\n", - " \n", + " \n", " \n", " \n", "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n" + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n" ], "image/svg+xml": [ "\n", "\n", "\n", - " \n", + " \n", " \n", " \n", "\n", - "\n", + "\n", "\n", - " \n", + " \n", " \n", " \n", "\n", - "\n", + "\n", "\n", - " \n", + " \n", " \n", " \n", "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n" + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n" ] }, "metadata": {}, diff --git a/dev/common_workflows/early_stopping/notebook.jl b/dev/common_workflows/early_stopping/notebook.jl index a6c59da3..adcf39f7 100644 --- a/dev/common_workflows/early_stopping/notebook.jl +++ b/dev/common_workflows/early_stopping/notebook.jl @@ -10,7 +10,7 @@ using Pkg #!md Pkg.activate(@__DIR__); #!md Pkg.instantiate(); #!md -# **Julia version** is assumed to be 1.10.* +# **Julia version** is assumed to be 1.10.* # ### Basic Imports @@ -24,7 +24,7 @@ import Optimisers # native Flux.jl optimisers no longer supported # ### Loading and Splitting the Data iris = RDatasets.dataset("datasets", "iris"); -y, X = unpack(iris, ==(:Species), colname -> true, rng=123); +y, X = unpack(iris, ==(:Species), rng=123); X = Float32.(X); # To be compatible with type of network network parameters diff --git a/dev/common_workflows/early_stopping/notebook.unexecuted.ipynb b/dev/common_workflows/early_stopping/notebook.unexecuted.ipynb index 5effdb73..4441ab52 100644 --- a/dev/common_workflows/early_stopping/notebook.unexecuted.ipynb +++ b/dev/common_workflows/early_stopping/notebook.unexecuted.ipynb @@ -3,7 +3,7 @@ { "cell_type": "markdown", "source": [ - "# Early Stopping with MLJFlux" + "# Early Stopping with MLJ" ], "metadata": {} }, @@ -73,7 +73,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X); # To be compatible with type of network network parameters" ], "metadata": {}, diff --git a/dev/common_workflows/early_stopping/notebook/f7d84d07.svg b/dev/common_workflows/early_stopping/notebook/f88f6e9e.svg similarity index 86% rename from dev/common_workflows/early_stopping/notebook/f7d84d07.svg rename to dev/common_workflows/early_stopping/notebook/f88f6e9e.svg index 7bc7fa7a..4e284fb1 100644 --- a/dev/common_workflows/early_stopping/notebook/f7d84d07.svg +++ b/dev/common_workflows/early_stopping/notebook/f88f6e9e.svg @@ -1,48 +1,48 @@ - + - + - + - + - + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git 
a/dev/common_workflows/early_stopping/notebook/index.html b/dev/common_workflows/early_stopping/notebook/index.html index fcf67e5e..9c64170a 100644 --- a/dev/common_workflows/early_stopping/notebook/index.html +++ b/dev/common_workflows/early_stopping/notebook/index.html @@ -1,10 +1,10 @@ -Early Stopping · MLJFlux

Early Stopping with MLJFlux

This demonstration is available as a Jupyter notebook or julia script here.

In this workflow example, we learn how MLJFlux enables us to easily use early stopping when training MLJFlux models.

Julia version is assumed to be 1.10.*

Basic Imports

using MLJ               # Has MLJFlux models
+Early Stopping · MLJFlux

Early Stopping with MLJ

This demonstration is available as a Jupyter notebook or julia script here.

In this workflow example, we learn how MLJFlux enables us to easily use early stopping when training MLJFlux models.

Julia version is assumed to be 1.10.*

Basic Imports

using MLJ               # Has MLJFlux models
 using Flux              # For more flexibility
 import RDatasets        # Dataset source
 using Plots             # To visualize training
 import Optimisers       # native Flux.jl optimisers no longer supported

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
-y, X = unpack(iris, ==(:Species), colname -> true, rng=123);
+y, X = unpack(iris, ==(:Species), rng=123);
 X = Float32.(X);      # To be compatible with type of network parameters

Instantiating the model

Now let's construct our model. This follows a similar setup

to the one followed in the Quick Start.

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
 
 clf = NeuralNetworkClassifier(
@@ -40,8 +40,8 @@
  EarlyStopping.TimeLimit(Dates.Millisecond(1800000))
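The code defining these stopping criteria falls outside this diff hunk; a vector like the one whose display ends above is typically assembled along the following lines (a sketch reconstructed from that display, so the exact constructor forms, in particular TimeLimit's keyword argument, are assumptions):

import Dates

stop_conditions = [
    Step(1),                         # compute an out-of-sample loss every epoch
    NumberLimit(100),                # never exceed 100 iterations
    Patience(5),                     # stop after 5 consecutive increases in the loss
    NumberSinceBest(9),              # stop 9 iterations after the best loss seen so far
    TimeLimit(t=Dates.Minute(30)),   # 30 minutes, displayed above as 1800000 ms
]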

We can also define callbacks. Here we want to store the validation loss for each iteration:

validation_losses = []
 callbacks = [
     WithLossDo(loss->push!(validation_losses, loss)),
-]
1-element Vector{IterationControl.WithLossDo{Main.var"#3#4"}}:
- IterationControl.WithLossDo{Main.var"#3#4"}(Main.var"#3#4"(), false, nothing)

Construct the iterated model and pass to it the stop_conditions and the callbacks:

iterated_model = IteratedModel(
+]
1-element Vector{IterationControl.WithLossDo{Main.var"#1#2"}}:
+ IterationControl.WithLossDo{Main.var"#1#2"}(Main.var"#1#2"(), false, nothing)

Construct the iterated model and pass to it the stop_conditions and the callbacks:

iterated_model = IteratedModel(
     model=clf,
     resampling=Holdout(fraction_train=0.7); # loss and stopping are based on out-of-sample
     measures=log_loss,
@@ -56,4 +56,4 @@
 [ Info: final training loss: 0.045833383
 [ Info: Stop triggered by EarlyStopping.NumberLimit(100) stopping criterion.
 [ Info: Total of 100 iterations.

Results

We can see that the model converged after 100 iterations.

plot(training_losses, label="Training Loss", linewidth=2)
-plot!(validation_losses, label="Validation Loss", linewidth=2, size=(800,400))
Example block output

This page was generated using Literate.jl.

+plot!(validation_losses, label="Validation Loss", linewidth=2, size=(800,400))
Example block output

This page was generated using Literate.jl.

diff --git a/dev/common_workflows/hyperparameter_tuning/README/index.html b/dev/common_workflows/hyperparameter_tuning/README/index.html index 241c45a3..6726abbe 100644 --- a/dev/common_workflows/hyperparameter_tuning/README/index.html +++ b/dev/common_workflows/hyperparameter_tuning/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/common_workflows/hyperparameter_tuning/notebook.ipynb b/dev/common_workflows/hyperparameter_tuning/notebook.ipynb new file mode 100644 index 00000000..18a49f77 --- /dev/null +++ b/dev/common_workflows/hyperparameter_tuning/notebook.ipynb @@ -0,0 +1,444 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "source": [ + "# Hyperparameter Tuning with MLJFlux" + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "This demonstration is available as a Jupyter notebook or julia script\n", + "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/hyperparameter_tuning)." + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "In this workflow example we learn how to tune different hyperparameters of MLJFlux\n", + "models with emphasis on training hyperparameters." + ], + "metadata": {} + }, + { + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " Activating project at `~/GoogleDrive/Julia/MLJ/MLJFlux/docs/src/common_workflows/hyperparameter_tuning`\n" + ] + } + ], + "cell_type": "code", + "source": [ + "using Pkg\n", + "Pkg.activate(@__DIR__);\n", + "Pkg.instantiate();" + ], + "metadata": {}, + "execution_count": 1 + }, + { + "cell_type": "markdown", + "source": [ + "**Julia version** is assumed to be 1.10.*" + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "### Basic Imports" + ], + "metadata": {} + }, + { + "outputs": [], + "cell_type": "code", + "source": [ + "using MLJ # Has MLJFlux models\n", + "using Flux # For more flexibility\n", + "import RDatasets # Dataset source\n", + "using Plots # To plot tuning results\n", + "import Optimisers # native Flux.jl optimisers no longer supported" + ], + "metadata": {}, + "execution_count": 2 + }, + { + "cell_type": "markdown", + "source": [ + "### Loading and Splitting the Data" + ], + "metadata": {} + }, + { + "outputs": [], + "cell_type": "code", + "source": [ + "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", + "X = Float32.(X); # To be compatible with type of network network parameters" + ], + "metadata": {}, + "execution_count": 3 + }, + { + "cell_type": "markdown", + "source": [ + "### Instantiating the model" + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "Now let's construct our model. This follows a similar setup the one followed in the\n", + "[Quick Start](../../index.md#Quick-Start)." + ], + "metadata": {} + }, + { + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[ Info: For silent loading, specify `verbosity=0`. 
\n", + "import MLJFlux ✔\n" + ] + }, + { + "output_type": "execute_result", + "data": { + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" + }, + "metadata": {}, + "execution_count": 4 + } + ], + "cell_type": "code", + "source": [ + "NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n", + "clf = NeuralNetworkClassifier(\n", + " builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n", + " optimiser=Optimisers.Adam(0.01),\n", + " batch_size=8,\n", + " epochs=10,\n", + " rng=42,\n", + ")" + ], + "metadata": {}, + "execution_count": 4 + }, + { + "cell_type": "markdown", + "source": [ + "### Hyperparameter Tuning Example" + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "Let's tune the batch size and the learning rate. We will use grid search and 5-fold\n", + "cross-validation." + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "We start by defining the hyperparameter ranges" + ], + "metadata": {} + }, + { + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": "NominalRange(optimiser = Adam(0.0001, (0.9, 0.999), 1.0e-8), Adam(0.00215443, (0.9, 0.999), 1.0e-8), Adam(0.0464159, (0.9, 0.999), 1.0e-8), ...)" + }, + "metadata": {}, + "execution_count": 5 + } + ], + "cell_type": "code", + "source": [ + "r1 = range(clf, :batch_size, lower=1, upper=64)\n", + "etas = [10^x for x in range(-4, stop=0, length=4)]\n", + "optimisers = [Optimisers.Adam(eta) for eta in etas]\n", + "r2 = range(clf, :optimiser, values=optimisers)" + ], + "metadata": {}, + "execution_count": 5 + }, + { + "cell_type": "markdown", + "source": [ + "Then passing the ranges along with the model and other arguments to the `TunedModel`\n", + "constructor." + ], + "metadata": {} + }, + { + "outputs": [], + "cell_type": "code", + "source": [ + "tuned_model = TunedModel(\n", + " model=clf,\n", + " tuning=Grid(goal=25),\n", + " resampling=CV(nfolds=5, rng=42),\n", + " range=[r1, r2],\n", + " measure=cross_entropy,\n", + ");" + ], + "metadata": {}, + "execution_count": 6 + }, + { + "cell_type": "markdown", + "source": [ + "Then wrapping our tuned model in a machine and fitting it." 
+ ], + "metadata": {} + }, + { + "outputs": [], + "cell_type": "code", + "source": [ + "mach = machine(tuned_model, X, y);\n", + "fit!(mach, verbosity=0);" + ], + "metadata": {}, + "execution_count": 7 + }, + { + "cell_type": "markdown", + "source": [ + "Let's check out the best performing model:" + ], + "metadata": {} + }, + { + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.0464159, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 1, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" + }, + "metadata": {}, + "execution_count": 8 + } + ], + "cell_type": "code", + "source": [ + "fitted_params(mach).best_model" + ], + "metadata": {}, + "execution_count": 8 + }, + { + "cell_type": "markdown", + "source": [ + "### Learning Curves" + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "With learning curves, it's possible to center our focus on the effects of a single\n", + "hyperparameter of the model" + ], + "metadata": {} + }, + { + "cell_type": "markdown", + "source": [ + "First define the range and wrap it in a learning curve" + ], + "metadata": {} + }, + { + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[ Info: Training machine(ProbabilisticTunedModel(model = NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …), …), …).\n", + "[ Info: Attempting to evaluate 25 models.\n", + "\rEvaluating over 25 metamodels: 0%[> ] ETA: N/A\u001b[K\rEvaluating over 25 metamodels: 4%[=> ] ETA: 0:00:03\u001b[K\rEvaluating over 25 metamodels: 8%[==> ] ETA: 0:00:02\u001b[K\rEvaluating over 25 metamodels: 12%[===> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 16%[====> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 20%[=====> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 24%[======> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 28%[=======> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 32%[========> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 36%[=========> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 40%[==========> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 44%[===========> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 48%[============> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 52%[=============> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 56%[==============> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 60%[===============> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 64%[================> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 68%[=================> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 72%[==================> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 76%[===================> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 80%[====================> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 84%[=====================> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 88%[======================> ] ETA: 0:00:01\u001b[K\rEvaluating over 25 metamodels: 92%[=======================> ] ETA: 0:00:00\u001b[K\rEvaluating over 25 metamodels: 96%[========================>] ETA: 0:00:00\u001b[K\rEvaluating over 25 metamodels: 100%[=========================] Time: 
0:00:06\u001b[K\n" + ] + }, + { + "output_type": "execute_result", + "data": { + "text/plain": "(parameter_name = \"epochs\",\n parameter_scale = :log10,\n parameter_values = [1, 2, 3, 4, 5, 6, 7, 9, 11, 13 … 39, 46, 56, 67, 80, 96, 116, 139, 167, 200],\n measurements = [0.9231712033780419, 0.7672938542047157, 0.6736075721456418, 0.6064130950372606, 0.5595521804926612, 0.5270759259385482, 0.5048969423979114, 0.47993815474701584, 0.46130985568830307, 0.4449225600160762 … 0.1621185148276446, 0.12283639917434747, 0.09543014842693512, 0.07850181447968614, 0.06950203807005066, 0.063248279208185, 0.060053521895940286, 0.05921442672620914, 0.05921052970422136, 0.060379476300399186],)" + }, + "metadata": {}, + "execution_count": 9 + } + ], + "cell_type": "code", + "source": [ + "r = range(clf, :epochs, lower=1, upper=200, scale=:log10)\n", + "curve = learning_curve(\n", + " clf,\n", + " X,\n", + " y,\n", + " range=r,\n", + " resampling=CV(nfolds=4, rng=42),\n", + " measure=cross_entropy,\n", + ")" + ], + "metadata": {}, + "execution_count": 9 + }, + { + "cell_type": "markdown", + "source": [ + "Then plot the curve" + ], + "metadata": {} + }, + { + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": "Plot{Plots.GRBackend() n=1}", + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlgAAAGQCAIAAAD9V4nPAAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nO3dZ0AU194G8DNl6SAgHQEBwQqoiCgqIE0EMfaaaIyKJYkmlsDNm6j3em8ssSSaxIhKLLFrbKiIWLBhw65oFLtIV0D6zM77YQ1BXBQNu7Pl+X3amR2GPzrw7Jk5hRIEgQAAAGgrWuwCAAAAxIQgBAAArYYgBAAArYYgBAAArYYgBAAArYYgBAAArYYgBAAArYYgBAAArYYgBAAArYYgBAAAraYGQbhnz54jR47U82CpVMrzvELrAai/qqoqsUsA+BsuSLnUIAhPnjx59uzZeh7M8zz+p0F1lJeXi10CwN9wQcqlBkEIAACgOAhCAADQaghCAADQaghCAADQaghCAADQaghCAADQaghCAADQaghCAADQapoWhDsekMBEOg9jRgEAoH5YsQtoYH2cyJUCodNubl8Pxr0RJXY5AAAie/z4cWpqqux1WVmZvr6+uPW8N2tra39/f0WcWdOCkCLkG0/B1ZT2T+C2h7BdrJGFAKDV4uLiduzY0bJlS0KIIAgUpZZ/FSsqKs6cOZOVlaWIk2taEMqMcqftDam+B7lfujADnDXt9i8AQP0JgjB48OBvvvlG7EL+kZycHA8PDwWdXGNDIsye2hfOTk6V/nxDKnYtAACgujQ2CAkhHSyoE1HMTzekk1N5qSB2NQAAoJI0OQgJIc7G1Mko9mK+MPgwX45lCgEA4DUaHoSEEHNdcrAny1AkIpF7Xil2NQAAoGI0PwgJIboM2RjEdLCkuu7hHr7ATVIAAPibVgQhIYQiZH5HZmxzusse/mI+shAAAF7SliCUmdyG/rEz3WM/l/gYWQgAoFoqKyuvXr368OFDJX9f7QpCQki/pvSuMHb0Mf73OxhWAQCgKqZNm2ZsbOzr6zt//nwlf2utC0JCSGcr6mAE822adNYFdCQFAFAJY8eOzcrKGj16tPK/tTYGISGklSmV2pvd81AYfYzn0DIEAFCWuLi4nj17Vm8OGzbsu+++I4Q0b97czMxMlJI0c4q1+rDRJymR7ODDXOQBblsIaywRuyAAAKX4s1C4XKCkfhKmOlSo/Suzmw4bNiwmJiYjI8PV1TU/P3/Pnj0LFixQTjF10d4gJIQYSciuUPbTU3zwPm5PGGutrnOyAwC8g/Tnwta7SgpCByMh1J6pucfIyGj48OG//fbbf//739WrV/fo0cPOzk45xdRFq4OQEMLSZHlXZt5lqd9ubl840xwrNwGApvvAif7AScwCPv300+Dg4JkzZ8bHx//www9ilkII0dpnhLXEeNHftqMDErgTWRhWAQCgWC1btnRzc5s+fXp5eXlwcLDY5Wh9i7Dax+50E0OqXzL3kx8zyAWfDwAAFGj8+PHDhg2bM2cOTb/8e3v48OGkpKRTp04RQmJjY8PCwoKCgpRTDILwbyH2VHIEG5XE55STz1ohCwEAFMXf319HR2fUqFHVe/T09MzMzAYNGlS9qbRiEISv8DSnUiKZiAP87UJhcSeGxhNDAICGxnHcnDlzBg8ebG1tXb3Tz8/Pz89PlHrQ7qmtqTF1Moq9lC8MwspNAAANraKiwtra+tKlS3PmzBG7lpcQhHKY6ZKknqyEJsH7uPwKsasBANAgurq6+fn5J06csLe3F7uWlxCE8ukyZEN3JtSe8sfKTQAAGg1BWCeKkFntmegWtN8e/kIeshAAQDMhCN9icht6aWc6PJHb/whZCACggRCEb9e3Kb07jP3kGLfiJubnBgDQNBg+US+drKjjUWzEAf5JqTCrPfP2LwAAUBnnz5+Pi4sTu4p/pLi4WHEnRxDWVzMT6lgvttcB7uELfnlXRoK2NACog9DQ0JycnLS0NEJIVVWVRKKuS+1MmjRJQWdGEL4DG32S0osdfIiLPMBtx8pNAKAO/P39/f39Za+Li4uNjY3FrUcFoV3zbgxZsiuMdTWhuu7hnpSg+wwAgNpDEL4zhiLLujDDXOluCfzN58hCAAD1hiB8TzFe9Mz2dOBe7jhWbgIAUGcIwvc30o1e350dcIjbfBfDKgAA1BWC8B8JtqOSe7LTz0jnX0EWAgCoJQThP+VhTqX2ZjbckU5O5aW4SwoAoG4QhA3A3pA6EsleLhAGHuLLOLGrAQCAd4EgbBhmuuRAOKvHkOB9XF652NUAAEC9IQgbjC5Dfu/OhDWhAhK4B1i5CQBATSAIG5Js5abxLekue/g0rNwEAKAOEI
QN7/PW9E9+dM9Ebh9WbgIAUHkKDEKe5+/du1dYWFjXAYIgZGZmZmZmCoKmBUYfJ3pPGDvmOLccKzcBAKg2RQXh9evX3dzcIiIimjZtunDhwtcPyMjI8PT0DAwMDAwM9PT0zMjIUFAlYvG1oo73Yhdelcae4zUt5wEANIiignDy5MkjR45MT08/e/bsrFmz7t69W+uAWbNmdejQ4c8///zzzz99fHxmzpypoEpE5GpCpfZmj2cJnxzjq9AyBABQSQoJwqysrCNHjkycOJEQ4ubmFhQUtHnz5lrHlJSUuLm5yV67u7uXlpYqohLRNdYlST3ZvHIh4gBXVCV2NQAA8BqFrEf48OFDExMTS0tL2aabm9v9+/drHfPtt99+8sknUqmUELJ9+/b4+Pi6zlZRUfHkyRPZqpKEEH19/VatWimibAUxZMnOUPbzU3zXPdy+HkwTQ0rsigAA4G8KCcLi4mJ9ff3qTQMDgydPntQ6huM4hmHu3btHURRN01VVdTaX7ty5c+HChRMnTsg2aZresmWLlZWV3IOrqqp4nuc4lZvfZb4X+eUW03W3dHtAVXMTPDTUFiUlJRSFjz6gKrTwgtTT02PZtySdQoLQ2tr6+fPn1ZsFBQXW1ta1jhk7duynn346duxYQsjKlSujo6MvXbok92ytW7fu2rVrTExMfb61LAj19PTet3YF+sqb2DaSRh6htgSz/jbadS1qLUEQjIyMxK4C4CVckHIp5Bmhi4uLRCK5fPmybPPMmTPt2rWrdUxOTo6Tk5PsddOmTXNychRRiar5qBm9oTs78BC3KQOdZwAAVIJCgtDAwGDUqFGTJ08+d+7c3LlzHz9+PHDgQEJIamqqt7e37JiIiIiZM2eePn36zJkzM2bMiIiIUEQlKijIjjoUwcack866wItdCwAAKObWKCFk3rx5//vf/7788ksHB4dDhw4ZGBgQQkxMTNq3by87YMmSJYsWLZoxYwYhJCIiYsqUKQqqRAW1MaNSezORB/hnFfziTgyNu6QAAOKhVH9Wl9jYWDMzMw14RlhLcRUZkMwZSqj1gYy+oj6QgMiKi4uNjY3FrgLgJVyQcmGuUdEYS8ieHqwBS4L2cblYuQkAQCQIQjHp0GRdINOjCeW3m7tdqOpNcwAAjYQgFJls5aYYL7r7PqzcBAAgAgShShjTnP7Zj+6ZyCU8RBYCACgVglBVfOBEJ/Rgo09wv6ZjiCEAgPIgCFVIR0vqRBS7+Jp0cipWbgIAUBIEoWpxMaaO92JP5whjjvEcWoYAAIqHIFQ5VvrkcCSbVSb0S+ZLVW7ycAAATYMgVEWGLNkVylrpk6B9XB6GGAIAKBKCUEWxNFnRjQlvQgUkcI9K8MQQAEBREISqSzbEMLoF7bebv1qALAQAUAgEoaqb3Iae60OH7udOZiMLAQAaHoJQDQxvRq/vzg5I5vY+QhYCADQwBKF6CLajdoexY49zK29hUAUAQENCEKoNH0vqeC923mWs6AsA0JAQhOrE1YQ61ovd9UCYnMpLcZcUAKAhIAjVjK0BORLJXsoXPjrKV+IuKQDAP4YgVD+mOiSpJ1spJT0TuaIqsasBAFBzCEK1pMuQTUGMmwkVvJfLKRO7GgAAdYYgVFcMRX7tygxwpjvv5u4U4YEhAMB7QhCqtxgveooHHZDAX8pHFgIAvA8Eodr7tBX9cxc6bD938AmyEADgnSEINUEfJ3pLMPvhUW7LXXQkBQB4NwhCDRFoSx2KYKeflf6ajiwEAHgHCELN0caMOhrJLL4mjT2HqWcAAOoLQahRnI2pY73Y5CfCJ8d4Di1DAIB6QBBqGmt9cjSSzSwVBh7iyzixqwEAUHkIQg1kJCG7w1g9lvQ8wBVWil0NAIBqQxBqJh2abOjOdLCguu7hnpRgWAUAQJ0QhBqLImSBLzOmOd0tgb9ViCwEAJCPFbsAUKzJbWgzXRKQwO0IZTtbUWKXAwCgctAi1Hwj3OgV3ZgPkrjEx2gXAgDUhiDUClGO9J4w9pNj3MYMDKoAAHgFglBb+FpRyRFs7Dnp91eQhQAAf0MQapFWplRqb2bdHenkVB43SQEAZBCE2sXOgEqJZNPyhI9T+Cq0DAEAEIRayEyXJPVk88uFfslcKaaeAQCthyDURgYs2RXG2uhT3fdyeeViVwMAICoEoZZiKBLXjenpQPkncA9f4IkhAGgvBKH2ogiZ1Z4Z35L228NfKUAWAoCWQhBqu0mt6Xk+dOh+7kQWshAAtBGCEMjwZvTaALb/IW7fI2QhAGgdBCEQQkiPJtTeHuzoY1z8nxhUAQDaBZNuw0sdLKgjkWx4Il9QQaZ54BMSAGgL/L2Dv7UwpVJ7s2tvS2PPYeoZANAWCEJ4ha0BSYlkj2cJo1J4DndJAUALIAihNtnUMznlwoBDfBmmngEATYcgBDkMWbI7lDXTJT0PcIWVYlcDAKBICEKQj6VJvD/TwYLqlsBlluKJIQBoLAQh1IkiZIEvM9yV7raHv1OELAQAzYQghLeI8aK/bUd338tfykcWAoAGQhDC233sTi/1o3smcscxDRsAaBwEIdRLHyd6QxDbP5n74z4GVQCARpEfhCtWrMjLy1NyKaDiuttS+8PZz07xq24hCwFAc8gPwhkzZjRp0mT48OEpKSlKLghUmbcFdSKKnXtZOv8KshAANIT8IExLS/v3v/996tSpwMDA5s2bz5s3Dw1EkHExpo71YjfckU5OxTRsAKAJ5AehnZ1dTExMRkbGwYMHvby8vvnmmyZNmgwaNCg5OVkQ8NdP29kakCORbFqe8DGmYQMA9femzjI0TYeEhGzZsuXevXtjxozZunVraGioh4fHqlWrKisx3YhWk03Dllcu9EvGNGwAoN7e0mtUEIQjR45Mnz595cqVRkZGY8aMcXJyio6ODggIqKioUE6JoJoMWLIrlLXUI+GJmIYNANRYnUGYm5u7YMGCFi1aBAUFXblyZcGCBY8fP16xYsXevXtPnTp14cKFpKQkZRYKKoilyUp/pqMl1XUP96QE98wBQC3JX5h39OjR69evFwShX79+cXFxAQEBNd/19fV1dnbOyclRSoWg0ihCvvdlLPSk3RL4A+GMWyNK7IoAAN6N/CC8du3ajBkzRo8ebW1tLfeA3377zdHRUZGFgTqJ8aJtDEj3fXxCGNO2MbIQANSJ/CA8ffo0Rb3pz1nnzp0VUw+oq5FutKkO6ZnIbQ5m/W2QhQCgNuQHIUVRgiCcPHny4sWLmZmZVlZWHh4e3bt3ZxhGyfWBGvnAiW6kQw08xP3ix/R3xux9AKAe5Adhbm7ugAEDjh07RghhWZbjOEJI27Ztd+zY0bRp0/qct7Ky8ocffkhNTXVycoqNjbWxsXn9mKysrMWLF//5559WVlaff/55mzZt3v/nANUQaEsl92QjD/AFFWRsC2QhAKgB+X+qPv7440uXLi1fvjwvL6+qqur58+cbNmzIzs7u169fPQfUT5s2bffu3dHR0eXl5WFhYVJp7XHXWVlZHTt2LCoqGjlyZPv27XNzc//pjwKqwcOcOtqL+f6qdNYFXuxaAADej
no92J4/f25ubr527doPP/yw5v4jR44EBQWlp6e3aNHizSctLCy0s7O7cOFC8+bNBUFo2rRpXFxcjx49ah4zceLE4uLidevWvbXE2NhYMzOzmJiY+vw8VVVVPM/r6enV52BQnMxSoWci72dNLenMSLS4ZVhcXGxsbCx2FQAv4YKUS86fqKqqKkEQfHx8au2X7anPOPpr164ZGRk1b96cEEJRVJcuXc6ePVvrmJSUlNDQ0Llz537xxRf79+9/z/JBVdkZUCej2KxSErSPyykTuxoAgLrJeUZoaWnZpk2b5ORkWZJVS05OtrS0bNmy5VtPmp2dbW5uXr3ZuHHjrKysWsfcv39/5syZkyZNatOmzZgxY2bOnBkdHS33bLdu3bp69eqBAwdkmwzDrFixwsLCQu7Bshah7KEmiG5tZzLnGtNhp7ChS2Vbc20ccV9SUvLmDtgAyqSFF6Senh7Lyu8NU03+20uXLh02bJjsoaCNjU1eXt6+fft+/PHHZcuWlZSUlJSUEEIMDAx0dXXlfrmBgUHNhmN5eXnNXJTR19cfMmTIl19+SQjR1dWdO3duXUHo4OBgYWExePBg2SbDMI6OjjQt/3Ybbo2qmu86ES8raf/jOr92Yfo21bqbpIIgGBkZiV0FwEu4IOWSH4RDhgzJzs6ePXv27Nmza+4fMGBA9esVK1aMGTNG7pc7ODhkZWWVl5fLAun+/fuenp61jnF0dHRwcKh+/YbOMgYGBvb29iEhIfX4cUAVDXahmzei+hzkLxcIM9sz2vVxFABUnvwg/P7778vK3vJgp2vXrnW91bp1a2dn540bN44aNerOnTunTp367bffCCH3798/ffr0kCFDCCFDhgxJTEycMGECRVF79+59/ZEkaJK2janU3mzfZC79Of+bP2PwlhsVAADKI/8P0kcfffQPz7tkyZKhQ4euXbv2+vXrsvXuCSHnz5+fOnWqLAgnTpyYkJDg5eVlYmKSm5u7e/fuf/gdQcXZGpCUSHbcCb7LHm5XKONohJYhAKiEt3wyz8vLe/z4sa2tbV2TjtYlODg4IyPj+vXrjo6OdnZ2sp29e/cOCgqSvTYyMjp27Nj169elUmmLFi0kEsl7VA/qRZchqwOYH69Ju+zhtwUzvlbIQgAQX52dF1asWOHg4GBpadmuXTsbGxsrK6v58+e/Pi7+DYyNjTt16lSdgoQQHR2dWr1mWrdu7eHhgRTUKpPb0Cu6MR8c5NbcxvL2ACA++S3CZcuWTZw40dvbe/LkybJeo7t27YqJiSkuLq7VfQbgPYQ3oY71Ynsn8RfyhEWdGPSfAQARyZlZRiqV2tvbh4WFrV69uuaIk6+//nrx4sV5eXmGhobKLBEzy2iqggoy6BAnocmmILaRjtjVKAYm8gCVggtSLjm3RnNycrKysiZNmlRr3OWkSZPKy8tv3bqlrNpAw5nrksRw1qsx1XEXd6tQG4fbA4AqkBOEOjo6hJCioqJa+2V76hpED/AeWJrM9WGmedCBCdyhTGQhAIhAThCam5t7eXlNnz49Ozu7emdRUdEXX3xha2v71hm3Ad7V2Bb0thB2xFF+3mV0nwEAZatzirWwsDBnZ+fAwEBbW9ucnJzjx4+/ePFiy5YtWJsXFKGLNXX6A6bPQT6jWPjJj9HRurnYAEA08v/edOvWLS0tbdCgQbdu3dqxY8eVK1fCw8NPnTrVr18/JdcH2sPBkEqJZPPKSdBeLFgBAMojp0VYWlq6dOnSqKio1atXK70e0GpGErI9hPn3Bb7DTm5nKNPeAuMqAEDh5LQIi4qKYmNji4uLlV8NAEXIrPbMvI50zwPcpgw8MgQAhZMThFZWVtbW1vfu3VN+NQAyQ13pxHD2Pxelw4/wzyvFrgYANJqcIKRpeu7cud9+++3169eVXxCATLvG1IW+rIMRafsHd+QpRlYAgKLI7zW6b9++oqIiT09PV1fXJk2a1OwpevDgQWXVBtpOjyFzfZgQO2HkUb6nA7W4E9ZvAoCGV2cvdU9Pz6CgICcnJ4yXAHGF2FNX+rMlHOm4i7uYj6YhADQw+R+wt2zZouQ6AN7AVIf8HshsvScNT+QmtKS/bYd5ugGgwchvEa5bt67mtDIy2dnZcXFxii8JQL6BzvS5D9iUp4J/ApdRhKYhADQM+UE4ffr0jIyMWjvv3r07btw4xZcEUCdHI+pwJDvIme68m4u7icEVANAA3mEmqxcvXhgZGSmuFID6oAiZ3IY+EskuS5cOPMTnV4hdEACouVeeEV69ejU1NZUQUlZWtmvXrmvXrlW/VV5evmHDBsy4DSqitRmV2puddYH33M6t6MZEOOCZIQC8p1eCMDk5ecqUKbLX8+fPr3Wou7s7nhGC6pANrgizF0Yd48ObYHAFALynV26NTpgwoaCgoKCgwNLSMjExsaCGsrKyW7duBQQEiFUogFxBdtSVfmwZRzrs5C7koQcNALyzVz5C6+np6enpEULOnTtnbW0tew2g4hrpkLWBzIYMac8D3GetmGketD6ahgBQb/I7yzg5OSEFQb0Mc6XP92GvFggtt3EbM6RoGwJAPckPwvz8/M8//1w2uRr1KiXXB1B/DobUlmBmSzDz0w2p7y7ueBbSEADeTv4tpEGDBp08eXLo0KHu7u40jcXCQZ10tKRORLHb7klHpPBtzMiPnRkXY3yAA4A6yQnCsrKylJSUZcuWjR07VvkFAfxzFCEDneleDvSS61LfXdwwV/o/3kwjHbHLAgCVJKe1V1JSwvN8hw4dlF8NQAPSZ0mMF50+QEIIcd9a9eM1KYe5aADgNXKC0MLCon379rKR9QDqzkKP/NiZORLJHngi9fiD2/sIDw4B4BXynxEuWbJkxIgRenp6PXr0MDAwqPmWmZmZUgoDaEitTKl9PdjkJ8IXp/kfrpFFvoyHOR4cAgAhdQVh//79s7OzR48e/fpbgoAP1KCuQuypi33ZX25IQ/ZzA5zpf3nRTQwRhwDaTn4QfvPNNyUlJUouBUAJJDSZ3Ib+yI3+7hLf9g/O14oa3ZyOcqQl6BwNoK3kB+Fnn32m5DoAlMlclyzwZf7Xgdn9ULosXTr2OD/Amf60Fe2J+6UA2ucdPgbzPF9aWqq4UgCUTJchA53pgz3ZS/1YF2MqKonvsJOLuyl9USV2ZQCgRK8EYceOHX/66SfZa0EQhg0bVrPv6ObNmw0NDZVaHYBSOBhSMV70vcHsXB8m+YngtKlqxFE++QkehwNohVeCMDs7u7i4WPZaEISNGzc+ePBAjKoAREBTJMSe2hLMXB8gaW1GTTjJt9rGzbsszSsXuzIAUCT0EACozUafxHjRtwex6wKZu8WC+9aqQYf45CfoMA2gmRCEAHXytqCWd2XuDZaE2FMx53jHjVzsOf7BCwQigEZBEAK8RSMdEt2CTuvDJoYzhBCfnVzofm7rPWkVJmwD0AgIQoD6am1GzfVhHg6VRLeg425KnTZVTU7lrz1DAxFAvVE1Z4pxcnJ69OhR9aYgCLUWIBQEQfkzy8TGxpqZmcXExNTn4KqqKp7nsaowKMGfhcKqW9K1t6UuJtRH
zehBLrS5bu1jiouLjY2NxagOQA5ckHK9MqB+yJAheXl5YpUCoF7cG1HzOjLf+TBHMoW1t6X/d77Kz5oa6EwPdKb15c9UAQCq6JXf13nz5olVB4CaYigSYk+F2DNFVczO+9Kt96RTTvMRDvQINzrYHvPUAKgBfHAFaBgmEjLCjR7hRj8pEbbdE6af5QsqSH8HdkIbwa0REhFAdaGzDEADszekJrehL/Zl9/VgCCH+CVyHndyP16S5GJgPoJIQhACK0tqM+rcn93iYZK4Pk5YntNhaFZXEbb0nrcS4CwBVglujAIpV/RCxsJLZfl/68w3pxJP8IBf6w2Z0ZyvcMgUQH4IQQEka6ZBP3OlP3OmHL4Tf7wijj/ECIZ+40x+50Tb6YhcHoMVwaxRA2RyNqK/b0jcGsL8HMneKhJZbqzBVDYCI5AfhyZMnqxdgKisrmzZtmr+//5QpU7AeIUADks1l+nT4K1PVXC3AVDUASiU/CD/88MPz58/LXs+YMWPx4sUURcXHx48dO1aJtQFoBb2/1gc+Gsma6ZJeWB8YQLnkBOGLFy/u37/fpUsXQgjP82vWrPniiy9SUlK2bdu2efPmwsJCpRcJoBXcG1Gz2jPV6wPbb8DyTwDKIKezTFFRESGkcePGhJALFy7k5uYOHDiQENKtWzee5+/fv+/l5aXkKgG0B/1XL9OCCmbbPem0M3w5T0a50x+709boUwOgAHJahJaWljRN3759mxCybdu2Ro0aeXt7E0Jki9czDKPkEgG0k7kuiW5BX+rHxvszt4uEVtuqRqZgsQuAhienRSiRSCIiIsaNG9e/f/+4uLj+/ftLJBJCyJUrV2iadnR0VHqRAFrNz5rys2YWd2Lib0nDE/mmRiTGi45yRJdvgIYh/3dp+fLlbdq02bBhQ0BAwJw5c2Q7V69e3bZtWxMTEyWWBwAvGUvI5Db03cFsdAv6q7PSDju5tbelPNqHAP8Ypfz1Bd8V1iME9aWg5d+kAtn7SPrdJWleOfmsFT2uJa2HRxZQD1iPUK563V15+vRpUlLS06dPFV0NANQHTZEoRzq1N/ubP5OcKXXeVDXrAl9YKXZZAOpJfhAOGTJk5syZstfHjh1r1qxZjx49XFxcdu/ercTaAOAtutpQe8LYxJ7s3SLisrlqciqfWarq93gAVI2cIOQ4bufOnbJxhISQ2NhYV1fXlJSUwYMHT5o0ied55VYIAG/hZU6tDWTO9WGrpMRjOzftDF/KiV0TgPqQE4QFBQUVFRUuLi6EkNzc3DNnzsTExPj7+8+ZM+fBgwePHj1SepEA8HYuxtQvXZj0AZLsMtJ2B3cqG01DgHqRE4SywRJVVVWEkP379wuCEBwcTAgxNzcnhOTl5Sm3QgB4B1b6ZF0g80MnZvBhftwJNA0B3k5OEJqZmdnZ2cXHx7948WLlypVt27a1sbEhhDx48IAQYmlpqewaAeAdRThQV/uzZRzpsJM7l4umIcCbyO8sM3v27EWLFhkbG588efLrr7+W7dy7d6+FhQUG1AOoBVMdsjaQ+bc33TuJiz3HV+DhPkAd5C/M+8knn7Rv3/7ixYtt27Zt166dbKednd3SpUspCmtqA6iNgc50gA094STvvZNbE5CqzJYAAB6fSURBVMB4W+D3F6C2Oleob9u2bdu2bWvuGTx48DudOi8v7+bNm66urra2tm84LDMzk6KoNx8DAO/NSp9sD2G23pNGHuA+dqf/483oYHY2gBrq/IV48eJFXFzcxIkTo6KioqOjlyxZ8uzZs/qfd9u2bc2bN//3v//t4eHx66+/1nVYenq6q6vriBEj3q1qAHhHA53pS/0kN58Tn53cxXw8NQT4m/wp1h48eBAcHJyRkdGoUSMbG5vc3NyCggIbG5sDBw54enq+9aRVVVVOTk4rV66MiIi4ePGiv7//48ePGzVqVOswqVQaGBjYpEmT3NzcgwcP1nU2TLEG6ksFZ7Taek/62Sl+lDs925uRoGmoZVTwglQF8n8Pxo8fX1xcvG/fvmfPnt28eTM/P//EiRNGRkYfffRRfU568uRJQRB69uxJCGnXrl2zZs327t37+mGLFy/28fHp1KnTP/kBAOCdDHSmL/WVXH8mdN3D3XyOpiGAvGeEJSUlBw8e3LhxoyzJZLp06bJmzZouXbrcvXtXNtb+DR4+fOjk5FTdraZp06avD8O/d+9efHz8mTNn4uPj33y2kpKSvLy86iYjTdP+/v51LYso/cubzwmgHKp5NVrrkZ0h9Mpbgn8CN6UNPdWDYtCHRjuo5gWpUDT99vsecoKwsLCQ5/kWLVrU2t+yZUtCSH5+/luDsLy8XEdHp3pTV1e3tLS05gFSqXTUqFGLFi0yMjJ6a4lPnjy5cuXK3bt3q8/WrFmzxo0byz1YdmtU2/6nQWWVlZWp7FrWwx1IN3Nqwhl2xz2yvBPXzBitQ82nyhekgujp6bFsnd1CZeS8bWlpaWRktGfPHg8Pj5r79+zZQ9O0s7PzW7+xtbV1fn5+9WZ+fn6tTqFJSUl37969dOnSpUuXTp06df/+/YULF06dOlXu2dzd3X19ffGMENSRIAj1+bQnllZG5GhvsuKmNPQQPc2Dme5J02gaajQVvyDFIn+F+tGjR8+YMSMnJ2fgwIF2dna5ubkJCQkLFy4cMGCAhYXFW0/q7e19586dnJwcKyurioqKs2fPzps3r+YBzs7On376aYP9EADwvihColvQIfbUqBR+z0Pp6gCmmQnCELSL/F6jlZWV48ePX7NmTc17jH369FmzZk09V6gfNmxYfn7+pEmT1q5dm5ube/jwYUJIfHz8b7/9dvz48ZpHLlmyZM+ePeg1ChpJjTrpSQWy8pb0/87z37RlJrVBy1AzqdEFqUzy75zq6OjEx8fPmDHj5MmTz549MzEx6dSpk7u7e/3Pu2rVqkWLFq1evbp58+ZxcXGyna1atRowYECtIzt06GBgYPB+1QNAQ6EpEt2C7mpDfZzC738s3didNdMVuyYApZDTInz69Kmdnd3evXsjIiJEqakWtAhBfanjB3BOSr46yx95KhzsyVrgN0mzqOMFqQRy+pUaGxvTNI0HqgDaiaXJok7MEBe6WwKH9e5BG8gJQiMjo8jIyC1btii/GgBQETFe9MdudNBe/nEJshA0nPxnhCNHjpw4cWJ2dnZUVJStrW3NFSdCQkKUVRsAiCnGi2Zp0i2BPxTBuBij9wxoLPlB+Omnn+bk5Gzbtm3btm213pLbyxQANNJUD9qQJcH7+IM9MawCNJb8IExKSqqqqlJyKQCggsa3pBmKBCTwST2Z1mbIQtBA8oOwPktMAICWGNuCNpSQsP18YjjjYY4sBE3zSmeZ0tLSuLi406dPv37cjRs34uLicnJylFUYAKiQYa704k50yH7ufB4ejoCmeaVFuHTp0u++++7mzZuvH+fk5BQVFXXp0qVffvlFWbUBgAoZ5EIbsFTUAW5nKOtrhXYhaI5XWoRr164dM2ZMrQmyZQwNDadOnfr7779zHKes2gBAtfRypOL92Q8Ocqey0S4EzfF3EJaVlaWnpwc
GBtZ1aGBgYHFx8e3bt5VRFwCopJ4O1KYgtl8ydzgTWQga4u8gLC0tFQThDbPvyOaaefHihTLqAgBVFWhLbQlmhx3hDj5BFoIm+DsITU1NJRJJRkZGXYfK3rKyslJGXQCgwvxtqO0h7PAj3O4HWAQb1N7fQcgwTJcuXVauXMnzvNxDly9f7uTk5OTkpKzaAEB1dbGm9oSxY0/wJ/G8ENTcK51lvvrqq9OnT48YMeL58+c195eVlcXExGzevHn69OnKLQ8AVJevFfV7IDv4MP8I85GCOntl+ETPnj1nz549Y8aMXbt2BQQEODs7Mwzz6NGjlJSUgoKCjz/+eOLEiWIVCgAqKNSe+rIN3TuJPxnFGsifnwNA1dW+cr/55pvOnTvPmzfv8OHD5eXlhBCWZTt27Dh58uRBgwaJUSEAqLSpHnT6c2FECr81mMHoQlBHcj7CBQcHBwcHV1ZW5uTkSKVSKysrrHMLAG/wkx8TuJebe1n6Ly85K7sBqLg672Xo6Og0adJEmaUAgJrSY8jOUNZ3F9fGjEQ5IgtBzeCSBYAGYKNPdoQwY47z156h4wyoGQQhADSM9hbUQl+mdxKfVy52KQDvAkEIAA3mw2Z0v6bU0CMch3H2oD4QhADQkOZ3ZHRp8tVZ+fNyAKggBCEANCSaIhuC2AOPhZW30CoE9YAgBIAGZiIhf4QyX5/jj2eh4wyoAQQhADS85o2odYHs0CP8Y8y+BioPQQgACtGjCTWpNd07iS/FYt6g2hCEAKAoX3nSbRtT0SfQcQZUGoIQABToly7M7ULh+yvoOAOqC0EIAAokm31t6XVpwkM8LAQVhSAEAMWyNSCbg5lRx7jrmH0NVBKCEAAUrrMVtdCX6ZfMP68UuxSA1yAIAUAZRrjREQ7UoEMcj2YhqBgEIQAoyQJfhqVJLGZfAxWDIAQAJWEosj6Q3fVQiP8TnUhBhSAIAUB5zHTJ7lDm63P8mRzcIQVVgSAEAKVqYUqt6MYMPMRnliILQSUgCAFA2aIc6egW9PAjvBRRCCoAQQgAIvi6LU1TZD5mnAEVgCAEABHQFFkbwPxwDQ8LQXwIQgAQh70htawL8+FRvrhK7FJAuyEIAUA0fZvSAbbUl6cxshDEhCAEADH92Jk5kSVsvouHhSAaBCEAiMmQJeu7M5NS+Ycv8LAQxIEgBACReVtQX7ZhPjzKYxpSEAWCEADE95UnLaEJ1u8FUSAIAUB8GE0BIkIQAoBKkI2mGI7RFKB0CEIAUBV9m9KBGE0BSocgBAAVgtEUoHwIQgBQIRhNAcqHIAQA1YLRFKBkCEIAUDkYTQHKhCAEAJWD0RSgTAhCAFBFGE0BSoMgBAAVhdEUoBwIQgBQXRhNAUqAIAQA1WXIkg0YTQEKhiAEAJXWHqMpQMEQhACg6r7ypFmKzL2MG6SgEAhCAFB1NEXWBTK/3JAefIJWITQ8BCEAqAF7Q2prMPPRUe5eMbIQGhiCEADUg5819ZUn0y+ZL+PELgU0C4IQANTGFA/aw4wadwIjC6EhIQgBQJ0s68pczBdW3ETHGWgwCgzCXbt2hYSEBAQErFy58vV3b9++PX369O7duwcHB8+dO7eiokJxlQCAxjBkyR8hzP+d509m42EhNAxWQedNS0sbOXLkmjVrTE1Nhw8fbmpqOmDAgJoHHDt2TFdX99tvv+V5furUqZmZmUuWLFFQMQCgSdwaUav8mWFH+PN9WEs9sasB9UcJgkI+VY0ePdrExGTx4sWEkF9++WXLli1Hjx6t6+AdO3ZMmzYtIyND7ruxsbFmZmYxMTH1+b5VVVU8z+vp4ZcDVEJxcbGxsbHYVWim/zvPp2YLST1ZFk946g0XpFyKuoIuX77s6+sre+3r63vp0qU3HHz16lVXV1cFVQIAGmm2N6PHkv87j44z8E8p6tZoTk6Oqamp7LW5uXlhYWF5ebnchtrFixcXLlx4+PDhuk517dq1U6dO/frrr7JNXV3dvXv3WllZyT1Y1iKsqsLCLaASXrx4IXYJmuzXDlRAkk5ro/K+Dug7Uy9aeEHq6elJJJI3H6OoIDQ2Ni4tLZW9fvHiha6urq6u7uuHpaenR0ZGrly50tvbu65TNW/evHXr1uPGjZNt0jTdtGnTug7GrVFQNbgTpTjGhOzsIYTtp7xt2dZmlNjlqAdckK9TVBA2bdq0+plfRkZG06ZNKar2ZXr79u2wsLD58+cPHDjwDaeSSCRmZmYuLi4KKhUA1JeXObXQl+mXzJ/9gG2kI3Y1oJ4U9Yxw2LBhq1evLikp4Xl+2bJlQ4cOle1funRpeno6IeTBgwdhYWHffPPNhx9+qKAaAEAbfNiM7m5LjUzB6hTwnhQVhIMHD/b29nZ2dnZychIEYcqUKbL9ixYtunHjBiFkxYoV9+/fHz9+PEVRFEUZGBgoqBIA0HhL/Zi8cuH7K3hSCO9DUcMnZPLz86uqqmxsbP7JSTB8AtQXeqsrzdNS4rOLW9WN6dEEDwvrhAtSLsUOwGncuPE/TEEAgPqwNSDrApkRKVieAt4ZRqICgIbobktN92T6Y3kKeEcIQgDQHFM9aLdG1OTTGGUP7wBBCACagyJkVTfmVLaw8hY6zkB9IQgBQKMYSV4uT3EuFw8LoV4QhACgadwbUSu6MgMP8bnlYpcC6gBBCAAaqLcTPdSVGnKYwzB7eCsEIQBopv91YHRo8g2Wp4C3QRACgGaiKbIukN2YIay7g44z8CaKmnQbAEB0FnpkXzgTvp8vriQTW+FzP8iHKwMANFkrU+p4FPPDdWnsOdwjBfkQhACg4ZyMqOO92MRHwqRUdJ0BORCEAKD5rPXJ0V7shTxh5FGewxNDeBWCEAC0gqkOSerJ5pQLAw7x5bhLCjUgCAFAWxiwZHcYq0OTiESuuErsakBlIAgBQIvo0GRjEONqQgXv4/IrxK4GVAOCEAC0C0ORuG5MgA0VkMBllqL3DCAIAUD7UIR878t81IzuuofPKEIWajsEIQBoqRgveronHbCXv1qALNRqmFkGALTXhJZ0Ix0Ssp/bFcp2sqLELgfEgRYhAGi1Ya70qm5s7yQu+QnahVoKQQgA2q6XI7UthB12hPvjPgbbayPcGgUAIP421L5wNuoAV8aR4c3QQtAu+P8GACCEkA4W1OFI9ts06aBD/B10JdUmCEIAgJdamlI3B7JdrCm/3dy4E3x2mdgFgVIgCAEA/qZDk8lt6JsDJWa6pPW2qthzPCZj03gIQgCA2sx1yVwf5kJf9lkFab2Ni7spxZoVGgxBCAAgn6MRtbwr80cosylD6vEHt/UewlAzIQgBAN5E1olmaWfmfxelfru5k9noR6NpEIQAAG8XYk9d6Mt+6UF/dJRHt1INgyAEAKgXmiIDnelr/VlvC3Qr1SgIQgCAd2DAkhgvOv2vbqWzLvBlnNg1wT+DIAQAeGeNdclcHyatL3u3iLhv5eJuSnncK1VbCEIAgPfkZEStDWS2BTPr70hbb+PmXpY+LkEeqh8EIQDAP+JrRaX0Yn8LYB68ENr+wYXu536/Iy3B/VL1gSAEAGgAna2oZV2YJ8Mkk1
rTux8IduurBh3ik58IaCGqPgQhAECD0WVIlCO9JZi5N0QSYk/NusA7buRiz/G3CxGIqgtBCADQ8Mx1SXQL+kQUe6AnQwjxT+A67OR+vCbNKxe7MngNghAAQIFamVJzfZgnwyRzfZi0PMF9a1VUErf1nrQK87WpDCzMCwCgcDRFQuypEHvmeSWz+a70h2vSSaf4fs50R0vK24JqaUoxlNglajEEIQCA8pjqkHEt6HEt6NuFwu6HwoHHwneXpJmlgqc51b4x5W3xMhdZ3K1TIgQhAIAI3BpRUz1eNgOLq8jlfCEtTzicKXx/RXq3WHAxfhmK3haUjyWly4hbrIZDEAIAiMxYQrraUF1t/s7Fi/lCWp5wLk/49ab00QuhjfnLUPQ0p1yMKXNdcevVNAhCAADVYiwh/jaUf41cvJQvpOUJRzKFn29IM4oEihBXE8rFmHIxIS7GlKsJ5WJMHAxxQ/U9IQgBAFSasYR0s6G62fzdneZZBblbLNwtFu4WkbQ8Yes96d0iklkq2Bm8jMbqjGxmQjXSEbF29YAgBABQM2a6xFuX8rZ4padppZTcLxbuFpOMIuFusXA6h2QUS+8WCToMsdCjzHSIqQ4xZnQsDXkzHWKmS5npElPZCx1ipkvMdClTbY1MBCEAgCbQoYl7I8q9ESHklYDMKycFFcKzCvK8kmQ+ryhnJM8qSE6ZcKuQPKsgzyulzyrIs0ryrEIoqiRmusRUhzLTJWa6hBCiz1B6f/XTMdEhsjEeDEVM/opMuQe8k+IqwkmJlJDCSoEQUs4T2bJWRVWElxJOIMVVAiGklCMVPCGEPK8kk1rTk1o35F1gBCEAgCaz0CMWen89bmwkNTauM0KkwstEfF5JnlcQQkgZL5TzL98trCRSgRBCeIEUVb7cWcYLz/56ff/FywPeiZGESGhCEWKmQxFC9BiizxJCiImEMDRhKGIioQkh+iyRJa6pDrExaOBBlwhCAAAghBCaIo11SWPdmjGjFeP80ccIAAC0GoIQAAC0mqYF4a1bt86fPy92FQAv7dq1q7S0VOwqAAghpKioKCEhQewqVJGmBWFiYuKWLVvErgLgpTlz5ty9e1fsKgAIIeTmzZsLFy4UuwpVpGlBCAAA8E4QhAAAoNUQhAAAoNUoQXj3AZDK1bdv3/Pnz1tbW9fn4JycnMrKyiZNmii6KoD6uHHjhouLi56entiFAJDS0tKHDx+2aNFC7EKUaujQoVOnTn3zMWoQhPfu3Xv8+LGBgUF9Di4tLeU4zsTERNFVAdRHdna2lZUVRWnFqGRQcVKpNC8vz8rKSuxClMre3t7GxubNx6hBEAIAACgOnhECAIBWQxACAIBWQxACAIBW0/wgXLJkSVRU1OzZs3mef/vRAIq0fv36L7744qeffhK7EABy586dzz//vEePHuPGjXv48KHY5YhJw4NwzZo1qamp69atKygomDdvntjlgLbLzc21sbFJTEwUuxAAkp2dHRkZ+fvvv/v4+AwePFjscsSk4b1Gw8PDZ82a1alTp0ePHkVGRl65ckXsikDbnThxYu7cuZj7GFRHUVGRu7t7VlaW2IWIRsNbhI8fP5YNrre3t3/y5InY5QAAqJw5c+aMGjVK7CrEpOEr1EskEtmjQZ7nJRKJ2OUAAKiWX3/99erVqzt27BC7EDGpaxA+f/78woULd+7cCQwMdHd3r97/9OnTtWvXvnjxom/fvu3bt3dzc0tPT3dycrp161azZs1ELBg027Nnz86fP3/v3r3Q0FBnZ+fq/Y8ePVq3bl1FRUX//v09PT1FrBC0B8/z6enply5dMjQ07Nu3b/V+QRC2bduWlpbm5uY2YsQIiUQSHx+/Y8eOXbt2aXk7gZk1a5bYNbyPdu3aHTlyZPv27a1bt27btq1sZ0FBQfv27S0tLS0sLKKjo319fTt27Pjtt99aWVnNnj17/Pjxbdq0Ebds0FTu7u5nzpzZtGmTr69vy5YtZTuzsrLatWvn5ORkZGQ0duzYoKCg8+fPJycnp6WlMQxja2trbGwsbtmgkZYsWTJx4sS0tLTU1NTo6Ojq/bGxsStWrOjWrdumTZuSkpKMjIyio6PHjh1748aNtLS09u3ba+1cgOraWYbjOJZlO3XqNGHChJEjR8p2Lliw4ODBgwcOHCCELFq0KDExMSkp6cyZM0ePHvXx8QkKChK1ZNBksguydevWs2fP7tevn2znrFmzrl69un37dkLIf/7zn8uXL3/00Uc5OTmyd3v16mVnZydaxaC5ZFfj77///uOPP547d0628/nz5/b29mlpaS1atCgqKrKzs1u/fn12dnb1V40dO1Zrg1Bdb42yrJzKjx49Gh4eLnsdHh7+r3/9SyqV+vr6+vr6Krc60DpyL8gjR458+OGHstfh4eE//PCDLBQBFEru1XjmzBkrKyvZ0hMmJiadOnXKzMycMGGC0qtTRRrVazQrK8vS0lL22traurKyMi8vT9ySQJvVuiCfPXtWVlYmbkmgtZ4+fVpz3Qlra+unT5+KWI9K0aggZBhGKpXKXnMcRwjR8ifAIK5aFyRFUQzDiFsSaC2WZauvRkIIz/NyG47aSaOC0M7OLjMzU/Y6MzNTX1/f1NRU3JJAm9W6IC0tLXV0dMQtCbSWra1t9dVICHny5AkeUVfTqCCMjIzcsWOH7FPP9u3bIyMjtfbZL6iCyMjI7du3y/qjyS5IsSsC7eXn51deXn7q1ClCyJMnT9LS0qp7VIC69hqdMWNGamrquXPnmjRpYmtr+9133/n4+JSWlnbr1q1Ro0aOjo579+49dOgQRm6BckybNu3y5cupqamurq5WVlY//PBD69atCwsL/fz8HBwcLCwsDh48ePz48ZpjXgEU5MKFCzExMVlZWQ8fPuzYsaOvr+9///tfQsjPP//8v//9r0+fPgcPHuzTp8/3338vdqWqQl2D8PLly7m5udWb7dq1a9y4MSGkoqIiMTGxuLg4NDTU2tpavAJBu6SlpT179qx6s0OHDrLb8qWlpQcOHCgrKwsLC7OwsBCvQNAiBQUFFy5cqN60sLCoHmx95cqVtLS05s2b+/n5iVSdKlLXIAQAAGgQGvWMEAAA4F0hCAEAQKshCAEAQKshCAEAQKshCAEAQKshCAEAQKshCAEAQKshCAE039ChQ0eNGiV2FQAqCrOPA2i+p0+f6uvri10FgIpCixAAALQaWoQAypabm7t58+aMjAxTU9OoqKj27dvL9vM8v2rVqs6dOxsZGW3cuDE/P9/Hx2fQoEE0/fcH1tLS0k2bNl27ds3AwCAkJCQwMLDmmXmeT0hIOHv2bHl5ebNmzXr16uXg4FD9bnl5+fr169PT052dnQcOHFhzmdZr164lJCTk5OSYmJh4enqGhYUZGRkp9l8BQGVgrlEApTp8+HC/fv309fW9vb0fP3585cqV77//furUqYSQiooKPT29wYMHHzx4sG3bthUVFadOnerVq9eOHTtkK/o+fPiwe/fuWVlZfn5+ubm5ly9fHjVq1KpVq2TLjeXm5kZERFy8eLFdu3a2trbXr19v0qRJSkoKI
SQwMFAqlZaVlZWUlFhbW585c6Zx48bXr183MTEhhMTHx48ZM8bT09Pd3T0vLy8tLW3NmjV9+vQR9d8JQIkEAFCWgoICc3PzXr16lZSUyPbMnDmTYZjr168LglBeXk4IoWn68OHDsnfj4+MJIStXrpRtRkREmJiYXLlyRbYpW1tnw4YNsk1ZvqakpFR/uz///FP2IiAggBAyf/582eaZM2coilqwYIFss3nz5iNHjqz+qtLS0vz8/Ib/4QFUFYIQQHl+/fVXQsjt27er91RVVRkYGCxatEj4Kwh79epV/a5UKm3VqlWPHj0EQSguLqYoaurUqdXvVlZW2tvbh4eHC4KQnZ1d692aAgICHB0deZ6v3tOqVavq8HN0dOzXr19paWkD/qQAagTPCAGU58qVKzRNf/nll7LMkxEE4c6dO9Wb7dq1q35NUVTbtm3PnDlDCMnIyBAEofqBIiFEIpF4eXnduHGDEHLjxg1BEDp16lTXt3Zzc6v5rNHCwqJ6Rc/p06dPnjzZ2to6MjIyLCysT58+ZmZmDfDTAqgJBCGA8lRWVrIs27Vr15o7Q0JCPDw8qjdZ9pXfSh0dnYqKCkJIYWGhbLPWuzzPE0I4jnv93ZokEknNTdljRZnPPvvM399/27Zthw4dio6O/uqrr/bt2+fj4/PuPx+AWkIQAiiPq6trZWXlkCFDnJyc6jrm9u3bNTdv3brl6upKCGnatKlss+a76enpsv3NmjUjhFy7dq13797vUZinp6enp+d//vOf+/fvd+rUac6cOX/88cd7nAdAHWEcIYDyDBo0SCKRxMbGyhpwMiUlJc+ePave3L59+/3792WvT506dfr06bCwMEKIo6Ojt7d3XFxcQUGB7N0dO3bcunWrX79+hJCmTZv6+fn9+OOPjx8/rj6VrCn5ZlKpNDMzs3rTycnJ1ta2ZnkAGg8tQgDlcXFxWbZs2fjx469evRoeHq6np3fnzp3ExMRNmzaFh4fLjmnfvr2fn9+wYcMqKirWrFnTqlWryZMny9765ZdfgoODfXx8BgwYkJOTs379+s6dO0+YMEH27sqVK7t37+7l5TVgwABbW9sbN24UFhYeOHDgzSVxHOfk5BQWFubh4WFoaHj8+PGrV6/OmTNHcf8IAKoG4wgBlO3y5curVq26fv26RCJxcHAIDQ3t1auXgYGBbBzhwoULW7ZsGR8fn5+f37Fjx5iYmJpdV27fvr106dKrV68aGhoGBwePHz++5txp2dnZP//889mzZ3med3Z2HjZsmGzE/fLlyyUSySeffFJ95PLly3V0dEaNGiUIwsaNG48dO/bw4UOO45o1azZmzJiaXXIANB6CEEBVVAfhlClTxK4FQIvgGSEAAGg1BCGACjEzM9PT0xO7CgDtglujAACg1dAiBAAArYYgBAAArfb/ks8yH29TyCcAAAAASUVORK5CYII=", + "text/html": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n" + ], + "image/svg+xml": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n" + ] + }, + "metadata": {}, + "execution_count": 10 + } + ], + "cell_type": "code", + "source": [ + "plot(\n", + " curve.parameter_values,\n", + " curve.measurements,\n", + " xlab=curve.parameter_name,\n", + " xscale=curve.parameter_scale,\n", + " ylab = \"Cross Entropy\",\n", + ")" + ], + "metadata": {}, + "execution_count": 10 + }, + { + "cell_type": "markdown", + "source": [ + "---\n", + "\n", + "*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*" + ], + "metadata": {} + } + ], + "nbformat_minor": 3, + "metadata": { + "language_info": { + "file_extension": ".jl", + "mimetype": "application/julia", + "name": "julia", + "version": "1.10.3" + }, + "kernelspec": { + "name": "julia-1.10", + "display_name": "Julia 1.10.3", + "language": "julia" + } + }, + "nbformat": 4 +} diff --git a/dev/common_workflows/hyperparameter_tuning/notebook.jl b/dev/common_workflows/hyperparameter_tuning/notebook.jl index aa39830d..3c85ec16 100644 --- a/dev/common_workflows/hyperparameter_tuning/notebook.jl +++ b/dev/common_workflows/hyperparameter_tuning/notebook.jl @@ -24,7 +24,7 @@ import Optimisers # native Flux.jl optimisers no longer supported # ### Loading and Splitting the Data iris = RDatasets.dataset("datasets", "iris"); -y, X = unpack(iris, ==(:Species), colname -> true, rng=123); +y, X = unpack(iris, ==(:Species), rng=123); X = Float32.(X); # To 
be compatible with type of network network parameters diff --git a/dev/common_workflows/hyperparameter_tuning/notebook.unexecuted.ipynb b/dev/common_workflows/hyperparameter_tuning/notebook.unexecuted.ipynb index 2060f391..bbb6280a 100644 --- a/dev/common_workflows/hyperparameter_tuning/notebook.unexecuted.ipynb +++ b/dev/common_workflows/hyperparameter_tuning/notebook.unexecuted.ipynb @@ -73,7 +73,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X); # To be compatible with type of network network parameters" ], "metadata": {}, diff --git a/dev/common_workflows/hyperparameter_tuning/notebook/c6097af1.svg b/dev/common_workflows/hyperparameter_tuning/notebook/796cf2ee.svg similarity index 85% rename from dev/common_workflows/hyperparameter_tuning/notebook/c6097af1.svg rename to dev/common_workflows/hyperparameter_tuning/notebook/796cf2ee.svg index 46e3b3a3..8a831710 100644 --- a/dev/common_workflows/hyperparameter_tuning/notebook/c6097af1.svg +++ b/dev/common_workflows/hyperparameter_tuning/notebook/796cf2ee.svg @@ -1,40 +1,40 @@ - + - + - + - + - + - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + diff --git a/dev/common_workflows/hyperparameter_tuning/notebook/index.html b/dev/common_workflows/hyperparameter_tuning/notebook/index.html index a78c0afa..ddffcda6 100644 --- a/dev/common_workflows/hyperparameter_tuning/notebook/index.html +++ b/dev/common_workflows/hyperparameter_tuning/notebook/index.html @@ -4,7 +4,7 @@ import RDatasets # Dataset source using Plots # To plot tuning results import Optimisers # native Flux.jl optimisers no longer supported

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
-y, X = unpack(iris, ==(:Species), colname -> true, rng=123);
+y, X = unpack(iris, ==(:Species), rng=123);
 X = Float32.(X);      # To be compatible with type of network parameters

Instantiating the model

Now let's construct our model. This follows a similar setup to the one followed in the Quick Start.

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
 clf = NeuralNetworkClassifier(
     builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),
@@ -65,4 +65,4 @@
     xlab=curve.parameter_name,
     xscale=curve.parameter_scale,
     ylab = "Cross Entropy",
-)
Example block output

This page was generated using Literate.jl.

+)Example block output

This page was generated using Literate.jl.

diff --git a/dev/common_workflows/incremental_training/README/index.html b/dev/common_workflows/incremental_training/README/index.html index b06ae5d4..0b07716d 100644 --- a/dev/common_workflows/incremental_training/README/index.html +++ b/dev/common_workflows/incremental_training/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/common_workflows/incremental_training/notebook.ipynb b/dev/common_workflows/incremental_training/notebook.ipynb index b85e848b..e3b44f52 100644 --- a/dev/common_workflows/incremental_training/notebook.ipynb +++ b/dev/common_workflows/incremental_training/notebook.ipynb @@ -7,6 +7,14 @@ ], "metadata": {} }, + { + "cell_type": "markdown", + "source": [ + "This demonstration is available as a Jupyter notebook or julia script\n", + "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/incremental_training)." + ], + "metadata": {} + }, { "cell_type": "markdown", "source": [ @@ -36,9 +44,7 @@ { "cell_type": "markdown", "source": [ - "**Julia version** is assumed to be 1.10.* This tutorial is available as a Jupyter\n", - "notebook or julia script\n", - "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/incremental_training)." + "**Julia version** is assumed to be 1.10.*" ], "metadata": {} }, @@ -73,7 +79,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X) # To be compatible with type of network network parameters\n", "(X_train, X_test), (y_train, y_test) = partition(\n", " (X, y), 0.8,\n", @@ -113,7 +119,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = ComputationalResources.CPU1{Nothing}(nothing))" + "text/plain": "NeuralNetworkClassifier(\n builder = MLP(\n hidden = (5, 4), \n σ = NNlib.relu), \n finaliser = NNlib.softmax, \n optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), \n loss = Flux.Losses.crossentropy, \n epochs = 10, \n batch_size = 8, \n lambda = 0.0, \n alpha = 0.0, \n rng = 42, \n optimiser_changes_trigger_retraining = false, \n acceleration = CPU1{Nothing}(nothing))" }, "metadata": {}, "execution_count": 4 @@ -161,7 +167,7 @@ { "output_type": "execute_result", "data": { - "text/plain": "trained Machine; caches model-specific representations of data\n model: NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)\n args: \n 1:\tSource @068 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}\n 2:\tSource @767 ⏎ AbstractVector{ScientificTypesBase.Multiclass{3}}\n" + "text/plain": "trained Machine; caches model-specific representations of data\n model: NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)\n args: \n 1:\tSource @547 ⏎ Table{AbstractVector{Continuous}}\n 2:\tSource @645 ⏎ AbstractVector{Multiclass{3}}\n" }, "metadata": {}, "execution_count": 5 diff --git a/dev/common_workflows/incremental_training/notebook.jl b/dev/common_workflows/incremental_training/notebook.jl index 20d38b53..6d44c046 100644 --- a/dev/common_workflows/incremental_training/notebook.jl +++ b/dev/common_workflows/incremental_training/notebook.jl @@ -22,7 +22,7 @@ import Optimisers # native Flux.jl optimisers no longer supported # ### Loading and Splitting the Data iris = RDatasets.dataset("datasets", "iris"); -y, X = unpack(iris, ==(:Species), colname -> true, rng=123); +y, X = unpack(iris, ==(:Species), rng=123); X = Float32.(X) # To be compatible with 
type of network network parameters (X_train, X_test), (y_train, y_test) = partition( (X, y), 0.8, diff --git a/dev/common_workflows/incremental_training/notebook.unexecuted.ipynb b/dev/common_workflows/incremental_training/notebook.unexecuted.ipynb index 4d12d4d7..b9227430 100644 --- a/dev/common_workflows/incremental_training/notebook.unexecuted.ipynb +++ b/dev/common_workflows/incremental_training/notebook.unexecuted.ipynb @@ -7,6 +7,14 @@ ], "metadata": {} }, + { + "cell_type": "markdown", + "source": [ + "This demonstration is available as a Jupyter notebook or julia script\n", + "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/incremental_training)." + ], + "metadata": {} + }, { "cell_type": "markdown", "source": [ @@ -28,9 +36,7 @@ { "cell_type": "markdown", "source": [ - "**Julia version** is assumed to be 1.10.* This tutorial is available as a Jupyter\n", - "notebook or julia script\n", - "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/incremental_training)." + "**Julia version** is assumed to be 1.10.*" ], "metadata": {} }, @@ -65,7 +71,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X) # To be compatible with type of network network parameters\n", "(X_train, X_test), (y_train, y_test) = partition(\n", " (X, y), 0.8,\n", diff --git a/dev/common_workflows/incremental_training/notebook/index.html b/dev/common_workflows/incremental_training/notebook/index.html index 4c9952bf..65797851 100644 --- a/dev/common_workflows/incremental_training/notebook/index.html +++ b/dev/common_workflows/incremental_training/notebook/index.html @@ -1,9 +1,9 @@ -Incremental Training · MLJFlux

Incremental Training with MLJFlux

In this workflow example we explore how to incrementally train MLJFlux models.

Julia version is assumed to be 1.10.* This tutorial is available as a Jupyter notebook or julia script here.

Basic Imports

using MLJ               # Has MLJFlux models
+Incremental Training · MLJFlux

Incremental Training with MLJFlux

This demonstration is available as a Jupyter notebook or julia script here.

In this workflow example we explore how to incrementally train MLJFlux models.

Julia version is assumed to be 1.10.*

Basic Imports

using MLJ               # Has MLJFlux models
 using Flux              # For more flexibility
 import RDatasets        # Dataset source
 import Optimisers       # native Flux.jl optimisers no longer supported

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
-y, X = unpack(iris, ==(:Species), colname -> true, rng=123);
+y, X = unpack(iris, ==(:Species), rng=123);
 X = Float32.(X)      # To be compatible with type of network parameters
 (X_train, X_test), (y_train, y_test) = partition(
     (X, y), 0.8,
@@ -34,8 +34,8 @@
 fit!(mach)
trained Machine; caches model-specific representations of data
   model: NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)
   args: 
-    1:	Source @376 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
-    2:	Source @952 ⏎ AbstractVector{ScientificTypesBase.Multiclass{3}}
+    1:	Source @920 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
+    2:	Source @818 ⏎ AbstractVector{ScientificTypesBase.Multiclass{3}}
 

Let's evaluate the training loss and validation accuracy

training_loss = cross_entropy(predict(mach, X_train), y_train)
0.4392339631006042
val_acc = accuracy(predict_mode(mach, X_test), y_test)
0.9

Poor performance, it seems.

Incremental Training

Now let's train it for another 30 epochs at half the original learning rate. All we need to do is change these hyperparameters and call fit! again. It won't reset the model parameters before training.

clf.optimiser = Optimisers.Adam(clf.optimiser.eta/2)
 clf.epochs = clf.epochs + 30
 fit!(mach, verbosity=2);
[ Info: Updating machine(NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …), …).
@@ -68,4 +68,4 @@
 [ Info: Loss is 0.1353
 [ Info: Loss is 0.1251
 [ Info: Loss is 0.1173
-[ Info: Loss is 0.1102

Let's evaluate the training loss and validation accuracy

training_loss = cross_entropy(predict(mach, X_train), y_train)
0.10519664737051289
training_acc = accuracy(predict_mode(mach, X_test), y_test)
0.9666666666666667

That's much better. If we would rather reset the model parameters before fitting, we can do fit!(mach, force=true).


This page was generated using Literate.jl.

+[ Info: Loss is 0.1102

Let's evaluate the training loss and validation accuracy

training_loss = cross_entropy(predict(mach, X_train), y_train)
0.10519664737051289
training_acc = accuracy(predict_mode(mach, X_test), y_test)
0.9666666666666667

That's much better. If we would rather reset the model parameters before fitting, we can do fit!(mach, force=true).
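
For a concrete sketch of the two retraining modes (the extra epoch count here is illustrative):

clf.epochs = clf.epochs + 10   # mutate a hyperparameter of the wrapped model
fit!(mach)                     # warm restart: training resumes from the current weights
fit!(mach, force=true)         # cold start: re-initializes the network before training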


This page was generated using Literate.jl.

diff --git a/dev/common_workflows/live_training/README/index.html b/dev/common_workflows/live_training/README/index.html index c1f14a1a..66f190e7 100644 --- a/dev/common_workflows/live_training/README/index.html +++ b/dev/common_workflows/live_training/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/common_workflows/live_training/notebook.jl b/dev/common_workflows/live_training/notebook.jl index 16bae98a..de1a6fb8 100644 --- a/dev/common_workflows/live_training/notebook.jl +++ b/dev/common_workflows/live_training/notebook.jl @@ -23,7 +23,7 @@ using Plots # ### Loading and Splitting the Data iris = RDatasets.dataset("datasets", "iris"); -y, X = unpack(iris, ==(:Species), colname -> true, rng=123); +y, X = unpack(iris, ==(:Species), rng=123); X = Float32.(X); # To be compatible with type of network network parameters diff --git a/dev/common_workflows/live_training/notebook.unexecuted.ipynb b/dev/common_workflows/live_training/notebook.unexecuted.ipynb index a647a39a..fb86f8e7 100644 --- a/dev/common_workflows/live_training/notebook.unexecuted.ipynb +++ b/dev/common_workflows/live_training/notebook.unexecuted.ipynb @@ -10,7 +10,7 @@ { "cell_type": "markdown", "source": [ - "This tutorial is available as a Jupyter notebook or julia script\n", + "This demonstration is available as a Jupyter notebook or julia script\n", "[here](https://github.com/FluxML/MLJFlux.jl/tree/dev/docs/src/common_workflows/live_training)." ], "metadata": {} @@ -73,7 +73,7 @@ "cell_type": "code", "source": [ "iris = RDatasets.dataset(\"datasets\", \"iris\");\n", - "y, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n", + "y, X = unpack(iris, ==(:Species), rng=123);\n", "X = Float32.(X); # To be compatible with type of network network parameters" ], "metadata": {}, diff --git a/dev/common_workflows/live_training/notebook/index.html b/dev/common_workflows/live_training/notebook/index.html index fb251783..7f2c1194 100644 --- a/dev/common_workflows/live_training/notebook/index.html +++ b/dev/common_workflows/live_training/notebook/index.html @@ -1,9 +1,9 @@ -Live Training · MLJFlux

Live Training with MLJFlux

This tutorial is available as a Jupyter notebook or julia script here.

Julia version is assumed to be 1.10.*

Basic Imports

using MLJ
+Live Training · MLJFlux

Live Training with MLJFlux

This demonstration is available as a Jupyter notebook or julia script here.

Julia version is assumed to be 1.10.*

Basic Imports

using MLJ
 using Flux
 import RDatasets
 import Optimisers
using Plots

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
-y, X = unpack(iris, ==(:Species), colname -> true, rng=123);
+y, X = unpack(iris, ==(:Species), rng=123);
 X = Float32.(X);      # To be compatible with type of network parameters

Instantiating the model

Now let's construct our model. This follows a similar setup to the one followed in the Quick Start.

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
 
 clf = NeuralNetworkClassifier(
@@ -76,6 +76,6 @@
 fit!(mach, force=true)
trained Machine; does not cache data
   model: ProbabilisticIteratedModel(model = NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …), …)
   args: 
-    1:	Source @782 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
-    2:	Source @834 ⏎ AbstractVector{ScientificTypesBase.Multiclass{3}}
-

This page was generated using Literate.jl.

+ 1: Source @384 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}} + 2: Source @782 ⏎ AbstractVector{ScientificTypesBase.Multiclass{3}} +

This page was generated using Literate.jl.

diff --git a/dev/contributing/index.html b/dev/contributing/index.html index 4724dc75..b5819d7f 100644 --- a/dev/contributing/index.html +++ b/dev/contributing/index.html @@ -1,2 +1,2 @@ -Contributing · MLJFlux

Adding new models to MLJFlux

This section assumes familiarity with the MLJ model API

If one subtypes a new model type as either MLJFlux.MLJFluxProbabilistic or MLJFlux.MLJFluxDeterministic, then instead of defining new methods for MLJModelInterface.fit and MLJModelInterface.update, one can make use of fallbacks by implementing the lower-level methods shape, build, and fitresult. See the classifier source code for an example.

One still needs to implement a new predict method.

+Contributing · MLJFlux

Adding new models to MLJFlux

This section assumes familiarity with the MLJ model API

If one subtypes a new model type as either MLJFlux.MLJFluxProbabilistic or MLJFlux.MLJFluxDeterministic, then instead of defining new methods for MLJModelInterface.fit and MLJModelInterface.update, one can make use of fallbacks by implementing the lower-level methods shape, build, and fitresult. See the classifier source code for an example.

One still needs to implement a new predict method.
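
To make the pattern concrete, here is a rough sketch for a hypothetical deterministic, tabular model. The type name MyRegressor, its field list, and the data handling in predict are illustrative assumptions only, not a guaranteed API; the classifier source linked above remains the authoritative reference.

import MLJFlux
import MLJModelInterface as MMI
import Flux

mutable struct MyRegressor <: MLJFlux.MLJFluxDeterministic
    builder
    optimiser
    loss
    epochs::Int
    batch_size::Int
    # ... any further hyperparameters the generic fit/update fallbacks expect
end

# (number of input features, number of output nodes), deduced from the data:
MLJFlux.shape(model::MyRegressor, X, y) = (size(MMI.matrix(X), 2), 1)

# turn the builder and the shape into a Flux chain:
MLJFlux.build(model::MyRegressor, rng, shape) = MLJFlux.build(model.builder, rng, shape...)

# what is kept from the trained chain as the learned parameters:
MLJFlux.fitresult(model::MyRegressor, chain, y) = (chain, nothing)

# a predict method is still needed:
function MMI.predict(model::MyRegressor, fitresult, Xnew)
    chain = fitresult[1]
    Xmatrix = Float32.(MMI.matrix(Xnew))'   # Flux expects features × observations
    return vec(chain(Xmatrix))
end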

diff --git a/dev/extended_examples/Boston/index.html b/dev/extended_examples/Boston/index.html index 6518c6b5..5b330838 100644 --- a/dev/extended_examples/Boston/index.html +++ b/dev/extended_examples/Boston/index.html @@ -1,2 +1,2 @@ -- · MLJFlux
+- · MLJFlux
diff --git a/dev/extended_examples/MNIST/README/index.html b/dev/extended_examples/MNIST/README/index.html index ec354ac5..08fbd9a7 100644 --- a/dev/extended_examples/MNIST/README/index.html +++ b/dev/extended_examples/MNIST/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/extended_examples/MNIST/notebook/0555555c.svg b/dev/extended_examples/MNIST/notebook/1bde4ba3.svg similarity index 85% rename from dev/extended_examples/MNIST/notebook/0555555c.svg rename to dev/extended_examples/MNIST/notebook/1bde4ba3.svg index 43d90fa8..c6836edf 100644 --- a/dev/extended_examples/MNIST/notebook/0555555c.svg +++ b/dev/extended_examples/MNIST/notebook/1bde4ba3.svg @@ -1,62 +1,62 @@ - + - + - + - + - + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/dev/extended_examples/MNIST/notebook/c1bb4258.svg b/dev/extended_examples/MNIST/notebook/a7a9f554.svg similarity index 85% rename from dev/extended_examples/MNIST/notebook/c1bb4258.svg rename to dev/extended_examples/MNIST/notebook/a7a9f554.svg index eeccfce4..b85ee2eb 100644 --- a/dev/extended_examples/MNIST/notebook/c1bb4258.svg +++ b/dev/extended_examples/MNIST/notebook/a7a9f554.svg @@ -1,50 +1,50 @@ - + - + - + - + - + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/dev/extended_examples/MNIST/notebook/index.html b/dev/extended_examples/MNIST/notebook/index.html index 4f4ba3ed..1a9b89b9 100644 --- a/dev/extended_examples/MNIST/notebook/index.html +++ b/dev/extended_examples/MNIST/notebook/index.html @@ -78,7 +78,7 @@ 0.055122323 0.057923194

Adding 20 more epochs:

clf.epochs = clf.epochs + 20
 fit!(mach, rows=train);
[ Info: Updating machine(ImageClassifier(builder = Main.MyConvBuilder(3, 16, 32, 32), …), …).
-
Optimising neural net:  10%[==>                      ]  ETA: 0:00:11
Optimising neural net:  14%[===>                     ]  ETA: 0:00:12
Optimising neural net:  19%[====>                    ]  ETA: 0:00:12
Optimising neural net:  24%[=====>                   ]  ETA: 0:00:12
Optimising neural net:  29%[=======>                 ]  ETA: 0:00:11
Optimising neural net:  33%[========>                ]  ETA: 0:00:11
Optimising neural net:  38%[=========>               ]  ETA: 0:00:10
Optimising neural net:  43%[==========>              ]  ETA: 0:00:09
Optimising neural net:  48%[===========>             ]  ETA: 0:00:09
Optimising neural net:  52%[=============>           ]  ETA: 0:00:08
Optimising neural net:  57%[==============>          ]  ETA: 0:00:07
Optimising neural net:  62%[===============>         ]  ETA: 0:00:06
Optimising neural net:  67%[================>        ]  ETA: 0:00:06
Optimising neural net:  71%[=================>       ]  ETA: 0:00:05
Optimising neural net:  76%[===================>     ]  ETA: 0:00:04
Optimising neural net:  81%[====================>    ]  ETA: 0:00:03
Optimising neural net:  86%[=====================>   ]  ETA: 0:00:02
Optimising neural net:  90%[======================>  ]  ETA: 0:00:02
Optimising neural net:  95%[=======================> ]  ETA: 0:00:01
Optimising neural net: 100%[=========================] Time: 0:00:17

Computing an out-of-sample estimate of the loss:

predicted_labels = predict(mach, rows=test);
+
Optimising neural net:  10%[==>                      ]  ETA: 0:00:06
Optimising neural net:  14%[===>                     ]  ETA: 0:00:08
Optimising neural net:  19%[====>                    ]  ETA: 0:00:08
Optimising neural net:  24%[=====>                   ]  ETA: 0:00:08
Optimising neural net:  29%[=======>                 ]  ETA: 0:00:07
Optimising neural net:  33%[========>                ]  ETA: 0:00:07
Optimising neural net:  38%[=========>               ]  ETA: 0:00:06
Optimising neural net:  43%[==========>              ]  ETA: 0:00:06
Optimising neural net:  48%[===========>             ]  ETA: 0:00:05
Optimising neural net:  52%[=============>           ]  ETA: 0:00:05
Optimising neural net:  57%[==============>          ]  ETA: 0:00:04
Optimising neural net:  62%[===============>         ]  ETA: 0:00:04
Optimising neural net:  67%[================>        ]  ETA: 0:00:03
Optimising neural net:  71%[=================>       ]  ETA: 0:00:03
Optimising neural net:  76%[===================>     ]  ETA: 0:00:02
Optimising neural net:  81%[====================>    ]  ETA: 0:00:02
Optimising neural net:  86%[=====================>   ]  ETA: 0:00:01
Optimising neural net:  90%[======================>  ]  ETA: 0:00:01
Optimising neural net:  95%[=======================> ]  ETA: 0:00:01
Optimising neural net: 100%[=========================] Time: 0:00:10

Computing an out-of-sample estimate of the loss:

predicted_labels = predict(mach, rows=test);
 cross_entropy(predicted_labels, labels[test])
0.4883231265583621

Or to fit and predict, in one line:

evaluate!(mach,
           resampling=Holdout(fraction_train=0.5),
           measure=cross_entropy,
@@ -183,7 +183,7 @@
     parameter_means2,
     title="Flux parameter mean weights",
     xlab = "epoch",
-)
Example block output

Note. The higher the number in the plot legend, the deeper the layer we are weight-averaging.

savefig(joinpath(tempdir(), "weights.png"))
"/tmp/weights.png"

Retrieving a snapshot for a prediction:

mach2 = machine(joinpath(tempdir(), "mnist3.jls"))
+)
Example block output

Note. The higher the number in the plot legend, the deeper the layer we are weight-averaging.

savefig(joinpath(tempdir(), "weights.png"))
"/tmp/weights.png"

Retrieving a snapshot for a prediction:

mach2 = machine(joinpath(tempdir(), "mnist3.jls"))
 predict_mode(mach2, images[501:503])
3-element CategoricalArrays.CategoricalArray{Int64,1,UInt32}:
  7
  9
@@ -197,4 +197,4 @@
     ylab = "cross entropy",
     label="out-of-sample",
 )
-plot!(epochs, training_losses, label="training")
Example block output

This page was generated using Literate.jl.

+plot!(epochs, training_losses, label="training")Example block output

This page was generated using Literate.jl.

diff --git a/dev/extended_examples/spam_detection/README/index.html b/dev/extended_examples/spam_detection/README/index.html index e874339e..7a4eb351 100644 --- a/dev/extended_examples/spam_detection/README/index.html +++ b/dev/extended_examples/spam_detection/README/index.html @@ -1,2 +1,2 @@ -Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

+Contents · MLJFlux

Contents

file                        description
notebook.ipynb              Jupyter notebook (executed)
notebook.unexecuted.ipynb   Jupyter notebook (unexecuted)
notebook.md                 static markdown (included in MLJFlux.jl docs)
notebook.jl                 executable Julia script annotated with comments
generate.jl                 maintainers only: execute to generate first 3 from 4th

Important

Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.

diff --git a/dev/extended_examples/spam_detection/notebook/index.html b/dev/extended_examples/spam_detection/notebook/index.html index bf219d3f..eb099869 100644 --- a/dev/extended_examples/spam_detection/notebook/index.html +++ b/dev/extended_examples/spam_detection/notebook/index.html @@ -92,13 +92,13 @@ mach = machine(clf, x_train_processed_equalized_fixed, y_train)
untrained Machine; caches model-specific representations of data
   model: NeuralNetworkClassifier(builder = GenericBuilder(apply = #15), …)
   args: 
-    1:	Source @416 ⏎ AbstractMatrix{ScientificTypesBase.Continuous}
-    2:	Source @123 ⏎ AbstractVector{ScientificTypesBase.Multiclass{2}}
+    1:	Source @940 ⏎ AbstractMatrix{ScientificTypesBase.Continuous}
+    2:	Source @176 ⏎ AbstractVector{ScientificTypesBase.Multiclass{2}}
 

Train the Model

fit!(mach)
trained Machine; caches model-specific representations of data
   model: NeuralNetworkClassifier(builder = GenericBuilder(apply = #15), …)
   args: 
-    1:	Source @416 ⏎ AbstractMatrix{ScientificTypesBase.Continuous}
-    2:	Source @123 ⏎ AbstractVector{ScientificTypesBase.Multiclass{2}}
+    1:	Source @940 ⏎ AbstractMatrix{ScientificTypesBase.Continuous}
+    2:	Source @176 ⏎ AbstractVector{ScientificTypesBase.Multiclass{2}}
 

Evaluate the Model

ŷ = predict_mode(mach, x_val_processed_equalized_fixed)
 balanced_accuracy(ŷ, y_val)
0.8840999384477648

Acceptable performance. Let's see some live examples:

using Random: Random;
 Random.seed!(99);
@@ -111,4 +111,4 @@
 z_encoded_equalized_fixed = coerce(z_encoded_equalized_fixed, Continuous)
 z_pred = predict_mode(mach, z_encoded_equalized_fixed)
 
-print("SMS: `$(z)` and the prediction is `$(z_pred)`")
SMS: `Hi elaine, is today's meeting confirmed?` and the prediction is `CategoricalArrays.CategoricalValue{InlineStrings.String7, UInt32}[InlineStrings.String7("ham")]`

This page was generated using Literate.jl.

+print("SMS: `$(z)` and the prediction is `$(z_pred)`")
SMS: `Hi elaine, is today's meeting confirmed?` and the prediction is `CategoricalArrays.CategoricalValue{InlineStrings.String7, UInt32}[InlineStrings.String7("ham")]`

This page was generated using Literate.jl.

diff --git a/dev/index.html b/dev/index.html index 013653c3..7dad58f3 100644 --- a/dev/index.html +++ b/dev/index.html @@ -40,4 +40,4 @@ ├─────────────────────────────┼─────────┤ │ [1.0, 1.0, 0.967, 0.9, 1.0] │ 0.0426 │ └─────────────────────────────┴─────────┘ -

As you can see, we are able to use MLJ meta-functionality (e.g., cross-validation) with a Flux deep learning model. All arguments provided have defaults.

Notice that we are also able to define the neural network in a high-level fashion by only specifying the number of neurons in each hidden layer and the activation function. Meanwhile, MLJFlux is able to infer the input and output layer as well as use a suitable default for the loss function and output activation given the classification task. Notice as well that we did not need to manually implement a training or prediction loop.

Basic idea: "builders" for data-dependent architecture

As in the example above, any MLJFlux model has a builder hyperparameter, an object encoding instructions for creating a neural network given the data that the model eventually sees (e.g., the number of classes in a classification problem). While each MLJFlux model has a simple default builder, users may need to define custom builders to get optimal results (see Defining Custom Builders), and this will require familiarity with the Flux API for defining a neural network chain.

Flux or MLJFlux?

Flux is a deep learning framework in Julia that comes with everything you need to build deep learning models (e.g., GPU support, automatic differentiation, layers, activations, losses, optimizers, etc.). MLJFlux wraps models built with Flux, providing a higher-level interface for building and training such models. More importantly, it empowers Flux models by extending their support to many common machine learning workflows that are possible via MLJ, such as:

  • Estimating performance of your model using a holdout set or other resampling strategy (e.g., cross-validation) as measured by one or more metrics (e.g., loss functions) that may not have been used in training

  • Optimizing hyper-parameters such as a regularization parameter (e.g., dropout) or the width/height/number of channels of a convolution layer

  • Composing with other models, such as introducing data pre-processing steps (e.g., missing data imputation) into a pipeline. It might make sense to include non-deep learning models in this pipeline. Other kinds of model composition include blending the predictions of a deep learner with some other kind of model (as in “model stacking”). Models composed with MLJ can also be tuned as a single unit.

  • Controlling iteration by adding an early stopping criterion based on an out-of-sample estimate of the loss, dynamically changing the learning rate (e.g., cyclic learning rates), periodically saving snapshots of the model, or generating live plots of sample weights to judge training progress (as with TensorBoard)

  • Comparing your model with non-deep learning models

A comparable project, FastAI/FluxTraining, also provides a high-level interface for interacting with Flux models and supports a set of features that may overlap with (but not include all of) those supported by MLJFlux.

Many of the features mentioned above are showcased in the workflow examples that you can access from the sidebar.

+

As you can see, we are able to use MLJ meta-functionality (e.g., cross-validation) with a Flux deep learning model. All arguments provided have defaults.

Notice that we are also able to define the neural network in a high-level fashion by only specifying the number of neurons in each hidden layer and the activation function. Meanwhile, MLJFlux is able to infer the input and output layer as well as use a suitable default for the loss function and output activation given the classification task. Notice as well that we did not need to manually implement a training or prediction loop.

Basic idea: "builders" for data-dependent architecture

As in the example above, any MLJFlux model has a builder hyperparameter, an object encoding instructions for creating a neural network given the data that the model eventually sees (e.g., the number of classes in a classification problem). While each MLJFlux model has a simple default builder, users may need to define custom builders to get optimal results (see Defining Custom Builders); this will require familiarity with the Flux API for defining a neural network chain.
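
For instance, swapping in one of the built-in builders is a one-line change (a sketch only; it assumes a classifier type has been loaded as in the example above):

clf = NeuralNetworkClassifier(builder=MLJFlux.MLP(hidden=(32, 16), σ=Flux.relu))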

Flux or MLJFlux?

Flux is a deep learning framework in Julia that comes with everything you need to build deep learning models (e.g., GPU support, automatic differentiation, layers, activations, losses, optimizers). MLJFlux wraps models built with Flux, providing a higher-level interface for building and training them. More importantly, it empowers Flux models by extending their support to many common machine learning workflows that are possible via MLJ, such as the following (see the sketch after this list):

  • Estimating the performance of your model using a holdout set or other resampling strategy (e.g., cross-validation), as measured by one or more metrics (e.g., loss functions) that may not have been used in training

  • Optimizing hyper-parameters, such as a regularization parameter (e.g., dropout) or the width/height/number of channels of a convolution layer

  • Composing with other models, such as introducing data pre-processing steps (e.g., missing data imputation) into a pipeline. It might make sense to include non-deep-learning models in this pipeline. Other kinds of model composition include blending the predictions of a deep learner with those of some other kind of model (as in "model stacking"). Models composed with MLJ can also be tuned as a single unit.

  • Controlling iteration, for example by adding an early stopping criterion based on an out-of-sample estimate of the loss, dynamically changing the learning rate (e.g., cyclic learning rates), periodically saving snapshots of the model, or generating live plots of sample weights to judge training progress (as in TensorBoard)

  • Comparing your model with non-deep-learning models
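
As a sketch of the first two workflows above (assuming a classifier clf and data X, y as in the earlier example, and a builder exposing a dropout field, as MLJFlux.Short does):

evaluate(clf, X, y, resampling=CV(nfolds=5), measure=cross_entropy)  # resampled performance estimate

r = range(clf, :(builder.dropout), lower=0.0, upper=0.5)             # hyper-parameter range
tuned_clf = TunedModel(model=clf, range=r, tuning=Grid(goal=10),
                       resampling=CV(nfolds=5), measure=cross_entropy)
mach = machine(tuned_clf, X, y) |> fit!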

A comparable project, FastAI/FluxTraining, also provides a high-level interface for interacting with Flux models and supports a set of features that may overlap with (but not include all of) those supported by MLJFlux.

Many of the features mentioned above are showcased in the workflow examples that you can access from the sidebar.

diff --git a/dev/interface/Builders/index.html b/dev/interface/Builders/index.html index df59949c..b042ff10 100644 --- a/dev/interface/Builders/index.html +++ b/dev/interface/Builders/index.html @@ -1,5 +1,5 @@ -Builders · MLJFlux
MLJFlux.LinearType
Linear(; σ=Flux.relu)

MLJFlux builder that constructs a fully connected two layer network with activation function σ. The number of input and output nodes is determined from the data. Weights are initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.

source
MLJFlux.ShortType
Short(; n_hidden=0, dropout=0.5, σ=Flux.sigmoid)

MLJFlux builder that constructs a fully connected three-layer network using n_hidden nodes in the hidden layer and the specified dropout (defaulting to 0.5). An activation function σ is applied between the hidden and final layers. If n_hidden=0 (the default) then n_hidden is the geometric mean of the number of input and output nodes. The number of input and output nodes is determined from the data.

Each layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.

source
MLJFlux.MLPType
MLP(; hidden=(100,), σ=Flux.relu)

MLJFlux builder that constructs a Multi-layer perceptron network. The ith element of hidden represents the number of neurons in the ith hidden layer. An activation function σ is applied between each layer.

Each layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.

source
MLJFlux.@builderMacro
@builder neural_net

Creates a builder for neural_net. The variables rng, n_in, n_out and n_channels can be used to create builders for any random number generator rng, input and output sizes n_in and n_out and number of input channels n_channels.

Examples

julia> import MLJFlux: @builder;
+Builders · MLJFlux
MLJFlux.LinearType
Linear(; σ=Flux.relu)

MLJFlux builder that constructs a fully connected two layer network with activation function σ. The number of input and output nodes is determined from the data. Weights are initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.

source
MLJFlux.ShortType
Short(; n_hidden=0, dropout=0.5, σ=Flux.sigmoid)

MLJFlux builder that constructs a fully connected three-layer network using n_hidden nodes in the hidden layer and the specified dropout (defaulting to 0.5). An activation function σ is applied between the hidden and final layers. If n_hidden=0 (the default) then n_hidden is the geometric mean of the number of input and output nodes. The number of input and output nodes is determined from the data.

Each layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.

source
MLJFlux.MLPType
MLP(; hidden=(100,), σ=Flux.relu)

MLJFlux builder that constructs a Multi-layer perceptron network. The ith element of hidden represents the number of neurons in the ith hidden layer. An activation function σ is applied between each layer.

Each layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.

source
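
To see what one of these builders produces, you can call MLJFlux.build directly, using the signature described under Defining Custom Builders (the sizes below are illustrative, not taken from this page):

import MLJFlux, Flux, Random
rng = Random.Xoshiro(123)
chain = MLJFlux.build(MLJFlux.MLP(hidden=(32, 16), σ=Flux.relu), rng, 4, 3)
# a Flux.Chain mapping 4 inputs to 3 outputs through hidden layers of 32 and 16 neurons
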
MLJFlux.@builderMacro
@builder neural_net

Creates a builder for neural_net. The variables rng, n_in, n_out and n_channels can be used to create builders for any random number generator rng, input and output sizes n_in and n_out and number of input channels n_channels.

Examples

julia> import MLJFlux: @builder;
 
 julia> nn = NeuralNetworkRegressor(builder = @builder(Chain(Dense(n_in, 64, relu),
                                                             Dense(64, 32, relu),
@@ -11,4 +11,4 @@
            Chain(front, Dense(d, n_out));
        end
 
-julia> conv_nn = NeuralNetworkRegressor(builder = conv_builder);
source
+julia> conv_nn = NeuralNetworkRegressor(builder = conv_builder);
source
diff --git a/dev/interface/Classification/index.html b/dev/interface/Classification/index.html index 7f15ca00..e1749d22 100644 --- a/dev/interface/Classification/index.html +++ b/dev/interface/Classification/index.html @@ -20,7 +20,7 @@ xlab=curve.parameter_name, xscale=curve.parameter_scale, ylab = "Cross Entropy") -

See also ImageClassifier, NeuralNetworkBinaryClassifier.

source
MLJFlux.NeuralNetworkBinaryClassifierType
NeuralNetworkBinaryClassifier

A model type for constructing a neural network binary classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkBinaryClassifier = @load NeuralNetworkBinaryClassifier pkg=MLJFlux

Do model = NeuralNetworkBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkBinaryClassifier(builder=...).

NeuralNetworkBinaryClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a binary (Multiclass{2} or OrderedFactor{2}) target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.

  • y is the target, which can be any AbstractVector whose element scitype is Multiclass{2} or OrderedFactor{2}; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.binarycrossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.binarycrossentropy: Standard binary classification loss, also known as the log loss.

    • Flux.logitbinarycrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with σ and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default sigmoid finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).

    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.

    • Flux.binary_focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: The batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().

  • finaliser=Flux.σ: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.σ.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.

  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a classification model using the Iris dataset. This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.

using MLJ, Flux
+

See also ImageClassifier, NeuralNetworkBinaryClassifier.

source
MLJFlux.NeuralNetworkBinaryClassifierType
NeuralNetworkBinaryClassifier

A model type for constructing a neural network binary classifier, based on MLJFlux.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

NeuralNetworkBinaryClassifier = @load NeuralNetworkBinaryClassifier pkg=MLJFlux

Do model = NeuralNetworkBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkBinaryClassifier(builder=...).

NeuralNetworkBinaryClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a binary (Multiclass{2} or OrderedFactor{2}) target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

Here:

  • X is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.

  • y is the target, which can be any AbstractVector whose element scitype is Multiclass{2} or OrderedFactor{2}; check the scitype with scitype(y)

Train the machine with fit!(mach, rows=...).

Hyper-parameters

  • builder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.

  • optimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.

  • loss=Flux.binarycrossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:

    • Flux.binarycrossentropy: Standard binary classification loss, also known as the log loss.

    • Flux.logitbinarycrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with σ and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default sigmoid finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).

    • Flux.tversky_loss: Used with imbalanced data to give more weight to false negatives.

    • Flux.binary_focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.

    Currently MLJ measures are not supported values of loss.

  • epochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.

  • batch_size::Int=1: The batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.

  • lambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).

  • alpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.

  • rng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.

  • optimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.

  • acceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().

  • finaliser=Flux.σ: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.σ.

Operations

  • predict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.

  • predict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.

Fitted parameters

The fields of fitted_params(mach) are:

  • chain: The trained "chain" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).

Report

The fields of report(mach) are:

  • training_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.

Examples

In this example we build a classification model using the Iris dataset. This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.

using MLJ, Flux
 import Optimisers
 import RDatasets

First, we can load the data:

mtcars = RDatasets.dataset("datasets", "mtcars");
 y, X = unpack(mtcars, ==(:VS), in([:MPG, :Cyl, :Disp, :HP, :WT, :QSec]));

Note that y is a vector and X a table.

y = categorical(y) # classifier takes categorical input
@@ -48,4 +48,4 @@
    xscale=curve.parameter_scale,
    ylab = "Cross Entropy",
 )
-

See also ImageClassifier.

source
+

See also ImageClassifier.

source diff --git a/dev/interface/Custom Builders/index.html b/dev/interface/Custom Builders/index.html index abcd64e0..184668c7 100644 --- a/dev/interface/Custom Builders/index.html +++ b/dev/interface/Custom Builders/index.html @@ -12,4 +12,4 @@ Dense(nn.n2, n_out, init=init), ) end

Note here that n_in and n_out depend on the size of the data (see Table 1).

For a concrete image classification example, see Using MLJ to classify the MNIST image dataset.

More generally, defining a new builder means defining a new struct sub-typing MLJFlux.Builder and defining a new MLJFlux.build method with one of these signatures:

MLJFlux.build(builder::MyBuilder, rng, n_in, n_out)
-MLJFlux.build(builder::MyBuilder, rng, n_in, n_out, n_channels) # for use with `ImageClassifier`

This method must return a Flux.Chain instance, chain, subject to the following conditions:

  • chain(x) must make sense:

    • for any x <: Array{<:AbstractFloat, 2} of size (n_in, batch_size) where batch_size is any integer (for all models except ImageClassifier); or
    • for any x <: Array{<:Float32, 4} of size (W, H, n_channels, batch_size), where (W, H) = n_in, n_channels is 1 or 3, and batch_size is any integer (for use with ImageClassifier)
  • The object returned by chain(x) must be an AbstractFloat vector of length n_out.

Alternatively, use MLJFlux.@builder(neural_net) to automatically create a builder for any valid Flux chain expression neural_net, where the symbols n_in, n_out, n_channels and rng can appear literally, with the interpretations explained above. For example,

builder = MLJFlux.@builder Chain(Dense(n_in, 128), Dense(128, n_out, tanh))
+MLJFlux.build(builder::MyBuilder, rng, n_in, n_out, n_channels) # for use with `ImageClassifier`

This method must return a Flux.Chain instance, chain, subject to the following conditions:

  • chain(x) must make sense:

    • for any x <: Array{<:AbstractFloat, 2} of size (n_in, batch_size) where batch_size is any integer (for all models except ImageClassifier); or
    • for any x <: Array{<:Float32, 4} of size (W, H, n_channels, batch_size), where (W, H) = n_in, n_channels is 1 or 3, and batch_size is any integer (for use with ImageClassifier)
  • The object returned by chain(x) must be an AbstractFloat vector of length n_out.

Alternatively, use MLJFlux.@builder(neural_net) to automatically create a builder for any valid Flux chain expression neural_net, where the symbols n_in, n_out, n_channels and rng can appear literally, with the interpretations explained above. For example,

builder = MLJFlux.@builder Chain(Dense(n_in, 128), Dense(128, n_out, tanh))
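
To sanity-check the conditions above, you can build the chain directly and apply it to a dummy batch (a sketch with made-up sizes):

import MLJFlux, Random
using Flux
builder = MLJFlux.@builder Chain(Dense(n_in, 128), Dense(128, n_out, tanh))
n_in, n_out, batch_size = 4, 1, 32
chain = MLJFlux.build(builder, Random.Xoshiro(0), n_in, n_out)
x = rand(Float32, n_in, batch_size)
size(chain(x))  # (n_out, batch_size): one column of n_out floats per observation
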
diff --git a/dev/interface/Image Classification/index.html b/dev/interface/Image Classification/index.html index 0f2aa3e5..41aff2ba 100644 --- a/dev/interface/Image Classification/index.html +++ b/dev/interface/Image Classification/index.html @@ -46,4 +46,4 @@ resampling=Holdout(fraction_train=0.5), measure=cross_entropy, rows=1:1000, - verbosity=0)

See also NeuralNetworkClassifier.

source + verbosity=0)

See also NeuralNetworkClassifier.

source diff --git a/dev/interface/Multitarget Regression/index.html b/dev/interface/Multitarget Regression/index.html index 43d62a51..3ac071cc 100644 --- a/dev/interface/Multitarget Regression/index.html +++ b/dev/interface/Multitarget Regression/index.html @@ -25,4 +25,4 @@ # loss for `(Xtest, test)`: fit!(mach) # trains on all data `(X, y)` yhat = predict(mach, Xtest) -multi_loss(yhat, ytest)

See also NeuralNetworkRegressor

source +multi_loss(yhat, ytest)

See also NeuralNetworkRegressor

source diff --git a/dev/interface/Regression/index.html b/dev/interface/Regression/index.html index b3cfd94e..aaea26ea 100644 --- a/dev/interface/Regression/index.html +++ b/dev/interface/Regression/index.html @@ -43,4 +43,4 @@ # loss for `(Xtest, test)`: fit!(mach) # train on `(X, y)` yhat = predict(mach, Xtest) -l2(yhat, ytest)

These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.

For implementing a stopping criterion and other iteration controls, refer to the examples linked from the MLJFlux documentation.

See also MultitargetNeuralNetworkRegressor

source +l2(yhat, ytest)

These losses, for the pipeline model, refer to the target on the original, unstandardized, scale.

For implementing a stopping criterion and other iteration controls, refer to the examples linked from the MLJFlux documentation.
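
For example, a hedged sketch of adding early stopping with MLJ's IteratedModel (here pipe stands for the pipeline model defined above; the name is assumed):

iterated_pipe = IteratedModel(
    model=pipe,
    controls=[Step(1), Patience(4), NumberLimit(1000)],
    resampling=Holdout(fraction_train=0.7),
    measure=l2,
)
mach = machine(iterated_pipe, X, y) |> fit!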

See also MultitargetNeuralNetworkRegressor

source diff --git a/dev/interface/Summary/index.html b/dev/interface/Summary/index.html index 84a0dffc..2a79a02e 100644 --- a/dev/interface/Summary/index.html +++ b/dev/interface/Summary/index.html @@ -1,5 +1,5 @@ -Summary · MLJFlux

Models

MLJFlux provides the model types below, for use with input features X and targets y of the scientific type indicated in the table below. The parameters n_in, n_out and n_channels refer to information passed to the builder, as described under Defining Custom Builders.

Model Type | Prediction type | scitype(X) <: _ | scitype(y) <: _
NeuralNetworkRegressor | Deterministic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | AbstractVector{<:Continuous} (n_out = 1)
MultitargetNeuralNetworkRegressor | Deterministic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | <: Table(Continuous) with n_out columns
NeuralNetworkClassifier | Probabilistic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | AbstractVector{<:Finite} with n_out classes
NeuralNetworkBinaryClassifier | Probabilistic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | AbstractVector{<:Finite{2}} (but n_out = 1)
ImageClassifier | Probabilistic | AbstractVector{<:Image{W,H}} with n_in = (W, H) | AbstractVector{<:Finite} with n_out classes
What exactly is a "model"?

In MLJ a model is a mutable struct storing hyper-parameters for some learning algorithm indicated by the model name, and that's all. In particular, an MLJ model does not store learned parameters.

Difference in Definition

In Flux the term "model" has another meaning. However, as all Flux "models" used in MLJFlux are Flux.Chain objects, we call them chains, and restrict use of "model" to models in the MLJ sense.

Are observations rows or columns?

In MLJ the convention for two-dimensional data (tables and matrices) is rows = observations. For matrices, Flux has the opposite convention. If your data is a matrix whose columns index the observations, the most efficient option is to present the adjoint or transpose of that matrix to MLJFlux models. Otherwise, you can use the matrix as is, or make a one-time copy with permutedims and then present its adjoint or transpose.

Instructions for coercing common image formats into some AbstractVector{<:Image} are here.

Fitting and warm restarts

MLJ machines cache state enabling the "warm restart" of model training, as demonstrated in the incremental training example. In the case of MLJFlux models, fit!(mach) will use a warm restart if:

  • only model.epochs has changed since the last call; or

  • only model.epochs or model.optimiser have changed since the last call and model.optimiser_changes_trigger_retraining == false (the default) (the "state" part of the optimiser is ignored in this comparison). This allows one to dynamically modify learning rates, for example.

Here model=mach.model is the associated MLJ model.

The warm restart feature makes it possible to externally control iteration. See, for example, Early Stopping with MLJFlux and Using MLJ to classify the MNIST image dataset.

Model Hyperparameters.

All models share the following hyper-parameters. See individual model docstrings for a full list.

Hyper-parameter | Description | Default
builder | Default builder for models. | MLJFlux.Linear(σ=Flux.relu) (regressors) or MLJFlux.Short(n_hidden=0, dropout=0.5, σ=Flux.σ) (classifiers)
optimiser | The optimiser to use for training. | Optimisers.Adam()
loss | The loss function used for training. | Flux.mse (regressors) and Flux.crossentropy (classifiers)
epochs | Number of epochs to train for. | 10
batch_size | The batch size for the data. | 1
lambda | The regularization strength. Range = [0, ∞). | 0
alpha | The L2/L1 mix of regularization. Range = [0, 1]. | 0
rng | The random number generator (RNG) passed to builders, for weight initialization, for example. Can be any AbstractRNG or the seed (an integer) for a Xoshiro that is reset on every cold restart of model (machine) training. | GLOBAL_RNG
acceleration | Use CUDALibs() for training on GPU; default is CPU1(). | CPU1()
optimiser_changes_trigger_retraining | True if fitting an associated machine should trigger retraining from scratch whenever the optimiser changes. | false

The classifiers have an additional hyperparameter finaliser (default is Flux.softmax, or Flux.σ in the binary case) which is the operation applied to the unnormalized output of the final layer to obtain probabilities (outputs summing to one). It should return a vector of the same length as its input.

Loss Functions

Currently, the loss function specified by loss=... is applied internally by Flux and needs to conform to the Flux API. You cannot, for example, supply one of MLJ's probabilistic loss functions, such as MLJ.cross_entropy, to one of the classifier constructors.

That said, MLJ loss functions and metrics can still be used in evaluation meta-algorithms (such as cross-validation), just not inside the model itself, and they will work even if the underlying model comes from MLJFlux.

More on accelerated training with GPUs

As in the table, when instantiating a model for training on a GPU, specify acceleration=CUDALibs(), as in

using MLJ
+Summary · MLJFlux

Models

MLJFlux provides the model types below, for use with input features X and targets y of the scientific type indicated in the table below. The parameters n_in, n_out and n_channels refer to information passed to the builder, as described under Defining Custom Builders.

Model Type | Prediction type | scitype(X) <: _ | scitype(y) <: _
NeuralNetworkRegressor | Deterministic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | AbstractVector{<:Continuous} (n_out = 1)
MultitargetNeuralNetworkRegressor | Deterministic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | <: Table(Continuous) with n_out columns
NeuralNetworkClassifier | Probabilistic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | AbstractVector{<:Finite} with n_out classes
NeuralNetworkBinaryClassifier | Probabilistic | AbstractMatrix{Continuous} or Table(Continuous) with n_in columns | AbstractVector{<:Finite{2}} (but n_out = 1)
ImageClassifier | Probabilistic | AbstractVector{<:Image{W,H}} with n_in = (W, H) | AbstractVector{<:Finite} with n_out classes
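
To check your data against this table (a sketch; X and y are assumed to be already defined):

using MLJ
schema(X)   # feature columns should have scitype Continuous
scitype(y)  # e.g., AbstractVector{Multiclass{3}} for NeuralNetworkClassifier
# coerce if necessary, for example:
X = coerce(X, Count => Continuous)
y = coerce(y, Multiclass)
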
What exactly is a "model"?

In MLJ a model is a mutable struct storing hyper-parameters for some learning algorithm indicated by the model name, and that's all. In particular, an MLJ model does not store learned parameters.

Difference in Definition

In Flux the term "model" has another meaning. However, as all Flux "models" used in MLJFlux are Flux.Chain objects, we call them chains, and restrict use of "model" to models in the MLJ sense.

Are observations rows or columns?

In MLJ the convention for two-dimensional data (tables and matrices) is rows = observations. For matrices, Flux has the opposite convention. If your data is a matrix whose columns index the observations, the most efficient option is to present the adjoint or transpose of that matrix to MLJFlux models. Otherwise, you can use the matrix as is, or make a one-time copy with permutedims and then present its adjoint or transpose.

Instructions for coercing common image formats into some AbstractVector{<:Image} are here.

Fitting and warm restarts

MLJ machines cache state enabling the "warm restart" of model training, as demonstrated in the incremental training example. In the case of MLJFlux models, fit!(mach) will use a warm restart if:

  • only model.epochs has changed since the last call; or

  • only model.epochs or model.optimiser have changed since the last call and model.optimiser_changes_trigger_retraining == false (the default) (the "state" part of the optimiser is ignored in this comparison). This allows one to dynamically modify learning rates, for example.

Here model=mach.model is the associated MLJ model.

The warm restart feature makes it possible to externally control iteration. See, for example, Early Stopping with MLJFlux and Using MLJ to classify the MNIST image dataset.
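
A minimal warm-restart sketch (clf, X and y are assumed to be defined already):

mach = machine(clf, X, y)
fit!(mach)         # cold start: trains for clf.epochs epochs
clf.epochs += 10
fit!(mach)         # warm restart: only the 10 additional epochs are run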

Model Hyperparameters.

All models share the following hyper-parameters. See individual model docstrings for a full list.

Hyper-parameter | Description | Default
builder | Default builder for models. | MLJFlux.Linear(σ=Flux.relu) (regressors) or MLJFlux.Short(n_hidden=0, dropout=0.5, σ=Flux.σ) (classifiers)
optimiser | The optimiser to use for training. | Optimisers.Adam()
loss | The loss function used for training. | Flux.mse (regressors) and Flux.crossentropy (classifiers)
epochs | Number of epochs to train for. | 10
batch_size | The batch size for the data. | 1
lambda | The regularization strength. Range = [0, ∞). | 0
alpha | The L2/L1 mix of regularization. Range = [0, 1]. | 0
rng | The random number generator (RNG) passed to builders, for weight initialization, for example. Can be any AbstractRNG or the seed (an integer) for a Xoshiro that is reset on every cold restart of model (machine) training. | GLOBAL_RNG
acceleration | Use CUDALibs() for training on GPU; default is CPU1(). | CPU1()
optimiser_changes_trigger_retraining | True if fitting an associated machine should trigger retraining from scratch whenever the optimiser changes. | false

The classifiers have an additional hyperparameter finaliser (default is Flux.softmax, or Flux.σ in the binary case) which is the operation applied to the unnormalized output of the final layer to obtain probabilities (outputs summing to one). It should return a vector of the same length as its input.

Loss Functions

Currently, the loss function specified by loss=... is applied internally by Flux and needs to conform to the Flux API. You cannot, for example, supply one of MLJ's probabilistic loss functions, such as MLJ.cross_entropy, to one of the classifier constructors.

That said, MLJ loss functions and metrics can still be used in evaluation meta-algorithms (such as cross-validation), just not inside the model itself, and they will work even if the underlying model comes from MLJFlux.
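
For example (a sketch assuming a classifier type and data as above), the Flux loss is passed to the model constructor, while the MLJ measure appears only in the evaluation call:

clf = NeuralNetworkClassifier(loss=Flux.crossentropy)                 # Flux loss: supported
evaluate(clf, X, y, resampling=CV(nfolds=5), measure=cross_entropy)   # MLJ measure: evaluation only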

More on accelerated training with GPUs

As in the table, when instantiating a model for training on a GPU, specify acceleration=CUDALibs(), as in

using MLJ
 ImageClassifier = @load ImageClassifier
 model = ImageClassifier(epochs=10, acceleration=CUDALibs())
-mach = machine(model, X, y) |> fit!

In this example, the data X, y is copied onto the GPU under the hood on the call to fit! and cached for use in any warm restart (see above). The Flux chain used in training is always copied back to the CPU at the conclusion of fit!, and made available as fitted_params(mach).

Builders

Builder | Description
MLJFlux.MLP(hidden=(10,)) | General multi-layer perceptron
MLJFlux.Short(n_hidden=0, dropout=0.5, σ=sigmoid) | Fully connected network with one hidden layer and dropout
MLJFlux.Linear(σ=relu) | Vanilla linear network with no hidden layers and activation function σ
MLJFlux.@builder | Macro for customized builders
+mach = machine(model, X, y) |> fit!

In this example, the data X, y is copied onto the GPU under the hood on the call to fit! and cached for use in any warm restart (see above). The Flux chain used in training is always copied back to the CPU at the conclusion of fit!, and made available as fitted_params(mach).

Builders

Builder | Description
MLJFlux.MLP(hidden=(10,)) | General multi-layer perceptron
MLJFlux.Short(n_hidden=0, dropout=0.5, σ=sigmoid) | Fully connected network with one hidden layer and dropout
MLJFlux.Linear(σ=relu) | Vanilla linear network with no hidden layers and activation function σ
MLJFlux.@builder | Macro for customized builders
diff --git a/dev/objects.inv b/dev/objects.inv index f8c1e3a05e0e579ba1efbe3b95c5403f233288a3..6744df90d2e1cda07429a8050d2a366eb9e85269 100644 GIT binary patch delta 1626 zcmV-g2BrD<4f+j`vwzEdegJuqv1gDfJI~4#$S(j_g%V{}rHYqD!VAU3lBR4nuhwfS zUc#+Ai!*Tl1aj{xo@Y$RZb82~1OHzjdRa+Tu3Z- zd$+u+e)Y0pVnYR8Gkl+gs>bWzFFJ$2qII~hx-GyTHl^goXQT9X31CFajOB3ItaWRm zg3Bq2TKhvic7JJC4WlA|SztA7HluynBQt{!dI9EcVOLqspV-8_=@yy6HU7RQQ>aK+ zT(MMDf-NM|kK^?16P&>dnR6~j)OE!Z<%Z*nGQH@8>w`%bMxdoghO z*%qbjfR-D&USv$0(3gp$g|MaL0=)MNu;pq6cel3^-G7RcfYwR^J>wKkZ>7cLd&H*=z6 zO6r)}SAW$p9rAWxlAtn#?N60^gd(2>+lVsNgccboE2V#z#vYQF zQZhN&>H~D>S7^kbT$+UOVt`^0iGee%RnVr8wSRl;fIKpN<3ctCtR3oxuQS)FHuWtB z6m|_k)KeoFMHVT!Mth9QE-UfGhL8vN11j-&VDYf!NJR_9sd9chq96h+>!;)$l{~G( zKD~z;`8xi|hbNj>+R{48M_B8@ny@COC4LuhgA^cZB5+Vk?_m}?k|DQ|wj*1)VK|LRGk!$VX_Qg>8(v01GPjBw4@ zNJbTvOiiyzgRi3ns)QQf=DOj71|J+g)_-i7S7y3n*WNPzxKam@sTxJk1c&0%JUE|B zDXh*TPhA25XsfjB-_tMo==_{u)S*clnHaa3xONXO*|^iR&QRTTF($C28Qq@>=81v{ zf+X0CsrDkvI{NY0v09IB#P^KC;$x*DEo$(CVHnMGa$9am&eYFR9|xooRf-&>5`T8} zl-#elM7oh{D>OmlYo1dc(QLyU<~8=3)#3g!i&#zdke?CL3KOmLW* zS6WOkm_ZdfvNy2^J3`jJ5Qy{p)58}%ad+5|$)-P9VUAVkv|FaMCK8^5hX~XV7i{3c z!AHkznOtO<)?KizPAa4 zw$Y_lC=$Lw%RdbW@@#T)cIJWb-$5u##f$~eSYViNEC!)ia#7IyA{`> z-E^FndiYO0*0dAtd}dh}Q6D|(>#3#v=Mf!YAD5zg<|RqYr4m+^tKr{U)!14-dHWGe zlfZ)V650yQ>C9UnJYpM=#_wG>O+VstDY18vQ4jWAV8WajY6bl_p7H2-$BWFT&{p#T z%pf~gu}kYVHu;ic5K&)gxPMJ1&)gRgf!l z=Qxf|?aA$$QgLNCQ~?oiqA$hloyd`|Qv=x-de4^JU54l{Yh>vBwEImCF+Ir6=eg(( z$scTxV;^_(+W^&Phu|BUOl-i8fJ9bZHlFA*tM!R4GA6>oBEVjL1e*EIFd*Oa;V{Cv<^|?ix-1skQhZ(JQ^rtCN@;iHN*8RR5f1ze$g5H6^+Av)olU(uqh=sJ{zUCO8|GY%vcVW%~-c4 zD!81Ys5L*-V}FNs)i5gZmj!0iW;5ECJu)-+pa)>?26mO@{E1DBn{JUAT;uP1B87@{ z#T83cCD=kT{W(tGKEWB>vPxitUIde1ZYCw{Suw1%*@8XO^(JR>c5{0*zwgvKG#58c zf7_ nf+^*Ncp46Z#@?v=Ekb9Dw(J0JdDM;O_QTqJLYlQ}BmMp^sy)HNRu|48D^8 z1T~va)QXXt>%Twd)iYV^FeDn6pxcBgF2sP8xh?uReZW8+1-+)$Pdz%wd`|7(`&n9m5!7PuhEw$bxR< zs#F9QkDnXCGHy`Lrv}qF*ps&x(98r1ekR~d{0t08nqo`JrP*m8FN78uDJ!Lam&P8F zmr^o0+3Evy=p{5_P%cftcridRh{V8|)=FrT$bZ^Bwm}{lzHuU(1lA69&DWXhRGa*k z0}8vEAnK`+j3SGaT%$e4WtWwBVnfIS`~j7CJg|6Ja-^b#;#4`m9gz?Lmi1Hej!K@^ z+djR98u@zvlMhccue7E0E+1j72Wz%9F(vW4fE%O$SrdVSN_r2o&^sA&i@1o}1c=F^ zJb#Tf?H-yC(=NL$XoDor8^T_rpbSak`7Y_m2kWMzBes+Cfbf)b!jm_*xsN5^8)K^M)51d~o=fyMJX~nJ$mjd&~IaN*zF|Y82fY9EuC~;G8nW zyPZd$Iu!!YR*Kobr(g2X2{FN_L$e|>F)lZ8{2d;0am8t!rMeSw%*K+Yg?}oTCk$p2 zB*CIgwHH~|QJ2TI*Sd`(zGoH|A5$CYQG*{e#i*~7+j2{Crhb>==nTt66KFn_?N27AhN z({XO2?ME<0 z0u#zhXe%^{G^;+i#Wo;~-@9&{e#9kHV(%oQ8tkjUgg!A;3i@w8y(7 zLAI}AmDXi!@@2;$qQ2U2lYdN{xk;|^Vu?3uai^)ld5>R5oIf25Xqbjsab>7fL9Wc* z<2X9CC$?)!#g*Yu1w_D!z8tf6BS$W$2C@r!&z9U>is%s WA3y)*`tyX@6GEW9>Hh=1HEL true, rng=123);\nX = Float32.(X); # To be compatible with type of network network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Instantiating-the-model","page":"Hyperparameter Tuning","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Now let's construct our model. 
This follows a similar setup the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=10,\n rng=42,\n)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Hyperparameter-Tuning-Example","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning Example","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Let's tune the batch size and the learning rate. We will use grid search and 5-fold cross-validation.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"We start by defining the hyperparameter ranges","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"r1 = range(clf, :batch_size, lower=1, upper=64)\netas = [10^x for x in range(-4, stop=0, length=4)]\noptimisers = [Optimisers.Adam(eta) for eta in etas]\nr2 = range(clf, :optimiser, values=optimisers)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Then passing the ranges along with the model and other arguments to the TunedModel constructor.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"tuned_model = TunedModel(\n model=clf,\n tuning=Grid(goal=25),\n resampling=CV(nfolds=5, rng=42),\n range=[r1, r2],\n measure=cross_entropy,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Then wrapping our tuned model in a machine and fitting it.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"mach = machine(tuned_model, X, y);\nfit!(mach, verbosity=0);\nnothing #hide","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Let's check out the best performing model:","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"fitted_params(mach).best_model","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Learning-Curves","page":"Hyperparameter Tuning","title":"Learning Curves","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"With learning curves, it's possible to center our focus on the effects of a single hyperparameter of the model","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"First define the range and wrap it in a learning 
curve","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"r = range(clf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(\n clf,\n X,\n y,\n range=r,\n resampling=CV(nfolds=4, rng=42),\n measure=cross_entropy,\n)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Then plot the curve","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"plot(\n curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross Entropy\",\n)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"This page was generated using Literate.jl.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/comparison/notebook/#Model-Comparison-with-MLJFlux","page":"Model Comparison","title":"Model Comparison with MLJFlux","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"In this workflow example, we see how we can compare different machine learning models with a neural network from MLJFlux.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/comparison/notebook/#Basic-Imports","page":"Model Comparison","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nusing DataFrames # To visualize hyperparameter search results\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/comparison/notebook/#Loading-and-Splitting-the-Data","page":"Model Comparison","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng=123);\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/#Instantiating-the-models-Now-let's-construct-our-model.-This-follows-a-similar-setup","page":"Model Comparison","title":"Instantiating the models Now let's construct our model. 
This follows a similar setup","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nclf1 = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42\n )","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Let's as well load and construct three other classical machine learning models:","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"BayesianLDA = @load BayesianLDA pkg=MultivariateStats\nclf2 = BayesianLDA()\nRandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree\nclf3 = RandomForestClassifier()\nXGBoostClassifier = @load XGBoostClassifier pkg=XGBoost\nclf4 = XGBoostClassifier();\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/#Wrapping-One-of-the-Models-in-a-TunedModel","page":"Model Comparison","title":"Wrapping One of the Models in a TunedModel","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Instead of just comparing with four models with the default/given hyperparameters, we will give XGBoostClassifier an unfair advantage By wrapping it in a TunedModel that considers the best learning rate η for the model.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"r1 = range(clf4, :eta, lower=0.01, upper=0.5, scale=:log10)\ntuned_model_xg = TunedModel(\n model=clf4,\n ranges=[r1],\n tuning=Grid(resolution=10),\n resampling=CV(nfolds=5, rng=42),\n measure=cross_entropy,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Of course, one can wrap each of the four in a TunedModel if they are interested in comparing the models over a large set of their hyperparameters.","category":"page"},{"location":"common_workflows/comparison/notebook/#Comparing-the-models","page":"Model Comparison","title":"Comparing the models","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"We simply pass the four models to the models argument of the TunedModel construct","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"tuned_model = TunedModel(\n models=[clf1, clf2, clf3, tuned_model_xg],\n tuning=Explicit(),\n resampling=CV(nfolds=5, rng=42),\n measure=cross_entropy,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Then wrapping our tuned model in a machine and fitting it.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"mach = machine(tuned_model, X, y);\nfit!(mach, verbosity=0);\nnothing 
#hide","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Now let's see the history for more details on the performance for each of the models","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"history = report(mach).history\nhistory_df = DataFrame(\n mlp = [x[:model] for x in history],\n measurement = [x[:measurement][1] for x in history],\n)\nsort!(history_df, [order(:measurement)])","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"This is Occam's razor in practice.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"This page was generated using Literate.jl.","category":"page"},{"location":"contributing/#Adding-new-models-to-MLJFlux","page":"Contributing","title":"Adding new models to MLJFlux","text":"","category":"section"},{"location":"contributing/","page":"Contributing","title":"Contributing","text":"This section assumes familiarity with the MLJ model API","category":"page"},{"location":"contributing/","page":"Contributing","title":"Contributing","text":"If one subtypes a new model type as either MLJFlux.MLJFluxProbabilistic or MLJFlux.MLJFluxDeterministic, then instead of defining new methods for MLJModelInterface.fit and MLJModelInterface.update one can make use of fallbacks by implementing the lower level methods shape, build, and fitresult. See the classifier source code for an example.","category":"page"},{"location":"contributing/","page":"Contributing","title":"Contributing","text":"One still needs to implement a new predict method.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"extended_examples/spam_detection/notebook/#SMS-Spam-Detection-with-RNNs","page":"Spam Detection with RNNs","title":"SMS Spam Detection with RNNs","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"In this demo we use a custom RNN model from Flux with MLJFlux to classify text messages as spam or ham. We will be using the SMS Collection Dataset from Kaggle.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Warning. This demo includes some non-idiomatic use of MLJ to allow use of the Flux.jl Embedding layer. 
It is not recommended for MLJ beginners.","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Basic-Imports","page":"Spam Detection with RNNs","title":"Basic Imports","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"using MLJ\nusing MLJFlux\nusing Flux\nimport Optimisers # Flux.jl native optimisers no longer supported\nusing CSV # Read data\nusing DataFrames # Read data\nusing WordTokenizers # For tokenization\nusing Languages # For stop words","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Reading-Data","page":"Spam Detection with RNNs","title":"Reading Data","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"We assume the SMS Collection Dataset has been downloaded and is in a file called \"sms.csv\" in the same directory as the this script.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"df = CSV.read(joinpath(@__DIR__, \"sms.csv\"), DataFrame);\nnothing #hide","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Display the first 5 rows with DataFrames","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"first(df, 5)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Text-Preprocessing","page":"Spam Detection with RNNs","title":"Text Preprocessing","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Let's define a function that given an SMS message would:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Tokenize it (i.e., convert it into a vector of words)\nRemove stop words (i.e., words that are not useful for the analysis, like \"the\", \"a\", etc.)\nReturn the filtered vector of words","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"const STOP_WORDS = Languages.stopwords(Languages.English())\n\nfunction preprocess_text(text)\n # (1) Splitting texts into words (so later it can be a sequence of vectors)\n tokens = WordTokenizers.tokenize(text)\n\n # (2) Stop word removal\n filtered_tokens = filter(token -> !(token in STOP_WORDS), tokens)\n\n return filtered_tokens\nend","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Define the vocabulary to be the set of all words in our training set. We also need a function that would map each word in a given sequence of words into its index in the dictionary (which is equivalent to representing the words as one-hot vectors).","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Now after we do this the sequences will all be numerical vectors but they will be of unequal length. 
Thus, to facilitate batching of data for the deep learning model, we need to decide on a specific maximum length for all sequences and:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"If a sequence is longer than the maximum length, we need to truncate it\nIf a sequence is shorter than the maximum length, we need to pad it with a new token","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Lastly, we must also handle the case that an incoming text sequence may involve words never seen in training by represent all such out-of-vocabulary words with a new token.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"We will define a function that would do this for us.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"function encode_and_equalize(text_seq, vocab_dict, max_length, pad_val, oov_val)\n # (1) encode using the vocabulary\n text_seq_inds = [get(vocab_dict, word, oov_val) for word in text_seq]\n\n # (2) truncate sequence if > max_length\n length(text_seq_inds) > max_length && (text_seq_inds = text_seq_inds[1:max_length])\n\n # (3) pad with pad_val\n text_seq_inds = vcat(text_seq_inds, fill(pad_val, max_length - length(text_seq_inds)))\n\n return text_seq_inds\nend","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Preparing-Data","page":"Spam Detection with RNNs","title":"Preparing Data","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Splitting the data","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_data, y_data = unpack(df, ==(:Message), ==(:Category))\ny_data = coerce(y_data, Multiclass);\n\n(x_train, x_val), (y_train, y_val) = partition(\n (x_data, y_data),\n 0.8,\n multi = true,\n shuffle = true,\n rng = 42,\n);\nnothing #hide","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Now let's process the training and validation sets:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_train_processed = [preprocess_text(text) for text in x_train]\nx_val_processed = [preprocess_text(text) for text in x_val];\nnothing #hide","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"sanity check","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"println(x_train_processed[1], \" is \", y_data[1])","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Define the vocabulary from the training data","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with 
RNNs","text":"vocab = unique(vcat(x_train_processed...))\nvocab_dict = Dict(word => idx for (idx, word) in enumerate(vocab))\nvocab_size = length(vocab)\npad_val, oov_val = vocab_size + 1, vocab_size + 2\nmax_length = 12 # can choose this more smartly if you wish","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Encode and equalize training and validation data:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_train_processed_equalized = [\n encode_and_equalize(seq, vocab_dict, max_length, pad_val, oov_val) for\n seq in x_train_processed\n ]\nx_val_processed_equalized = [\n encode_and_equalize(seq, vocab_dict, max_length, pad_val, oov_val) for\n seq in x_val_processed\n ]\nx_train_processed_equalized[1:5] # all sequences are encoded and of the same length","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Convert both structures into matrix form:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"matrixify(v) = reduce(hcat, v)'\nx_train_processed_equalized_fixed = matrixify(x_train_processed_equalized)\nx_val_processed_equalized_fixed = matrixify(x_val_processed_equalized)\nsize(x_train_processed_equalized_fixed)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Instantiate-Model","page":"Spam Detection with RNNs","title":"Instantiate Model","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"For the model, we will use a RNN from Flux. 
We will average the hidden states corresponding to any sequence then pass that to a dense layer for classification.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"For this, we need to define a custom Flux layer to perform the averaging operation:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"struct Mean end\nFlux.@layer Mean\n(m::Mean)(x) = mean(x, dims = 2)[:, 1, :] # [batch_size, seq_len, hidden_dim] => [batch_size, 1, hidden_dim]=> [batch_size, hidden_dim]","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"For compatibility, we will also define a layer that simply casts the input to integers as the embedding layer in Flux expects integers but the MLJFlux model expects floats:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"struct Intify end\nFlux.@layer Intify\n(m::Intify)(x) = Int.(x)","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Here we define our network:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"builder = MLJFlux.@builder begin\n Chain(\n Intify(), # Cast input to integer\n Embedding(vocab_size + 2 => 300), # Embedding layer\n RNN(300, 50, tanh), # RNN layer\n Mean(), # Mean pooling layer\n Dense(50, 2), # Classification dense layer\n )\nend","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Notice that we used an embedding layer with input dimensionality vocab_size + 2 to take into account the padding and out-of-vocabulary tokens. 
Recall that the indices in our input correspond to one-hot-vectors and the embedding layer's purpose is to learn to map them into meaningful dense vectors (of dimensionality 300 here).","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Load and instantiate model","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg = MLJFlux\nclf = NeuralNetworkClassifier(\n builder = builder,\n optimiser = Optimisers.Adam(0.1),\n batch_size = 128,\n epochs = 10,\n)","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Wrap it in a machine","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_train_processed_equalized_fixed = coerce(x_train_processed_equalized_fixed, Continuous)\nmach = machine(clf, x_train_processed_equalized_fixed, y_train)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Train-the-Model","page":"Spam Detection with RNNs","title":"Train the Model","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"fit!(mach)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Evaluate-the-Model","page":"Spam Detection with RNNs","title":"Evaluate the Model","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"ŷ = predict_mode(mach, x_val_processed_equalized_fixed)\nbalanced_accuracy(ŷ, y_val)","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Acceptable performance. 
Let's see some live examples:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"using Random: Random;\nRandom.seed!(99);\n\nz = rand(x_val)\nz_processed = preprocess_text(z)\nz_encoded_equalized =\n encode_and_equalize(z_processed, vocab_dict, max_length, pad_val, oov_val)\nz_encoded_equalized_fixed = matrixify([z_encoded_equalized])\nz_encoded_equalized_fixed = coerce(z_encoded_equalized_fixed, Continuous)\nz_pred = predict_mode(mach, z_encoded_equalized_fixed)\n\nprint(\"SMS: `$(z)` and the prediction is `$(z_pred)`\")","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"This page was generated using Literate.jl.","category":"page"},{"location":"common_workflows/composition/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/composition/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/composition/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/composition/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Incremental-Training-with-MLJFlux","page":"Incremental Training","title":"Incremental Training with MLJFlux","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"In this workflow example we explore how to incrementally train MLJFlux models.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Julia version is assumed to be 1.10.* This tutorial is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Basic-Imports","page":"Incremental Training","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Loading-and-Splitting-the-Data","page":"Incremental Training","title":"Loading and Splitting the 
Data","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng=123);\nX = Float32.(X) # To be compatible with type of network network parameters\n(X_train, X_test), (y_train, y_test) = partition(\n (X, y), 0.8,\n multi = true,\n shuffle = true,\n rng=42,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Instantiating-the-model","page":"Incremental Training","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Now let's construct our model. This follows a similar setup to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=10,\n rng=42,\n)","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Initial-round-of-training","page":"Incremental Training","title":"Initial round of training","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Now let's train the model. Calling fit! will automatically train it for 100 epochs as specified above.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"mach = machine(clf, X_train, y_train)\nfit!(mach)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Let's evaluate the training loss and validation accuracy","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"training_loss = cross_entropy(predict(mach, X_train), y_train)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"val_acc = accuracy(predict_mode(mach, X_test), y_test)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Poor performance it seems.","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Incremental-Training","page":"Incremental Training","title":"Incremental Training","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Now let's train it for another 30 epochs at half the original learning rate. All we need to do is changes these hyperparameters and call fit again. 
It won't reset the model parameters before training.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"clf.optimiser = Optimisers.Adam(clf.optimiser.eta/2)\nclf.epochs = clf.epochs + 30\nfit!(mach, verbosity=2);\nnothing #hide","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Let's evaluate the training loss and validation accuracy","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"training_loss = cross_entropy(predict(mach, X_train), y_train)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"val_acc = accuracy(predict_mode(mach, X_test), y_test)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"That's much better. If we would rather reset the model parameters before fitting, we can do fit!(mach, force=true).","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"This page was generated using Literate.jl.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"extended_examples/MNIST/notebook/#Using-MLJ-to-classifiy-the-MNIST-image-dataset","page":"MNIST Images","title":"Using MLJ to classifiy the MNIST image dataset","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"This tutorial is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"using MLJ\nusing Flux\nimport MLJFlux\nimport MLUtils\nimport MLJIteration # for `skip`","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"If running on a GPU, you will also need to import CUDA and import cuDNN.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"using Plots\ngr(size=(600, 300*(sqrt(5)-1)));\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/#Basic-training","page":"MNIST Images","title":"Basic training","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Downloading the MNIST image dataset:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"import MLDatasets: MNIST\n\nENV[\"DATADEPS_ALWAYS_ACCEPT\"] = true\nimages, labels = MNIST(split=:train)[:];\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"In MLJ, 
integers cannot be used for encoding categorical data, so we must force the labels to have the Multiclass scientific type. For more on this, see Working with Categorical Data.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"labels = coerce(labels, Multiclass);\nimages = coerce(images, GrayImage);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Checking scientific types:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"@assert scitype(images) <: AbstractVector{<:Image}\n@assert scitype(labels) <: AbstractVector{<:Finite}","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Looks good.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"For general instructions on coercing image data, see Type coercion for image data","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"images[1]","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"We start by defining a suitable Builder object. This is a recipe for building the neural network. Our builder will work for images of any (constant) size, whether they be color or black and white (ie, single or multi-channel). The architecture always consists of six alternating convolution and max-pool layers, and a final dense layer; the filter size and the number of channels after each convolution layer is customisable.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"import MLJFlux\nstruct MyConvBuilder\n filter_size::Int\n channels1::Int\n channels2::Int\n channels3::Int\nend\n\nfunction MLJFlux.build(b::MyConvBuilder, rng, n_in, n_out, n_channels)\n k, c1, c2, c3 = b.filter_size, b.channels1, b.channels2, b.channels3\n mod(k, 2) == 1 || error(\"`filter_size` must be odd. \")\n p = div(k - 1, 2) # padding to preserve image size\n init = Flux.glorot_uniform(rng)\n front = Chain(\n Conv((k, k), n_channels => c1, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c1 => c2, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c2 => c3, pad=(p, p), relu, init=init),\n MaxPool((2 ,2)),\n MLUtils.flatten)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n return Chain(front, Dense(d, n_out, init=init))\nend","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Notes.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"There is no final softmax here, as this is applied by default in all MLJFLux classifiers. 
Customisation of this behaviour is controlled using the finaliser hyperparameter of the classifier.\nInstead of calculating the padding p, Flux can infer the required padding in each dimension, which you enable by replacing pad = (p, p) with pad = SamePad().","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"We now define the MLJ model.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"ImageClassifier = @load ImageClassifier\nclf = ImageClassifier(\n builder=MyConvBuilder(3, 16, 32, 32),\n batch_size=50,\n epochs=10,\n rng=123,\n)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"You can add Flux options optimiser=... and loss=... in the above constructor call. At present, loss must be a Flux-compatible loss, not an MLJ measure. To run on a GPU, add to the constructor acceleration=CUDALibs() and omit rng.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"For illustration purposes, we won't use all the data here:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"train = 1:500\ntest = 501:1000","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Binding the model with data in an MLJ machine:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"mach = machine(clf, images, labels);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Training for 10 epochs on the first 500 images:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"fit!(mach, rows=train, verbosity=2);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Inspecting:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"report(mach)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"chain = fitted_params(mach)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Flux.params(chain)[2]","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Adding 20 more epochs:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"clf.epochs = clf.epochs + 20\nfit!(mach, rows=train);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Computing an out-of-sample estimate of the loss:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"predicted_labels = predict(mach, rows=test);\ncross_entropy(predicted_labels, labels[test])","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Or to fit and predict, in one 
line:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"evaluate!(mach,\n resampling=Holdout(fraction_train=0.5),\n measure=cross_entropy,\n rows=1:1000,\n verbosity=0)","category":"page"},{"location":"extended_examples/MNIST/notebook/#Wrapping-the-MLJFlux-model-with-iteration-controls","page":"MNIST Images","title":"Wrapping the MLJFlux model with iteration controls","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Any iterative MLJFlux model can be wrapped in iteration controls, as we demonstrate next. For more on MLJ's IteratedModel wrapper, see the MLJ documentation.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"The \"self-iterating\" classifier, called iterated_clf below, is for iterating the image classifier defined above until one of the following stopping criterion apply:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Patience(3): 3 consecutive increases in the loss\nInvalidValue(): an out-of-sample loss, or a training loss, is NaN, Inf, or -Inf\nTimeLimit(t=5/60): training time has exceeded 5 minutes","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"These checks (and other controls) will be applied every two epochs (because of the Step(2) control). Additionally, training a machine bound to iterated_clf will:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"save a snapshot of the machine every three control cycles (every six epochs)\nrecord traces of the out-of-sample loss and training losses for plotting\nrecord mean value traces of each Flux parameter for plotting","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"For a complete list of controls, see this table.","category":"page"},{"location":"extended_examples/MNIST/notebook/#Wrapping-the-classifier","page":"MNIST Images","title":"Wrapping the classifier","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Some helpers","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"To extract Flux params from an MLJFlux machine","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"parameters(mach) = vec.(Flux.params(fitted_params(mach)));\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"To store the traces:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"losses = []\ntraining_losses = []\nparameter_means = Float32[];\nepochs = []","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"To update the traces:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"update_loss(loss) = push!(losses, loss)\nupdate_training_loss(losses) = push!(training_losses, losses[end])\nupdate_means(mach) = append!(parameter_means, 
mean.(parameters(mach)));\nupdate_epochs(epoch) = push!(epochs, epoch)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"The controls to apply:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"save_control =\n MLJIteration.skip(Save(joinpath(tempdir(), \"mnist.jls\")), predicate=3)\n\ncontrols=[\n Step(2),\n Patience(3),\n InvalidValue(),\n TimeLimit(5/60),\n save_control,\n WithLossDo(),\n WithLossDo(update_loss),\n WithTrainingLossesDo(update_training_loss),\n Callback(update_means),\n WithIterationsDo(update_epochs),\n];\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"The \"self-iterating\" classifier:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"iterated_clf = IteratedModel(\n clf,\n controls=controls,\n resampling=Holdout(fraction_train=0.7),\n measure=log_loss,\n)","category":"page"},{"location":"extended_examples/MNIST/notebook/#Binding-the-wrapped-model-to-data:","page":"MNIST Images","title":"Binding the wrapped model to data:","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"mach = machine(iterated_clf, images, labels);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/#Training","page":"MNIST Images","title":"Training","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"fit!(mach, rows=train);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/#Comparison-of-the-training-and-out-of-sample-losses:","page":"MNIST Images","title":"Comparison of the training and out-of-sample losses:","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"plot(\n epochs,\n losses,\n xlab = \"epoch\",\n ylab = \"cross entropy\",\n label=\"out-of-sample\",\n)\nplot!(epochs, training_losses, label=\"training\")\n\nsavefig(joinpath(tempdir(), \"loss.png\"))","category":"page"},{"location":"extended_examples/MNIST/notebook/#Evolution-of-weights","page":"MNIST Images","title":"Evolution of weights","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"n_epochs = length(losses)\nn_parameters = div(length(parameter_means), n_epochs)\nparameter_means2 = reshape(copy(parameter_means), n_parameters, n_epochs)'\nplot(\n epochs,\n parameter_means2,\n title=\"Flux parameter mean weights\",\n xlab = \"epoch\",\n)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Note. 
The higher the number in the plot legend, the deeper the layer we are **weight-averaging.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"savefig(joinpath(tempdir(), \"weights.png\"))","category":"page"},{"location":"extended_examples/MNIST/notebook/#Retrieving-a-snapshot-for-a-prediction:","page":"MNIST Images","title":"Retrieving a snapshot for a prediction:","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"mach2 = machine(joinpath(tempdir(), \"mnist3.jls\"))\npredict_mode(mach2, images[501:503])","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"3-element CategoricalArrays.CategoricalArray{Int64,1,UInt32}:\n 7\n 9\n 5","category":"page"},{"location":"extended_examples/MNIST/notebook/#Restarting-training","page":"MNIST Images","title":"Restarting training","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Mutating iterated_clf.controls or clf.epochs (which is otherwise ignored) will allow you to restart training from where it left off.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"iterated_clf.controls[2] = Patience(4)\nfit!(mach, rows=train)\n\nplot(\n epochs,\n losses,\n xlab = \"epoch\",\n ylab = \"cross entropy\",\n label=\"out-of-sample\",\n)\nplot!(epochs, training_losses, label=\"training\")","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"This page was generated using Literate.jl.","category":"page"},{"location":"interface/Multitarget Regression/","page":"Multi-Target Regression","title":"Multi-Target Regression","text":"MLJFlux.MultitargetNeuralNetworkRegressor","category":"page"},{"location":"interface/Multitarget Regression/#MLJFlux.MultitargetNeuralNetworkRegressor","page":"Multi-Target Regression","title":"MLJFlux.MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor\n\nA model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nMultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux\n\nDo model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).\n\nMultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). 
If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.\n\nHyper-parameters\n\nbuilder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete the training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increassing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. 
The first element is the pre-training loss.\n\nExamples\n\nIn this example we apply a multi-target regression model to synthetic data:\n\nusing MLJ\nimport MLJFlux\nusing Flux\nimport Optimisers\n\nFirst, we generate some synthetic data (needs MLJBase 0.20.16 or higher):\n\nX, y = make_regression(100, 9; n_targets = 2) # both tables\nschema(y)\nschema(X)\n\nSplitting off a test set:\n\n(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);\n\nNext, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number input features and n_out the number of target variables (both known at fit! time), while rng is a proxy for a RNG (which will be passed from the rng field of model defined below).\n\nbuilder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend\n\nInstantiating the regression model:\n\nMultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor\nmodel = MultitargetNeuralNetworkRegressor(builder=builder, rng=123, epochs=20)\n\nWe will arrange for standardization of the the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:\n\npipe = Standardizer |> TransformedTargetModel(model, transformer=Standardizer)\n\nIf we fit with a high verbosity (>1), we will see the losses during training. We can also see the losses in the output of report(mach)\n\nmach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n# first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses\n\nFor experimenting with learning rate, see the NeuralNetworkRegressor example.\n\npipe.transformed_target_model_deterministic.model.optimiser = Optimisers.Adam(0.0001)\n\nWith the learning rate fixed, we can now compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:\n\n# custom MLJ loss:\nmulti_loss(yhat, y) = l2(MLJ.matrix(yhat), MLJ.matrix(y))\n\n# CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=multi_loss)\n\n# loss for `(Xtest, test)`:\nfit!(mach) # trains on all data `(X, y)`\nyhat = predict(mach, Xtest)\nmulti_loss(yhat, ytest)\n\nSee also NeuralNetworkRegressor\n\n\n\n\n\n","category":"type"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Early-Stopping-with-MLJFlux","page":"Early Stopping","title":"Early Stopping with MLJFlux","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"In this workflow example, we learn how MLJFlux enables us to easily use early stopping when training MLJFlux models.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Basic-Imports","page":"Early Stopping","title":"Basic 
Imports","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nusing Plots # To visualize training\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Loading-and-Splitting-the-Data","page":"Early Stopping","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng=123);\nX = Float32.(X); # To be compatible with type of network network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Instantiating-the-model-Now-let's-construct-our-model.-This-follows-a-similar-setup","page":"Early Stopping","title":"Instantiating the model Now let's construct our model. This follows a similar setup","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42,\n)","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Wrapping-it-in-an-IteratedModel","page":"Early Stopping","title":"Wrapping it in an IteratedModel","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"Let's start by defining the condition that can cause the model to early stop.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"stop_conditions = [\n Step(1), # Repeatedly train for one iteration\n NumberLimit(100), # Don't train for more than 100 iterations\n Patience(5), # Stop after 5 iterations of disimprovement in validation loss\n NumberSinceBest(9), # Or if the best loss occurred 9 iterations ago\n TimeLimit(30/60), # Or if 30 minutes passed\n]","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"We can also define callbacks. 
Here we want to store the validation loss for each iteration","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"validation_losses = []\ncallbacks = [\n WithLossDo(loss->push!(validation_losses, loss)),\n]","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"Construct the iterated model and pass to it the stop_conditions and the callbacks:","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"iterated_model = IteratedModel(\n model=clf,\n resampling=Holdout(fraction_train=0.7); # loss and stopping are based on out-of-sample\n measures=log_loss,\n iteration_parameter=:(epochs),\n controls=vcat(stop_conditions, callbacks),\n retrain=false # no need to retrain on all data at the end\n);\nnothing #hide","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"You can see more advanced stopping conditions as well as how to involve callbacks in the documentation","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Training-with-Early-Stopping","page":"Early Stopping","title":"Training with Early Stopping","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"At this point, all we need is to fit the model and iteration controls will be automatically handled","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"mach = machine(iterated_model, X, y)\nfit!(mach)\n# We can get the training losses like so\ntraining_losses = report(mach)[:model_report].training_losses;\nnothing #hide","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Results","page":"Early Stopping","title":"Results","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"We can see that the model converged after 100 iterations.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"plot(training_losses, label=\"Training Loss\", linewidth=2)\nplot!(validation_losses, label=\"Validation Loss\", linewidth=2, size=(800,400))","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"This page was generated using Literate.jl.","category":"page"},{"location":"common_workflows/early_stopping/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/early_stopping/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 
4th","category":"page"},{"location":"common_workflows/early_stopping/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/early_stopping/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"extended_examples/spam_detection/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"extended_examples/spam_detection/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"extended_examples/spam_detection/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"extended_examples/spam_detection/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/live_training/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/live_training/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/live_training/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/live_training/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/hyperparameter_tuning/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"extended_examples/MNIST/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"extended_examples/MNIST/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter 
notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"extended_examples/MNIST/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"extended_examples/MNIST/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/composition/notebook/#Model-Composition-with-MLJFlux","page":"Model Composition","title":"Model Composition with MLJFlux","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"This tutorial is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"In this workflow example, we see how MLJFlux enables composing MLJ models with MLJFlux models. We will assume a class imbalance setting and wrap an oversampler with a deep learning model from MLJFlux.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/composition/notebook/#Basic-Imports","page":"Model Composition","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nimport Random # To create imbalance\nimport Imbalance # To solve the imbalance\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/composition/notebook/#Loading-and-Splitting-the-Data","page":"Model Composition","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng=123);\nX = Float32.(X); # To be compatible with type of network network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"To simulate an imbalanced dataset, we will take a random sample:","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Random.seed!(803429)\nsubset_indices = rand(1:size(X, 1), 100)\nX, y = X[subset_indices, :], y[subset_indices]\nImbalance.checkbalance(y)","category":"page"},{"location":"common_workflows/composition/notebook/#Instantiating-the-model","page":"Model Composition","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Let's load BorderlineSMOTE1 to oversample the data and Standardizer to 
standardize it.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance verbosity=0\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"We didn't need to load Standardizer because it is a local model for MLJ (see localmodels())","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"clf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42,\n)","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"First we wrap the oversampler with the neural network via the BalancedModel construct. This comes from MLJBalancing And allows combining resampling methods with MLJ models in a sequential pipeline.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"oversampler = BorderlineSMOTE1(k=5, ratios=1.0, rng=42)\nbalanced_model = BalancedModel(model=clf, balancer1=oversampler)\nstandarizer = Standardizer()","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Now let's compose the balanced model with a standardizer.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"pipeline = standarizer |> balanced_model","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"By this, any training data will be standardized then oversampled then passed to the model. 
Meanwhile, for inference, the standardizer will automatically use the training set's mean and std and the oversampler will be transparent.","category":"page"},{"location":"common_workflows/composition/notebook/#Training-the-Composed-Model","page":"Model Composition","title":"Training the Composed Model","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"It's indistinguishable from training a single model.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"mach = machine(pipeline, X, y)\nfit!(mach)\ncv=CV(nfolds=5)\nevaluate!(mach, resampling=cv, measure=accuracy)","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"This page was generated using Literate.jl.","category":"page"},{"location":"interface/Regression/","page":"Regression","title":"Regression","text":"MLJFlux.NeuralNetworkRegressor","category":"page"},{"location":"interface/Regression/#MLJFlux.NeuralNetworkRegressor","page":"Regression","title":"MLJFlux.NeuralNetworkRegressor","text":"NeuralNetworkRegressor\n\nA model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nNeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux\n\nDo model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).\n\nNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. 
For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete the training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we build a regression model for the Boston house price dataset.\n\nusing MLJ\nimport MLJFlux\nusing Flux\nimport Optimisers\n\nFirst, we load in the data: The :MEDV column becomes the target vector y, and all remaining columns go into a table X, with the exception of :CHAS:\n\ndata = OpenML.load(531); # Loads from https://www.openml.org/d/531\ny, X = unpack(data, ==(:MEDV), !=(:CHAS); rng=123);\n\nscitype(y)\nschema(X)\n\nSince MLJFlux models do not handle ordered factors, we'll treat :RAD as Continuous:\n\nX = coerce(X, :RAD=>Continuous)\n\nSplitting off a test set:\n\n(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);\n\nNext, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number input features (which will be known at fit! time) and rng is a proxy for a RNG (which will be passed from the rng field of model defined below). We also have the parameter n_out which is the number of output features. 
As we are doing single target regression, the value passed will always be 1, but the builder we define will also work for MultitargetNeuralNetworkRegressor.\n\nbuilder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend\n\nInstantiating a model:\n\nNeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux\nmodel = NeuralNetworkRegressor(\n builder=builder,\n rng=123,\n epochs=20\n)\n\nWe arrange for standardization of the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:\n\npipe = Standardizer |> TransformedTargetModel(model, transformer=Standardizer)\n\nIf we fit with a high verbosity (>1), we will see the losses during training. We can also see the losses in the output of report(mach).\n\nmach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n# first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses\n\nExperimenting with learning rate\n\nWe can visually compare how the learning rate affects the predictions:\n\nusing Plots\n\nrates = [5e-5, 1e-4, 0.005, 0.001, 0.05]\nplt=plot()\n\nforeach(rates) do η\n pipe.transformed_target_model_deterministic.model.optimiser = Optimisers.Adam(η)\n fit!(mach, force=true, verbosity=0)\n losses =\n report(mach).transformed_target_model_deterministic.model.training_losses[3:end]\n plot!(1:length(losses), losses, label=η)\nend\n\nplt\n\npipe.transformed_target_model_deterministic.model.optimiser = Optimisers.Adam(0.0001)\n\nWith the learning rate fixed, we compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:\n\n# CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=l2)\n\n# loss for `(Xtest, ytest)`:\nfit!(mach) # train on `(X, y)`\nyhat = predict(mach, Xtest)\nl2(yhat, ytest)\n\nThese losses, for the pipeline model, refer to the target on the original, unstandardized, scale.\n\nFor implementing stopping criteria and other iteration controls, refer to examples linked from the MLJFlux documentation.\n\nSee also MultitargetNeuralNetworkRegressor\n\n\n\n\n\n","category":"type"},{"location":"common_workflows/comparison/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/comparison/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Jupyter notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/comparison/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/comparison/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.Linear","category":"page"},{"location":"interface/Builders/#MLJFlux.Linear","page":"Builders","title":"MLJFlux.Linear","text":"Linear(; σ=Flux.relu)\n\nMLJFlux builder that constructs a fully 
connected two layer network with activation function σ. The number of input and output nodes is determined from the data. Weights are initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.\n\n\n\n\n\n","category":"type"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.Short","category":"page"},{"location":"interface/Builders/#MLJFlux.Short","page":"Builders","title":"MLJFlux.Short","text":"Short(; n_hidden=0, dropout=0.5, σ=Flux.sigmoid)\n\nMLJFlux builder that constructs a full-connected three-layer network using n_hidden nodes in the hidden layer and the specified dropout (defaulting to 0.5). An activation function σ is applied between the hidden and final layers. If n_hidden=0 (the default) then n_hidden is the geometric mean of the number of input and output nodes. The number of input and output nodes is determined from the data.\n\nEach layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.\n\n\n\n\n\n","category":"type"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.MLP","category":"page"},{"location":"interface/Builders/#MLJFlux.MLP","page":"Builders","title":"MLJFlux.MLP","text":"MLP(; hidden=(100,), σ=Flux.relu)\n\nMLJFlux builder that constructs a Multi-layer perceptron network. The ith element of hidden represents the number of neurons in the ith hidden layer. An activation function σ is applied between each layer.\n\nEach layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.\n\n\n\n\n\n","category":"type"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.@builder","category":"page"},{"location":"interface/Builders/#MLJFlux.@builder","page":"Builders","title":"MLJFlux.@builder","text":"@builder neural_net\n\nCreates a builder for neural_net. 
The variables rng, n_in, n_out and n_channels can be used to create builders for any random number generator rng, input and output sizes n_in and n_out and number of input channels n_channels.\n\nExamples\n\njulia> import MLJFlux: @builder;\n\njulia> nn = NeuralNetworkRegressor(builder = @builder(Chain(Dense(n_in, 64, relu),\n Dense(64, 32, relu),\n Dense(32, n_out))));\n\njulia> conv_builder = @builder begin\n front = Chain(Conv((3, 3), n_channels => 16), Flux.flatten)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n Chain(front, Dense(d, n_out));\n end\n\njulia> conv_nn = NeuralNetworkRegressor(builder = conv_builder);\n\n\n\n\n\n","category":"macro"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Neural-Architecture-Search-with-MLJFlux","page":"Neural Architecture Search","title":"Neural Architecture Search with MLJFlux","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Neural Architecture Search (NAS) is an instance of hyperparameter tuning concerned with tuning model hyperparameters defining the architecture itself. Although it's typically performed with sophisticated search algorithms for efficiency, in this example we will be using a simple random search.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Basic-Imports","page":"Neural Architecture Search","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nusing RDatasets: RDatasets # Dataset source\nusing DataFrames # To view tuning results in a table\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Loading-and-Splitting-the-Data","page":"Neural Architecture Search","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng = 123);\nX = Float32.(X); # To be compatible with type of network network parameters\nfirst(X, 5)","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Instantiating-the-model","page":"Neural Architecture Search","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Now let's construct our model. 
This follows a similar setup to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg = \"MLJFlux\"\nclf = NeuralNetworkClassifier(\n builder = MLJFlux.MLP(; hidden = (1, 1, 1), σ = Flux.relu),\n optimiser = Optimisers.Adam(0.01),\n batch_size = 8,\n epochs = 10,\n rng = 42,\n)","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Generating-Network-Architectures","page":"Neural Architecture Search","title":"Generating Network Architectures","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"We know that the MLP builder takes a tuple of the form (z_1, z_2, …, z_k) to define a network with k hidden layers, where the ith layer has z_i neurons. We will proceed by defining a function that can generate all possible networks with a specific number of hidden layers, a minimum and maximum number of neurons per layer and increments to consider for the number of neurons.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"function generate_networks(\n ;min_neurons::Int,\n max_neurons::Int,\n neuron_step::Int,\n num_layers::Int,\n )\n # Define the range of neurons\n neuron_range = min_neurons:neuron_step:max_neurons\n\n # Empty list to store the network configurations\n networks = Vector{Tuple{Vararg{Int, num_layers}}}()\n\n # Recursive helper function to generate all combinations of tuples\n function generate_tuple(current_layers, remaining_layers)\n if remaining_layers > 0\n for n in neuron_range\n # current_layers =[] then current_layers=[(min_neurons)],\n # [(min_neurons+neuron_step)], [(min_neurons+2*neuron_step)],...\n # for each of these we call generate_tuple again which appends\n # the n combinations for each one of them\n generate_tuple(vcat(current_layers, [n]), remaining_layers - 1)\n end\n else\n # in the base case, no more layers to \"recurse on\"\n # and we just append the current_layers as a tuple\n push!(networks, tuple(current_layers...))\n end\n end\n\n # Generate networks for the given number of layers\n generate_tuple([], num_layers)\n\n return networks\nend","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Now let's generate an array of all possible neural networks with three hidden layers and number of neurons per layer ∈ [1,64] with a step of 4.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"networks_space =\n generate_networks(\n min_neurons = 1,\n max_neurons = 64,\n neuron_step = 4,\n num_layers = 3,\n )\n\nnetworks_space[1:5]","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Wrapping-the-Model-for-Tuning","page":"Neural Architecture Search","title":"Wrapping the Model for Tuning","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Let's use this array to define the range of hyperparameters and pass it along with the model to the TunedModel 
constructor.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"r1 = range(clf, :(builder.hidden), values = networks_space)\n\ntuned_clf = TunedModel(\n model = clf,\n tuning = RandomSearch(),\n resampling = CV(nfolds = 4, rng = 42),\n range = [r1],\n measure = cross_entropy,\n n = 100, # searching over 100 random samples are enough\n);\nnothing #hide","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Performing-the-Search","page":"Neural Architecture Search","title":"Performing the Search","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Similar to the last workflow example, all we need now is to fit our model and the search will take place automatically:","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"mach = machine(tuned_clf, X, y);\nfit!(mach, verbosity = 0);\nfitted_params(mach).best_model","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Analyzing-the-Search-Results","page":"Neural Architecture Search","title":"Analyzing the Search Results","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Let's analyze the search results by converting the history array to a dataframe and viewing it:","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"history = report(mach).history\nhistory_df = DataFrame(\n mlp = [x[:model].builder for x in history],\n measurement = [x[:measurement][1] for x in history],\n)\nfirst(sort!(history_df, [order(:measurement)]), 10)","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"This page was generated using Literate.jl.","category":"page"},{"location":"interface/Image Classification/","page":"Image Classification","title":"Image Classification","text":"MLJFlux.ImageClassifier","category":"page"},{"location":"interface/Image Classification/#MLJFlux.ImageClassifier","page":"Image Classification","title":"MLJFlux.ImageClassifier","text":"ImageClassifier\n\nA model type for constructing a image classifier, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nImageClassifier = @load ImageClassifier pkg=MLJFlux\n\nDo model = ImageClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ImageClassifier(builder=...).\n\nImageClassifier classifies images using a neural network adapted to the type of images provided (color or gray scale). Predictions are probabilistic. Users provide a recipe for constructing the network, based on properties of the image encountered, by specifying an appropriate builder. 
See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is any AbstractVector of images with ColorImage or GrayImage scitype; check the scitype with scitype(X) and refer to ScientificTypes.jl documentation on coercing typical image formats into an appropriate type.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass; check the scitype with scitype(y).\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder: An MLJFlux builder that constructs the neural network. The fallback builds a depth-16 VGG architecture adapted to the image size and number of target classes, with no batch normalization; see the Metalhead.jl documentation for details. See the example below for a user-specified builder. A convenience macro @builder is also available. See also finaliser below.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently MLJ measures are not supported values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). 
Defaults to Flux.softmax.\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we use MLJFlux and a custom builder to classify the MNIST image dataset.\n\nusing MLJ\nusing Flux\nimport MLJFlux\nimport Optimisers\nimport MLJIteration # for `skip` control\n\nFirst we want to download the MNIST dataset, and unpack into images and labels:\n\nimport MLDatasets: MNIST\ndata = MNIST(split=:train)\nimages, labels = data.features, data.targets\n\nIn MLJ, integers cannot be used for encoding categorical data, so we must coerce them into the Multiclass scitype:\n\nlabels = coerce(labels, Multiclass);\n\nAbove images is a single array but MLJFlux requires the images to be a vector of individual image arrays:\n\nimages = coerce(images, GrayImage);\nimages[1]\n\nWe start by defining a suitable builder object. This is a recipe for building the neural network. Our builder will work for images of any (constant) size, whether they be color or black and white (ie, single or multi-channel). The architecture always consists of six alternating convolution and max-pool layers, and a final dense layer; the filter size and the number of channels after each convolution layer is customizable.\n\nimport MLJFlux\n\nstruct MyConvBuilder\n filter_size::Int\n channels1::Int\n channels2::Int\n channels3::Int\nend\n\nmake2d(x::AbstractArray) = reshape(x, :, size(x)[end])\n\nfunction MLJFlux.build(b::MyConvBuilder, rng, n_in, n_out, n_channels)\n k, c1, c2, c3 = b.filter_size, b.channels1, b.channels2, b.channels3\n mod(k, 2) == 1 || error(\"`filter_size` must be odd. \")\n p = div(k - 1, 2) # padding to preserve image size\n init = Flux.glorot_uniform(rng)\n front = Chain(\n Conv((k, k), n_channels => c1, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c1 => c2, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c2 => c3, pad=(p, p), relu, init=init),\n MaxPool((2 ,2)),\n make2d)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n return Chain(front, Dense(d, n_out, init=init))\nend\n\nIt is important to note that in our build function, there is no final softmax. This is applied by default in all MLJFlux classifiers (override this using the finaliser hyperparameter).\n\nNow that our builder is defined, we can instantiate the actual MLJFlux model. If you have a GPU, you can substitute in acceleration=CUDALibs() below to speed up training.\n\nImageClassifier = @load ImageClassifier pkg=MLJFlux\nclf = ImageClassifier(builder=MyConvBuilder(3, 16, 32, 32),\n batch_size=50,\n epochs=10,\n rng=123)\n\nYou can add Flux options such as optimiser and loss in the snippet above. 
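For example (a sketch only, not part of the original snippet, reusing the MyConvBuilder defined above), an explicit optimiser and loss might be supplied like this:

import Optimisers

clf = ImageClassifier(
    builder=MyConvBuilder(3, 16, 32, 32),
    optimiser=Optimisers.Adam(0.001),  # an Optimisers.jl rule
    loss=Flux.crossentropy,            # must be a Flux-compatible loss, not an MLJ measure
    batch_size=50,
    epochs=10,
    rng=123,
)
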
Currently, loss must be a flux-compatible loss, and not an MLJ measure.\n\nNext, we can bind the model with the data in a machine, and train using the first 500 images:\n\nmach = machine(clf, images, labels);\nfit!(mach, rows=1:500, verbosity=2);\nreport(mach)\nchain = fitted_params(mach)\nFlux.params(chain)[2]\n\nWe can tack on 20 more epochs by modifying the epochs field, and iteratively fit some more:\n\nclf.epochs = clf.epochs + 20\nfit!(mach, rows=1:500, verbosity=2);\n\nWe can also make predictions and calculate an out-of-sample loss estimate, using any MLJ measure (loss/score):\n\npredicted_labels = predict(mach, rows=501:1000);\ncross_entropy(predicted_labels, labels[501:1000])\n\nThe preceding fit!/predict/evaluate workflow can be alternatively executed as follows:\n\nevaluate!(mach,\n resampling=Holdout(fraction_train=0.5),\n measure=cross_entropy,\n rows=1:1000,\n verbosity=0)\n\nSee also NeuralNetworkClassifier.\n\n\n\n\n\n","category":"type"},{"location":"interface/Summary/#Models","page":"Summary","title":"Models","text":"","category":"section"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"MLJFlux provides the model types below, for use with input features X and targets y of the scientific type indicated in the table below. The parameters n_in, n_out and n_channels refer to information passed to the builder, as described under Defining Custom Builders.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Model Type Prediction type scitype(X) <: _ scitype(y) <: _\nNeuralNetworkRegressor Deterministic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns AbstractVector{<:Continuous) (n_out = 1)\nMultitargetNeuralNetworkRegressor Deterministic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns <: Table(Continuous) with n_out columns\nNeuralNetworkClassifier Probabilistic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns AbstractVector{<:Finite} with n_out classes\nNeuralNetworkBinaryClassifier Probabilistic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns AbstractVector{<:Finite{2}} (but n_out = 1)\nImageClassifier Probabilistic AbstractVector(<:Image{W,H}) with n_in = (W, H) AbstractVector{<:Finite} with n_out classes","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
What exactly is a \"model\"?","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"In MLJ a model is a mutable struct storing hyper-parameters for some learning algorithm indicated by the model name, and that's all. In particular, an MLJ model does not store learned parameters.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"warning: Difference in Definition\nIn Flux the term \"model\" has another meaning. However, as all Flux \"models\" used in MLJFlux are Flux.Chain objects, we call them chains, and restrict use of \"model\" to models in the MLJ sense.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
Are observations rows or columns?","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"In MLJ the convention for two-dimensional data (tables and matrices) is rows=observations. For matrices, Flux has the opposite convention. If your data is a matrix whose columns index the observations, the most efficient option is to present the adjoint or transpose of that matrix to MLJFlux models, as no copy of the data is then made. Otherwise, you can use the matrix as is, or convert it once with permutedims and present the adjoint or transpose of the result, which is again the optimal layout for MLJFlux training.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
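For instance (a minimal sketch; Xmat, model and y here are hypothetical and not from the documentation):

Xmat = rand(Float32, 4, 100)  # 4 features, 100 observations (columns = observations)
X = Xmat'                     # lazy adjoint: rows now index observations, as MLJ expects
mach = machine(model, X, y)   # no copy of the underlying data is made
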
","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Instructions for coercing common image formats into some AbstractVector{<:Image} are here.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
Fitting and warm restarts","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"MLJ machines cache state enabling the \"warm restart\" of model training, as demonstrated in the incremental training example. In the case of MLJFlux models, fit!(mach) will use a warm restart if:","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"only model.epochs has changed since the last call; or\nonly model.epochs or model.optimiser have changed since the last call and model.optimiser_changes_trigger_retraining == false (the default) (the \"state\" part of the optimiser is ignored in this comparison). This allows one to dynamically modify learning rates, for example.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Here model=mach.model is the associated MLJ model.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"The warm restart feature makes it possible to externally control iteration. See, for example, Early Stopping with MLJFlux and Using MLJ to classify the MNIST image dataset.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
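To illustrate (a sketch only, assuming mach = machine(clf, X, y) has already been fitted for some MLJFlux model clf, and that Optimisers has been imported):

clf.epochs = clf.epochs + 10            # only `epochs` has changed ...
fit!(mach)                              # ... so training warm-restarts: 10 more epochs
clf.optimiser = Optimisers.Adam(0.005)  # changing only the optimiser ...
fit!(mach)                              # ... also warm-restarts, provided optimiser_changes_trigger_retraining == false
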
","category":"page"},{"location":"interface/Summary/#Model-Hyperparameters.","page":"Summary","title":"Model Hyperparameters.","text":"","category":"section"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"All models share the following hyper-parameters. See individual model docstrings for a full list.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Hyper-parameter Description Default\nbuilder Default builder for models. MLJFlux.Linear(σ=Flux.relu) (regressors) or MLJFlux.Short(n_hidden=0, dropout=0.5, σ=Flux.σ) (classifiers)\noptimiser The optimiser to use for training. Optimiser.Adam()\nloss The loss function used for training. Flux.mse (regressors) and Flux.crossentropy (classifiers)\nn_epochs Number of epochs to train for. 10\nbatch_size The batch size for the data. 1\nlambda The regularization strength. Range = [0, ∞). 0\nalpha The L2/L1 mix of regularization. Range = [0, 1]. 0\nrng The random number generator (RNG) passed to builders, for weight initialization, for example. Can be any AbstractRNG or the seed (integer) for a Xoshirio that is reset on every cold restart of model (machine) training. GLOBAL_RNG\nacceleration Use CUDALibs() for training on GPU; default is CPU1(). CPU1()\noptimiser_changes_trigger_retraining True if fitting an associated machine should trigger retraining from scratch whenever the optimiser changes. false","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"The classifiers have an additional hyperparameter finaliser (default is Flux.softmax, or Flux.σ in the binary case) which is the operation applied to the unnormalized output of the final layer to obtain probabilities (outputs summing to one). It should return a vector of the same length as its input.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"note: Loss Functions\nCurrently, the loss function specified by loss=... is applied internally by Flux and needs to conform to the Flux API. You cannot, for example, supply one of MLJ's probabilistic loss functions, such as MLJ.cross_entropy to one of the classifier constructors.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"That said, you can only use MLJ loss functions or metrics in evaluation meta-algorithms (such as cross validation) and they will work even if the underlying model comes from MLJFlux.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
More on accelerated training with GPUs","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"As in the table, when instantiating a model for training on a GPU, specify acceleration=CUDALibs(), as in","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"using MLJ\nImageClassifier = @load ImageClassifier\nmodel = ImageClassifier(epochs=10, acceleration=CUDALibs())\nmach = machine(model, X, y) |> fit!","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"In this example, the data X, y is copied onto the GPU under the hood on the call to fit! and cached for use in any warm restart (see above). The Flux chain used in training is always copied back to the CPU at the conclusion of fit!, and made available as fitted_params(mach).","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
","category":"page"},{"location":"interface/Summary/#Builders","page":"Summary","title":"Builders","text":"","category":"section"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Builder Description\nMLJFlux.MLP(hidden=(10,)) General multi-layer perceptron\nMLJFlux.Short(n_hidden=0, dropout=0.5, σ=sigmoid) Fully connected network with one hidden layer and dropout\nMLJFlux.Linear(σ=relu) Vanilla linear network with no hidden layers and activation function σ\nMLJFlux.@builder Macro for customized builders\n ","category":"page"},{"location":"common_workflows/incremental_training/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/incremental_training/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/incremental_training/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/incremental_training/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"interface/Classification/","page":"Classification","title":"Classification","text":"MLJFlux.NeuralNetworkClassifier\nMLJFlux.NeuralNetworkBinaryClassifier","category":"page"},{"location":"interface/Classification/#MLJFlux.NeuralNetworkClassifier","page":"Classification","title":"MLJFlux.NeuralNetworkClassifier","text":"NeuralNetworkClassifier\n\nA model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nDo model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).\n\nNeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. 
See also finaliser below.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently MLJ measures are not supported values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we build a classification model using the Iris dataset. 
This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.\n\nusing MLJ\nusing Flux\nimport RDatasets\nimport Optimisers\n\nFirst, we can load the data:\n\niris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123); # a vector and a table\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier()\n\nNext, we can train the model:\n\nmach = machine(clf, X, y)\nfit!(mach)\n\nWe can train the model in an incremental fashion, altering the learning rate as we go, provided optimizer_changes_trigger_retraining is false (the default). Here, we also change the number of (total) iterations:\n\nclf.optimiser = Optimisers.Adam(clf.optimiser.eta * 2)\nclf.epochs = clf.epochs + 5\n\nfit!(mach, verbosity=2) # trains 5 more epochs\n\nWe can inspect the mean training loss using the cross_entropy function:\n\ntraining_loss = cross_entropy(predict(mach, X), y)\n\nAnd we can access the Flux chain (model) using fitted_params:\n\nchain = fitted_params(mach).chain\n\nFinally, we can see how the out-of-sample performance changes over time, using MLJ's learning_curve function:\n\nr = range(clf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(clf, X, y,\n range=r,\n resampling=Holdout(fraction_train=0.7),\n measure=cross_entropy)\nusing Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross Entropy\")\n\n\nSee also ImageClassifier, NeuralNetworkBinaryClassifier.\n\n\n\n\n\n","category":"type"},{"location":"interface/Classification/#MLJFlux.NeuralNetworkBinaryClassifier","page":"Classification","title":"MLJFlux.NeuralNetworkBinaryClassifier","text":"NeuralNetworkBinaryClassifier\n\nA model type for constructing a neural network binary classifier, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nNeuralNetworkBinaryClassifier = @load NeuralNetworkBinaryClassifier pkg=MLJFlux\n\nDo model = NeuralNetworkBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkBinaryClassifier(builder=...).\n\nNeuralNetworkBinaryClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a binary (Multiclass{2} or OrderedFactor{2}) target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass{2} or OrderedFactor{2}; check the scitype with scitype(y)\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. 
See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Optimisers.jl documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.binarycrossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.binarycrossentropy: Standard binary classification loss, also known as the log loss.\nFlux.logitbinarycrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with σ and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default sigmoid finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.binary_focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently MLJ measures are not supported values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\nfinaliser=Flux.σ: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.σ.\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we build a classification model using the Iris dataset. 
This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.\n\nusing MLJ, Flux\nimport Optimisers\nimport RDatasets\n\nFirst, we can load the data:\n\nmtcars = RDatasets.dataset(\"datasets\", \"mtcars\");\ny, X = unpack(mtcars, ==(:VS), in([:MPG, :Cyl, :Disp, :HP, :WT, :QSec]));\n\nNote that y is a vector and X a table.\n\ny = categorical(y) # classifier takes catogorical input\nX_f32 = Float32.(X) # To match floating point type of the neural network layers\nNeuralNetworkBinaryClassifier = @load NeuralNetworkBinaryClassifier pkg=MLJFlux\nbclf = NeuralNetworkBinaryClassifier()\n\nNext, we can train the model:\n\nmach = machine(bclf, X_f32, y)\nfit!(mach)\n\nWe can train the model in an incremental fashion, altering the learning rate as we go, provided optimizer_changes_trigger_retraining is false (the default). Here, we also change the number of (total) iterations:\n\njulia> bclf.optimiser\nAdam(0.001, (0.9, 0.999), 1.0e-8)\n\nbclf.optimiser = Optimisers.Adam(eta = bclf.optimiser.eta * 2)\nbclf.epochs = bclf.epochs + 5\n\nfit!(mach, verbosity=2) # trains 5 more epochs\n\nWe can inspect the mean training loss using the cross_entropy function:\n\ntraining_loss = cross_entropy(predict(mach, X_f32), y)\n\nAnd we can access the Flux chain (model) using fitted_params:\n\nchain = fitted_params(mach).chain\n\nFinally, we can see how the out-of-sample performance changes over time, using MLJ's learning_curve function:\n\nr = range(bclf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(\n bclf,\n X_f32,\n y,\n range=r,\n resampling=Holdout(fraction_train=0.7),\n measure=cross_entropy,\n)\nusing Plots\nplot(\n curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross Entropy\",\n)\n\n\nSee also ImageClassifier.\n\n\n\n\n\n","category":"type"},{"location":"common_workflows/architecture_search/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/architecture_search/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/architecture_search/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/architecture_search/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/live_training/notebook/#Live-Training-with-MLJFlux","page":"Live Training","title":"Live Training with MLJFlux","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"This tutorial is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live 
Training","title":"Live Training","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/live_training/notebook/#Basic-Imports","page":"Live Training","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"using MLJ\nusing Flux\nimport RDatasets\nimport Optimisers","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"using Plots","category":"page"},{"location":"common_workflows/live_training/notebook/#Loading-and-Splitting-the-Data","page":"Live Training","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng=123);\nX = Float32.(X); # To be compatible with type of network network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/live_training/notebook/#Instantiating-the-model","page":"Live Training","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"Now let's construct our model. This follows a similar setup to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42,\n)","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"Now let's wrap this in an iterated model. 
We will use a callback that makes a plot for validation losses each iteration.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"stop_conditions = [\n Step(1), # Repeatedly train for one iteration\n NumberLimit(100), # Don't train for more than 100 iterations\n]\n\nvalidation_losses = []\ngr(reuse=true) # use the same window for plots\nfunction plot_loss(loss)\n push!(validation_losses, loss)\n display(plot(validation_losses, label=\"validation loss\", xlim=(1, 100)))\n sleep(.01) # to catch up with the plots while they are being generated\nend\n\ncallbacks = [ WithLossDo(plot_loss),]\n\niterated_model = IteratedModel(\n model=clf,\n resampling=Holdout(),\n measures=log_loss,\n iteration_parameter=:(epochs),\n controls=vcat(stop_conditions, callbacks),\n retrain=true,\n)","category":"page"},{"location":"common_workflows/live_training/notebook/#Live-Training","page":"Live Training","title":"Live Training","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"Simply fitting the model is all we need","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"mach = machine(iterated_model, X, y)\nfit!(mach, force=true)","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"This page was generated using Literate.jl.","category":"page"},{"location":"#MLJFlux.jl","page":"Introduction","title":"MLJFlux.jl","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"A Julia package integrating deep learning Flux models with MLJ.","category":"page"},{"location":"#Objectives","page":"Introduction","title":"Objectives","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"Provide a user-friendly and high-level interface to fundamental Flux deep learning models while still being extensible by supporting custom models written with Flux\nMake building deep learning models more convenient to users already familiar with the MLJ workflow\nMake it easier to apply machine learning techniques provided by MLJ, including: out-of-sample performance evaluation, hyper-parameter optimization, iteration control, and more, to deep learning models","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"note: MLJFlux Scope\nMLJFlux support is focused on fundamental deep learning models for common supervised learning tasks. Sophisticated architectures and approaches, such as online learning, reinforcement learning, and adversarial networks, are currently outside its scope. Also, MLJFlux is limited to tasks where all (batches of) training data fits into memory.","category":"page"},{"location":"#Installation","page":"Introduction","title":"Installation","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"import Pkg\nPkg.activate(\"my_environment\", shared=true)\nPkg.add([\"MLJ\", \"MLJFlux\", \"Optimisers\", \"Flux\"])","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"You only need Flux if you need to build a custom architecture, or experiment with different loss or activation functions. 
Since MLJFlux 0.5, you must use optimisers from Optimisers.jl, as native Flux.jl optimisers are no longer supported. ","category":"page"},{"location":"#Quick-Start","page":"Introduction","title":"Quick Start","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"For the following demo, you will need to additionally run Pkg.add(\"RDatasets\").","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"using MLJ, Flux, MLJFlux\nimport RDatasets\nimport Optimisers\n\n# 1. Load Data\niris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n\n# 2. Load and instantiate model\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=\"MLJFlux\"\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=100, \n acceleration=CUDALibs() # For GPU support\n )\n\n# 3. Wrap it in a machine \nmach = machine(clf, X, y)\n\n# 4. Evaluate the model\ncv=CV(nfolds=5)\nevaluate!(mach, resampling=cv, measure=accuracy) ","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"As you can see, we are able to use MLJ meta-functionality (e.g., cross-validation) with a Flux deep learning model. All arguments provided have defaults.","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Notice that we are also able to define the neural network in a high-level fashion by only specifying the number of neurons in each hidden layer and the activation function. Meanwhile, MLJFlux is able to infer the input and output layers as well as use a suitable default for the loss function and output activation given the classification task. Notice as well that we did not need to manually implement a training or prediction loop.","category":"page"},{"location":"#Basic-idea:-\"builders\"-for-data-dependent-architecture","page":"Introduction","title":"Basic idea: \"builders\" for data-dependent architecture","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"As in the example above, any MLJFlux model has a builder hyperparameter, an object encoding instructions for creating a neural network given the data that the model eventually sees (e.g., the number of classes in a classification problem). While each MLJFlux model has a simple default builder, users may need to define custom builders to get optimal results (see Defining Custom Builders); this will require familiarity with the Flux API for defining a neural network chain.","category":"page"},{"location":"#Flux-or-MLJFlux?","page":"Introduction","title":"Flux or MLJFlux?","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"Flux is a deep learning framework in Julia that comes with everything you need to build deep learning models (i.e., GPU support, automatic differentiation, layers, activations, losses, optimizers, etc.). MLJFlux wraps models built with Flux, providing a higher-level interface for building and training such models. 
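To give a rough sense of the difference, consider this illustrative sketch (not taken from the package documentation; model and data stand for a Flux chain and an iterable of (x, y) batches): training in plain Flux means writing an explicit loop, e.g.\n\n# illustrative only: assumes `model` and `data` are already defined\nopt_state = Flux.setup(Optimisers.Adam(0.01), model)\nfor epoch in 1:10\n Flux.train!((m, x, y) -> Flux.crossentropy(m(x), y), model, data, opt_state)\nend\n\nwhereas with MLJFlux the corresponding step is simply mach = machine(clf, X, y); fit!(mach). 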
More importantly, it empowers Flux models by extending their support to many common machine learning workflows that are possible via MLJ, such as:","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Estimating performance of your model using a holdout set or other resampling strategy (e.g., cross-validation) as measured by one or more metrics (e.g., loss functions) that may not have been used in training\nOptimizing hyper-parameters such as a regularization parameter (e.g., dropout) or the width/height/n_channels of a convolution layer\nComposing with other models, such as introducing data pre-processing steps (e.g., missing data imputation) into a pipeline. It might make sense to include non-deep learning models in this pipeline. Other kinds of model composition could include blending predictions of a deep learner with some other kind of model (as in “model stacking”). Models composed with MLJ can also be tuned as a single unit.\nControlling iteration by adding an early stopping criterion based on an out-of-sample estimate of the loss, dynamically changing the learning rate (e.g., cyclic learning rates), periodically saving snapshots of the model, and generating live plots of sample weights to judge training progress (as in TensorBoard)","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Comparing your model with non-deep learning models","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"A comparable project, FastAI/FluxTraining, also provides a high-level interface for interacting with Flux models and supports a set of features that may overlap with (but not include all of) those supported by MLJFlux.","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Many of the features mentioned above are showcased in the workflow examples that you can access from the sidebar.","category":"page"},{"location":"interface/Custom Builders/#Defining-Custom-Builders","page":"Custom Builders","title":"Defining Custom Builders","text":"","category":"section"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"Following is an example defining a new builder for creating a simple fully-connected neural network with two hidden layers, with n1 nodes in the first hidden layer, and n2 nodes in the second, for use in any of the first three models in Table 1. 
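(Once defined, the MyBuilder type below is used like any other builder, for example NeuralNetworkRegressor(builder=MyBuilder(20, 10)); this particular call is illustrative only.) 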
The definition includes one mutable struct and one method:","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"mutable struct MyBuilder <: MLJFlux.Builder\n\tn1 :: Int\n\tn2 :: Int\nend\n\nfunction MLJFlux.build(nn::MyBuilder, rng, n_in, n_out)\n\tinit = Flux.glorot_uniform(rng)\n return Chain(\n Dense(n_in, nn.n1, init=init),\n Dense(nn.n1, nn.n2, init=init),\n Dense(nn.n2, n_out, init=init),\n )\nend","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"Note here that n_in and n_out depend on the size of the data (see Table 1).","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"For a concrete image classification example, see Using MLJ to classify the MNIST image dataset.","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"More generally, defining a new builder means defining a new struct sub-typing MLJFlux.Builder and defining a new MLJFlux.build method with one of these signatures:","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"MLJFlux.build(builder::MyBuilder, rng, n_in, n_out)\nMLJFlux.build(builder::MyBuilder, rng, n_in, n_out, n_channels) # for use with `ImageClassifier`","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"This method must return a Flux.Chain instance, chain, subject to the following conditions:","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"chain(x) must make sense:\nfor any x <: Array{<:AbstractFloat, 2} of size (n_in, batch_size) where batch_size is any integer (for all models except ImageClassifier); or\nfor any x <: Array{<:Float32, 4} of size (W, H, n_channels, batch_size), where (W, H) = n_in, n_channels is 1 or 3, and batch_size is any integer (for use with ImageClassifier)\nThe object returned by chain(x) must be an AbstractFloat vector of length n_out.","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"Alternatively, use MLJFlux.@builder(neural_net) to automatically create a builder for any valid Flux chain expression neural_net, where the symbols n_in, n_out, n_channels and rng can appear literally, with the interpretations explained above. 
For example,","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"builder = MLJFlux.@builder Chain(Dense(n_in, 128), Dense(128, n_out, tanh))","category":"page"}] +[{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Hyperparameter-Tuning-with-MLJFlux","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning with MLJFlux","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"In this workflow example, we learn how to tune different hyperparameters of MLJFlux models, with emphasis on training hyperparameters.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Basic-Imports","page":"Hyperparameter Tuning","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nusing Plots # To plot tuning results\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Loading-and-Splitting-the-Data","page":"Hyperparameter Tuning","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123);\nX = Float32.(X); # To be compatible with type of network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Instantiating-the-model","page":"Hyperparameter Tuning","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Now let's construct our model. 
This follows a similar setup to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=10,\n rng=42,\n)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Hyperparameter-Tuning-Example","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning Example","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Let's tune the batch size and the learning rate. We will use grid search and 5-fold cross-validation.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"We start by defining the hyperparameter ranges","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"r1 = range(clf, :batch_size, lower=1, upper=64)\netas = [10^x for x in range(-4, stop=0, length=4)]\noptimisers = [Optimisers.Adam(eta) for eta in etas]\nr2 = range(clf, :optimiser, values=optimisers)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Then passing the ranges along with the model and other arguments to the TunedModel constructor.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"tuned_model = TunedModel(\n model=clf,\n tuning=Grid(goal=25),\n resampling=CV(nfolds=5, rng=42),\n range=[r1, r2],\n measure=cross_entropy,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Then wrapping our tuned model in a machine and fitting it.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"mach = machine(tuned_model, X, y);\nfit!(mach, verbosity=0);\nnothing #hide","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Let's check out the best performing model:","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"fitted_params(mach).best_model","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/#Learning-Curves","page":"Hyperparameter Tuning","title":"Learning Curves","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"With learning curves, it's possible to center our focus on the effects of a single hyperparameter of the model.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"First define the range and wrap it in a learning 
curve","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"r = range(clf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(\n clf,\n X,\n y,\n range=r,\n resampling=CV(nfolds=4, rng=42),\n measure=cross_entropy,\n)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"Then plot the curve","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"plot(\n curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross Entropy\",\n)","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"","category":"page"},{"location":"common_workflows/hyperparameter_tuning/notebook/","page":"Hyperparameter Tuning","title":"Hyperparameter Tuning","text":"This page was generated using Literate.jl.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/comparison/notebook/#Model-Comparison-with-MLJFlux","page":"Model Comparison","title":"Model Comparison with MLJFlux","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"In this workflow example, we see how we can compare different machine learning models with a neural network from MLJFlux.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/comparison/notebook/#Basic-Imports","page":"Model Comparison","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nusing DataFrames # To visualize hyperparameter search results\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/comparison/notebook/#Loading-and-Splitting-the-Data","page":"Model Comparison","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123);\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/#Instantiating-the-models-Now-let's-construct-our-model.-This-follows-a-similar-setup","page":"Model Comparison","title":"Instantiating the models Now let's construct our model. 
This follows a similar setup","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nclf1 = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42\n )","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Let's also load and construct three other classical machine learning models:","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"BayesianLDA = @load BayesianLDA pkg=MultivariateStats\nclf2 = BayesianLDA()\nRandomForestClassifier = @load RandomForestClassifier pkg=DecisionTree\nclf3 = RandomForestClassifier()\nXGBoostClassifier = @load XGBoostClassifier pkg=XGBoost\nclf4 = XGBoostClassifier();\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/#Wrapping-One-of-the-Models-in-a-TunedModel","page":"Model Comparison","title":"Wrapping One of the Models in a TunedModel","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Instead of just comparing the four models with their default/given hyperparameters, we will give XGBoostClassifier an unfair advantage by wrapping it in a TunedModel that considers the best learning rate η for the model.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"r1 = range(clf4, :eta, lower=0.01, upper=0.5, scale=:log10)\ntuned_model_xg = TunedModel(\n model=clf4,\n ranges=[r1],\n tuning=Grid(resolution=10),\n resampling=CV(nfolds=5, rng=42),\n measure=cross_entropy,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Of course, one can wrap each of the four in a TunedModel if one is interested in comparing the models over a large set of their hyperparameters.","category":"page"},{"location":"common_workflows/comparison/notebook/#Comparing-the-models","page":"Model Comparison","title":"Comparing the models","text":"","category":"section"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"We simply pass the four models to the models argument of the TunedModel constructor","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"tuned_model = TunedModel(\n models=[clf1, clf2, clf3, tuned_model_xg],\n tuning=Explicit(),\n resampling=CV(nfolds=5, rng=42),\n measure=cross_entropy,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Then wrapping our tuned model in a machine and fitting it.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"mach = machine(tuned_model, X, y);\nfit!(mach, verbosity=0);\nnothing 
#hide","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"Now let's see the history for more details on the performance for each of the models","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"history = report(mach).history\nhistory_df = DataFrame(\n mlp = [x[:model] for x in history],\n measurement = [x[:measurement][1] for x in history],\n)\nsort!(history_df, [order(:measurement)])","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"This is Occam's razor in practice.","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"","category":"page"},{"location":"common_workflows/comparison/notebook/","page":"Model Comparison","title":"Model Comparison","text":"This page was generated using Literate.jl.","category":"page"},{"location":"contributing/#Adding-new-models-to-MLJFlux","page":"Contributing","title":"Adding new models to MLJFlux","text":"","category":"section"},{"location":"contributing/","page":"Contributing","title":"Contributing","text":"This section assumes familiarity with the MLJ model API","category":"page"},{"location":"contributing/","page":"Contributing","title":"Contributing","text":"If one subtypes a new model type as either MLJFlux.MLJFluxProbabilistic or MLJFlux.MLJFluxDeterministic, then instead of defining new methods for MLJModelInterface.fit and MLJModelInterface.update one can make use of fallbacks by implementing the lower level methods shape, build, and fitresult. See the classifier source code for an example.","category":"page"},{"location":"contributing/","page":"Contributing","title":"Contributing","text":"One still needs to implement a new predict method.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"extended_examples/spam_detection/notebook/#SMS-Spam-Detection-with-RNNs","page":"Spam Detection with RNNs","title":"SMS Spam Detection with RNNs","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"In this demo we use a custom RNN model from Flux with MLJFlux to classify text messages as spam or ham. We will be using the SMS Collection Dataset from Kaggle.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Warning. This demo includes some non-idiomatic use of MLJ to allow use of the Flux.jl Embedding layer. 
It is not recommended for MLJ beginners.","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Basic-Imports","page":"Spam Detection with RNNs","title":"Basic Imports","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"using MLJ\nusing MLJFlux\nusing Flux\nimport Optimisers # Flux.jl native optimisers no longer supported\nusing CSV # Read data\nusing DataFrames # Read data\nusing WordTokenizers # For tokenization\nusing Languages # For stop words","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Reading-Data","page":"Spam Detection with RNNs","title":"Reading Data","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"We assume the SMS Collection Dataset has been downloaded and is in a file called \"sms.csv\" in the same directory as this script.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"df = CSV.read(joinpath(@__DIR__, \"sms.csv\"), DataFrame);\nnothing #hide","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Display the first 5 rows with DataFrames","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"first(df, 5)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Text-Preprocessing","page":"Spam Detection with RNNs","title":"Text Preprocessing","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Let's define a function that, given an SMS message, will:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Tokenize it (i.e., convert it into a vector of words)\nRemove stop words (i.e., words that are not useful for the analysis, like \"the\", \"a\", etc.)\nReturn the filtered vector of words","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"const STOP_WORDS = Languages.stopwords(Languages.English())\n\nfunction preprocess_text(text)\n # (1) Splitting texts into words (so later it can be a sequence of vectors)\n tokens = WordTokenizers.tokenize(text)\n\n # (2) Stop word removal\n filtered_tokens = filter(token -> !(token in STOP_WORDS), tokens)\n\n return filtered_tokens\nend","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Define the vocabulary to be the set of all words in our training set. We also need a function that would map each word in a given sequence of words into its index in the dictionary (which is equivalent to representing the words as one-hot vectors).","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Now, after we do this, the sequences will all be numerical vectors, but they will be of unequal length. 
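For instance, one encoded message might be a vector of 3 indices and another a vector of 7 (these lengths are purely illustrative), so they cannot be stacked directly into a single matrix for batching. 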
Thus, to facilitate batching of data for the deep learning model, we need to decide on a specific maximum length for all sequences and:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"If a sequence is longer than the maximum length, we need to truncate it\nIf a sequence is shorter than the maximum length, we need to pad it with a new token","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Lastly, we must also handle the case where an incoming text sequence involves words never seen in training, by representing all such out-of-vocabulary words with a new token.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"We will define a function that would do this for us.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"function encode_and_equalize(text_seq, vocab_dict, max_length, pad_val, oov_val)\n # (1) encode using the vocabulary\n text_seq_inds = [get(vocab_dict, word, oov_val) for word in text_seq]\n\n # (2) truncate sequence if > max_length\n length(text_seq_inds) > max_length && (text_seq_inds = text_seq_inds[1:max_length])\n\n # (3) pad with pad_val\n text_seq_inds = vcat(text_seq_inds, fill(pad_val, max_length - length(text_seq_inds)))\n\n return text_seq_inds\nend","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Preparing-Data","page":"Spam Detection with RNNs","title":"Preparing Data","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Splitting the data","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_data, y_data = unpack(df, ==(:Message), ==(:Category))\ny_data = coerce(y_data, Multiclass);\n\n(x_train, x_val), (y_train, y_val) = partition(\n (x_data, y_data),\n 0.8,\n multi = true,\n shuffle = true,\n rng = 42,\n);\nnothing #hide","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Now let's process the training and validation sets:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_train_processed = [preprocess_text(text) for text in x_train]\nx_val_processed = [preprocess_text(text) for text in x_val];\nnothing #hide","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"sanity check","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"println(x_train_processed[1], \" is \", y_data[1])","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Define the vocabulary from the training data","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with 
RNNs","text":"vocab = unique(vcat(x_train_processed...))\nvocab_dict = Dict(word => idx for (idx, word) in enumerate(vocab))\nvocab_size = length(vocab)\npad_val, oov_val = vocab_size + 1, vocab_size + 2\nmax_length = 12 # can choose this more smartly if you wish","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Encode and equalize training and validation data:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_train_processed_equalized = [\n encode_and_equalize(seq, vocab_dict, max_length, pad_val, oov_val) for\n seq in x_train_processed\n ]\nx_val_processed_equalized = [\n encode_and_equalize(seq, vocab_dict, max_length, pad_val, oov_val) for\n seq in x_val_processed\n ]\nx_train_processed_equalized[1:5] # all sequences are encoded and of the same length","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Convert both structures into matrix form:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"matrixify(v) = reduce(hcat, v)'\nx_train_processed_equalized_fixed = matrixify(x_train_processed_equalized)\nx_val_processed_equalized_fixed = matrixify(x_val_processed_equalized)\nsize(x_train_processed_equalized_fixed)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Instantiate-Model","page":"Spam Detection with RNNs","title":"Instantiate Model","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"For the model, we will use a RNN from Flux. 
We will average the hidden states corresponding to any sequence then pass that to a dense layer for classification.","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"For this, we need to define a custom Flux layer to perform the averaging operation:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"struct Mean end\nFlux.@layer Mean\n(m::Mean)(x) = mean(x, dims = 2)[:, 1, :] # [batch_size, seq_len, hidden_dim] => [batch_size, 1, hidden_dim]=> [batch_size, hidden_dim]","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"For compatibility, we will also define a layer that simply casts the input to integers as the embedding layer in Flux expects integers but the MLJFlux model expects floats:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"struct Intify end\nFlux.@layer Intify\n(m::Intify)(x) = Int.(x)","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Here we define our network:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"builder = MLJFlux.@builder begin\n Chain(\n Intify(), # Cast input to integer\n Embedding(vocab_size + 2 => 300), # Embedding layer\n RNN(300, 50, tanh), # RNN layer\n Mean(), # Mean pooling layer\n Dense(50, 2), # Classification dense layer\n )\nend","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Notice that we used an embedding layer with input dimensionality vocab_size + 2 to take into account the padding and out-of-vocabulary tokens. 
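To see what the embedding layer does in isolation, here is a toy illustration (not part of the original demo): e = Flux.Embedding(10 => 4); size(e([1, 2, 3])) returns (4, 3), i.e. one learned 4-dimensional column per input index. 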
Recall that the indices in our input correspond to one-hot-vectors and the embedding layer's purpose is to learn to map them into meaningful dense vectors (of dimensionality 300 here).","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Load and instantiate model","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg = MLJFlux\nclf = NeuralNetworkClassifier(\n builder = builder,\n optimiser = Optimisers.Adam(0.1),\n batch_size = 128,\n epochs = 10,\n)","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Wrap it in a machine","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"x_train_processed_equalized_fixed = coerce(x_train_processed_equalized_fixed, Continuous)\nmach = machine(clf, x_train_processed_equalized_fixed, y_train)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Train-the-Model","page":"Spam Detection with RNNs","title":"Train the Model","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"fit!(mach)","category":"page"},{"location":"extended_examples/spam_detection/notebook/#Evaluate-the-Model","page":"Spam Detection with RNNs","title":"Evaluate the Model","text":"","category":"section"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"ŷ = predict_mode(mach, x_val_processed_equalized_fixed)\nbalanced_accuracy(ŷ, y_val)","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"Acceptable performance. 
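For a more detailed breakdown than this single score, one could also inspect the confusion matrix, e.g. with confusion_matrix(ŷ, y_val) (not shown in the original demo). 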
Let's see some live examples:","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"using Random: Random;\nRandom.seed!(99);\n\nz = rand(x_val)\nz_processed = preprocess_text(z)\nz_encoded_equalized =\n encode_and_equalize(z_processed, vocab_dict, max_length, pad_val, oov_val)\nz_encoded_equalized_fixed = matrixify([z_encoded_equalized])\nz_encoded_equalized_fixed = coerce(z_encoded_equalized_fixed, Continuous)\nz_pred = predict_mode(mach, z_encoded_equalized_fixed)\n\nprint(\"SMS: `$(z)` and the prediction is `$(z_pred)`\")","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"","category":"page"},{"location":"extended_examples/spam_detection/notebook/","page":"Spam Detection with RNNs","title":"Spam Detection with RNNs","text":"This page was generated using Literate.jl.","category":"page"},{"location":"common_workflows/composition/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/composition/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Jupyter notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/composition/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/composition/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Incremental-Training-with-MLJFlux","page":"Incremental Training","title":"Incremental Training with MLJFlux","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"In this workflow example, we explore how to incrementally train MLJFlux models.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Basic-Imports","page":"Incremental Training","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Loading-and-Splitting-the-Data","page":"Incremental 
Training","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123);\nX = Float32.(X) # To be compatible with type of network parameters\n(X_train, X_test), (y_train, y_test) = partition(\n (X, y), 0.8,\n multi = true,\n shuffle = true,\n rng=42,\n);\nnothing #hide","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Instantiating-the-model","page":"Incremental Training","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Now let's construct our model. This follows a similar setup to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=10,\n rng=42,\n)","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Initial-round-of-training","page":"Incremental Training","title":"Initial round of training","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Now let's train the model. Calling fit! will automatically train it for 10 epochs as specified above.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"mach = machine(clf, X_train, y_train)\nfit!(mach)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Let's evaluate the training loss and validation accuracy","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"training_loss = cross_entropy(predict(mach, X_train), y_train)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"val_acc = accuracy(predict_mode(mach, X_test), y_test)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Poor performance, it seems.","category":"page"},{"location":"common_workflows/incremental_training/notebook/#Incremental-Training","page":"Incremental Training","title":"Incremental Training","text":"","category":"section"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Now let's train it for another 30 epochs at half the original learning rate. All we need to do is change these hyperparameters and call fit! again. 
It won't reset the model parameters before training.","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"clf.optimiser = Optimisers.Adam(clf.optimiser.eta/2)\nclf.epochs = clf.epochs + 30\nfit!(mach, verbosity=2);\nnothing #hide","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"Let's evaluate the training loss and validation accuracy","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"training_loss = cross_entropy(predict(mach, X_train), y_train)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"training_acc = accuracy(predict_mode(mach, X_test), y_test)","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"That's much better. If we would rather reset the model parameters before fitting, we can do fit!(mach, force=true).","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"","category":"page"},{"location":"common_workflows/incremental_training/notebook/","page":"Incremental Training","title":"Incremental Training","text":"This page was generated using Literate.jl.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"extended_examples/MNIST/notebook/#Using-MLJ-to-classifiy-the-MNIST-image-dataset","page":"MNIST Images","title":"Using MLJ to classify the MNIST image dataset","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"This tutorial is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"using MLJ\nusing Flux\nimport MLJFlux\nimport MLUtils\nimport MLJIteration # for `skip`","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"If running on a GPU, you will also need to import CUDA and import cuDNN.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"using Plots\ngr(size=(600, 300*(sqrt(5)-1)));\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/#Basic-training","page":"MNIST Images","title":"Basic training","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Downloading the MNIST image dataset:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"import MLDatasets: MNIST\n\nENV[\"DATADEPS_ALWAYS_ACCEPT\"] = true\nimages, labels = MNIST(split=:train)[:];\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"In MLJ, 
integers cannot be used for encoding categorical data, so we must force the labels to have the Multiclass scientific type. For more on this, see Working with Categorical Data.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"labels = coerce(labels, Multiclass);\nimages = coerce(images, GrayImage);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Checking scientific types:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"@assert scitype(images) <: AbstractVector{<:Image}\n@assert scitype(labels) <: AbstractVector{<:Finite}","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Looks good.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"For general instructions on coercing image data, see Type coercion for image data.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"images[1]","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"We start by defining a suitable Builder object. This is a recipe for building the neural network. Our builder will work for images of any (constant) size, whether they be color or black and white (i.e., single or multi-channel). The architecture always consists of six alternating convolution and max-pool layers, and a final dense layer; the filter size and the number of channels after each convolution layer are customisable.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"import MLJFlux\nstruct MyConvBuilder\n filter_size::Int\n channels1::Int\n channels2::Int\n channels3::Int\nend\n\nfunction MLJFlux.build(b::MyConvBuilder, rng, n_in, n_out, n_channels)\n k, c1, c2, c3 = b.filter_size, b.channels1, b.channels2, b.channels3\n mod(k, 2) == 1 || error(\"`filter_size` must be odd. \")\n p = div(k - 1, 2) # padding to preserve image size\n init = Flux.glorot_uniform(rng)\n front = Chain(\n Conv((k, k), n_channels => c1, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c1 => c2, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c2 => c3, pad=(p, p), relu, init=init),\n MaxPool((2 ,2)),\n MLUtils.flatten)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n return Chain(front, Dense(d, n_out, init=init))\nend","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Notes.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"There is no final softmax here, as this is applied by default in all MLJFlux classifiers. 
Customisation of this behaviour is controlled using the finaliser hyperparameter of the classifier.\nInstead of calculating the padding p, Flux can infer the required padding in each dimension, which you enable by replacing pad = (p, p) with pad = SamePad().","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"We now define the MLJ model.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"ImageClassifier = @load ImageClassifier\nclf = ImageClassifier(\n builder=MyConvBuilder(3, 16, 32, 32),\n batch_size=50,\n epochs=10,\n rng=123,\n)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"You can add Flux options optimiser=... and loss=... in the above constructor call. At present, loss must be a Flux-compatible loss, not an MLJ measure. To run on a GPU, add to the constructor acceleration=CUDALibs() and omit rng.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"For illustration purposes, we won't use all the data here:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"train = 1:500\ntest = 501:1000","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Binding the model with data in an MLJ machine:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"mach = machine(clf, images, labels);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Training for 10 epochs on the first 500 images:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"fit!(mach, rows=train, verbosity=2);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Inspecting:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"report(mach)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"chain = fitted_params(mach)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Flux.params(chain)[2]","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Adding 20 more epochs:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"clf.epochs = clf.epochs + 20\nfit!(mach, rows=train);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Computing an out-of-sample estimate of the loss:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"predicted_labels = predict(mach, rows=test);\ncross_entropy(predicted_labels, labels[test])","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Or to fit and predict, in one 
line:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"evaluate!(mach,\n resampling=Holdout(fraction_train=0.5),\n measure=cross_entropy,\n rows=1:1000,\n verbosity=0)","category":"page"},{"location":"extended_examples/MNIST/notebook/#Wrapping-the-MLJFlux-model-with-iteration-controls","page":"MNIST Images","title":"Wrapping the MLJFlux model with iteration controls","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Any iterative MLJFlux model can be wrapped in iteration controls, as we demonstrate next. For more on MLJ's IteratedModel wrapper, see the MLJ documentation.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"The \"self-iterating\" classifier, called iterated_clf below, is for iterating the image classifier defined above until one of the following stopping criterion apply:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Patience(3): 3 consecutive increases in the loss\nInvalidValue(): an out-of-sample loss, or a training loss, is NaN, Inf, or -Inf\nTimeLimit(t=5/60): training time has exceeded 5 minutes","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"These checks (and other controls) will be applied every two epochs (because of the Step(2) control). Additionally, training a machine bound to iterated_clf will:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"save a snapshot of the machine every three control cycles (every six epochs)\nrecord traces of the out-of-sample loss and training losses for plotting\nrecord mean value traces of each Flux parameter for plotting","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"For a complete list of controls, see this table.","category":"page"},{"location":"extended_examples/MNIST/notebook/#Wrapping-the-classifier","page":"MNIST Images","title":"Wrapping the classifier","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Some helpers","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"To extract Flux params from an MLJFlux machine","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"parameters(mach) = vec.(Flux.params(fitted_params(mach)));\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"To store the traces:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"losses = []\ntraining_losses = []\nparameter_means = Float32[];\nepochs = []","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"To update the traces:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"update_loss(loss) = push!(losses, loss)\nupdate_training_loss(losses) = push!(training_losses, losses[end])\nupdate_means(mach) = append!(parameter_means, 
mean.(parameters(mach)));\nupdate_epochs(epoch) = push!(epochs, epoch)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"The controls to apply:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"save_control =\n MLJIteration.skip(Save(joinpath(tempdir(), \"mnist.jls\")), predicate=3)\n\ncontrols=[\n Step(2),\n Patience(3),\n InvalidValue(),\n TimeLimit(5/60),\n save_control,\n WithLossDo(),\n WithLossDo(update_loss),\n WithTrainingLossesDo(update_training_loss),\n Callback(update_means),\n WithIterationsDo(update_epochs),\n];\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"The \"self-iterating\" classifier:","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"iterated_clf = IteratedModel(\n clf,\n controls=controls,\n resampling=Holdout(fraction_train=0.7),\n measure=log_loss,\n)","category":"page"},{"location":"extended_examples/MNIST/notebook/#Binding-the-wrapped-model-to-data:","page":"MNIST Images","title":"Binding the wrapped model to data:","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"mach = machine(iterated_clf, images, labels);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/#Training","page":"MNIST Images","title":"Training","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"fit!(mach, rows=train);\nnothing #hide","category":"page"},{"location":"extended_examples/MNIST/notebook/#Comparison-of-the-training-and-out-of-sample-losses:","page":"MNIST Images","title":"Comparison of the training and out-of-sample losses:","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"plot(\n epochs,\n losses,\n xlab = \"epoch\",\n ylab = \"cross entropy\",\n label=\"out-of-sample\",\n)\nplot!(epochs, training_losses, label=\"training\")\n\nsavefig(joinpath(tempdir(), \"loss.png\"))","category":"page"},{"location":"extended_examples/MNIST/notebook/#Evolution-of-weights","page":"MNIST Images","title":"Evolution of weights","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"n_epochs = length(losses)\nn_parameters = div(length(parameter_means), n_epochs)\nparameter_means2 = reshape(copy(parameter_means), n_parameters, n_epochs)'\nplot(\n epochs,\n parameter_means2,\n title=\"Flux parameter mean weights\",\n xlab = \"epoch\",\n)","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Note. 
The higher the number in the plot legend, the deeper the layer we are **weight-averaging.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"savefig(joinpath(tempdir(), \"weights.png\"))","category":"page"},{"location":"extended_examples/MNIST/notebook/#Retrieving-a-snapshot-for-a-prediction:","page":"MNIST Images","title":"Retrieving a snapshot for a prediction:","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"mach2 = machine(joinpath(tempdir(), \"mnist3.jls\"))\npredict_mode(mach2, images[501:503])","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"3-element CategoricalArrays.CategoricalArray{Int64,1,UInt32}:\n 7\n 9\n 5","category":"page"},{"location":"extended_examples/MNIST/notebook/#Restarting-training","page":"MNIST Images","title":"Restarting training","text":"","category":"section"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"Mutating iterated_clf.controls or clf.epochs (which is otherwise ignored) will allow you to restart training from where it left off.","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"iterated_clf.controls[2] = Patience(4)\nfit!(mach, rows=train)\n\nplot(\n epochs,\n losses,\n xlab = \"epoch\",\n ylab = \"cross entropy\",\n label=\"out-of-sample\",\n)\nplot!(epochs, training_losses, label=\"training\")","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"","category":"page"},{"location":"extended_examples/MNIST/notebook/","page":"MNIST Images","title":"MNIST Images","text":"This page was generated using Literate.jl.","category":"page"},{"location":"interface/Multitarget Regression/","page":"Multi-Target Regression","title":"Multi-Target Regression","text":"MLJFlux.MultitargetNeuralNetworkRegressor","category":"page"},{"location":"interface/Multitarget Regression/#MLJFlux.MultitargetNeuralNetworkRegressor","page":"Multi-Target Regression","title":"MLJFlux.MultitargetNeuralNetworkRegressor","text":"MultitargetNeuralNetworkRegressor\n\nA model type for constructing a multitarget neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nMultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor pkg=MLJFlux\n\nDo model = MultitargetNeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetNeuralNetworkRegressor(builder=...).\n\nMultitargetNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a multi-valued Continuous target, represented as a table, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). 
If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any table or matrix of output targets whose element scitype is Continuous; check column scitypes with schema(y). If y is a Matrix, it is assumed to have columns corresponding to variables and rows corresponding to observations.\n\nHyper-parameters\n\nbuilder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: Linear, Short, and MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete the training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increassing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew having the same scitype as X above. Predictions are deterministic.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. 
The first element is the pre-training loss.\n\nExamples\n\nIn this example we apply a multi-target regression model to synthetic data:\n\nusing MLJ\nimport MLJFlux\nusing Flux\nimport Optimisers\n\nFirst, we generate some synthetic data (needs MLJBase 0.20.16 or higher):\n\nX, y = make_regression(100, 9; n_targets = 2) # both tables\nschema(y)\nschema(X)\n\nSplitting off a test set:\n\n(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);\n\nNext, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number input features and n_out the number of target variables (both known at fit! time), while rng is a proxy for a RNG (which will be passed from the rng field of model defined below).\n\nbuilder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend\n\nInstantiating the regression model:\n\nMultitargetNeuralNetworkRegressor = @load MultitargetNeuralNetworkRegressor\nmodel = MultitargetNeuralNetworkRegressor(builder=builder, rng=123, epochs=20)\n\nWe will arrange for standardization of the the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:\n\npipe = Standardizer |> TransformedTargetModel(model, transformer=Standardizer)\n\nIf we fit with a high verbosity (>1), we will see the losses during training. We can also see the losses in the output of report(mach)\n\nmach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n# first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses\n\nFor experimenting with learning rate, see the NeuralNetworkRegressor example.\n\npipe.transformed_target_model_deterministic.model.optimiser = Optimisers.Adam(0.0001)\n\nWith the learning rate fixed, we can now compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:\n\n# custom MLJ loss:\nmulti_loss(yhat, y) = l2(MLJ.matrix(yhat), MLJ.matrix(y))\n\n# CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=multi_loss)\n\n# loss for `(Xtest, test)`:\nfit!(mach) # trains on all data `(X, y)`\nyhat = predict(mach, Xtest)\nmulti_loss(yhat, ytest)\n\nSee also NeuralNetworkRegressor\n\n\n\n\n\n","category":"type"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Early-Stopping-with-MLJ","page":"Early Stopping","title":"Early Stopping with MLJ","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"In this workflow example, we learn how MLJFlux enables us to easily use early stopping when training MLJFlux models.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Basic-Imports","page":"Early Stopping","title":"Basic 
Imports","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nusing Plots # To visualize training\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Loading-and-Splitting-the-Data","page":"Early Stopping","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123);\nX = Float32.(X); # To be compatible with type of network network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Instantiating-the-model-Now-let's-construct-our-model.-This-follows-a-similar-setup","page":"Early Stopping","title":"Instantiating the model Now let's construct our model. This follows a similar setup","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42,\n)","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Wrapping-it-in-an-IteratedModel","page":"Early Stopping","title":"Wrapping it in an IteratedModel","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"Let's start by defining the condition that can cause the model to early stop.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"stop_conditions = [\n Step(1), # Repeatedly train for one iteration\n NumberLimit(100), # Don't train for more than 100 iterations\n Patience(5), # Stop after 5 iterations of disimprovement in validation loss\n NumberSinceBest(9), # Or if the best loss occurred 9 iterations ago\n TimeLimit(30/60), # Or if 30 minutes passed\n]","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"We can also define callbacks. 
Here we want to store the validation loss for each iteration","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"validation_losses = []\ncallbacks = [\n WithLossDo(loss->push!(validation_losses, loss)),\n]","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"Construct the iterated model and pass to it the stop_conditions and the callbacks:","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"iterated_model = IteratedModel(\n model=clf,\n resampling=Holdout(fraction_train=0.7); # loss and stopping are based on out-of-sample\n measures=log_loss,\n iteration_parameter=:(epochs),\n controls=vcat(stop_conditions, callbacks),\n retrain=false # no need to retrain on all data at the end\n);\nnothing #hide","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"You can see more advanced stopping conditions as well as how to involve callbacks in the documentation","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Training-with-Early-Stopping","page":"Early Stopping","title":"Training with Early Stopping","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"At this point, all we need is to fit the model and iteration controls will be automatically handled","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"mach = machine(iterated_model, X, y)\nfit!(mach)\n# We can get the training losses like so\ntraining_losses = report(mach)[:model_report].training_losses;\nnothing #hide","category":"page"},{"location":"common_workflows/early_stopping/notebook/#Results","page":"Early Stopping","title":"Results","text":"","category":"section"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"We can see that the model converged after 100 iterations.","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"plot(training_losses, label=\"Training Loss\", linewidth=2)\nplot!(validation_losses, label=\"Validation Loss\", linewidth=2, size=(800,400))","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"","category":"page"},{"location":"common_workflows/early_stopping/notebook/","page":"Early Stopping","title":"Early Stopping","text":"This page was generated using Literate.jl.","category":"page"},{"location":"common_workflows/early_stopping/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/early_stopping/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 
4th","category":"page"},{"location":"common_workflows/early_stopping/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/early_stopping/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"extended_examples/spam_detection/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"extended_examples/spam_detection/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"extended_examples/spam_detection/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"extended_examples/spam_detection/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/live_training/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/live_training/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/live_training/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/live_training/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/hyperparameter_tuning/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/hyperparameter_tuning/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/hyperparameter_tuning/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"extended_examples/MNIST/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"extended_examples/MNIST/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter 
notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"extended_examples/MNIST/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"extended_examples/MNIST/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/composition/notebook/#Model-Composition-with-MLJFlux","page":"Model Composition","title":"Model Composition with MLJFlux","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"In this workflow example, we see how MLJFlux enables composing MLJ models with MLJFlux models. We will assume a class imbalance setting and wrap an oversampler with a deep learning model from MLJFlux.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/composition/notebook/#Basic-Imports","page":"Model Composition","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nimport RDatasets # Dataset source\nimport Random # To create imbalance\nimport Imbalance # To solve the imbalance\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/composition/notebook/#Loading-and-Splitting-the-Data","page":"Model Composition","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123);\nX = Float32.(X); # To be compatible with type of network network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"To simulate an imbalanced dataset, we will take a random sample:","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Random.seed!(803429)\nsubset_indices = rand(1:size(X, 1), 100)\nX, y = X[subset_indices, :], y[subset_indices]\nImbalance.checkbalance(y)","category":"page"},{"location":"common_workflows/composition/notebook/#Instantiating-the-model","page":"Model Composition","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Let's load BorderlineSMOTE1 to oversample the data and Standardizer to standardize 
it.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance verbosity=0\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"We didn't need to load Standardizer because it is a local model for MLJ (see localmodels())","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"clf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42,\n)","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"First we wrap the oversampler with the neural network via the BalancedModel construct. This comes from MLJBalancing And allows combining resampling methods with MLJ models in a sequential pipeline.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"oversampler = BorderlineSMOTE1(k=5, ratios=1.0, rng=42)\nbalanced_model = BalancedModel(model=clf, balancer1=oversampler)\nstandarizer = Standardizer()","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"Now let's compose the balanced model with a standardizer.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"pipeline = standarizer |> balanced_model","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"By this, any training data will be standardized then oversampled then passed to the model. 
Meanwhile, for inference, the standardizer will automatically use the training set's mean and std and the oversampler will be transparent.","category":"page"},{"location":"common_workflows/composition/notebook/#Training-the-Composed-Model","page":"Model Composition","title":"Training the Composed Model","text":"","category":"section"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"It's indistinguishable from training a single model.","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"mach = machine(pipeline, X, y)\nfit!(mach)\ncv=CV(nfolds=5)\nevaluate!(mach, resampling=cv, measure=accuracy)","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"","category":"page"},{"location":"common_workflows/composition/notebook/","page":"Model Composition","title":"Model Composition","text":"This page was generated using Literate.jl.","category":"page"},{"location":"interface/Regression/","page":"Regression","title":"Regression","text":"MLJFlux.NeuralNetworkRegressor","category":"page"},{"location":"interface/Regression/#MLJFlux.NeuralNetworkRegressor","page":"Regression","title":"MLJFlux.NeuralNetworkRegressor","text":"NeuralNetworkRegressor\n\nA model type for constructing a neural network regressor, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nNeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux\n\nDo model = NeuralNetworkRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkRegressor(builder=...).\n\nNeuralNetworkRegressor is for training a data-dependent Flux.jl neural network to predict a Continuous target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y)\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder=MLJFlux.Linear(σ=Flux.relu): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux documentation for more on builders, and the example below for using the @builder convenience macro.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.mse: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. 
For a regression task, natural loss functions are:\nFlux.mse\nFlux.mae\nFlux.msle\nFlux.huber_loss\nCurrently MLJ measures are not supported as loss functions here.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete the training dataset.\nbatch_size::int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network.\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalized if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we build a regression model for the Boston house price dataset.\n\nusing MLJ\nimport MLJFlux\nusing Flux\nimport Optimisers\n\nFirst, we load in the data: The :MEDV column becomes the target vector y, and all remaining columns go into a table X, with the exception of :CHAS:\n\ndata = OpenML.load(531); # Loads from https://www.openml.org/d/531\ny, X = unpack(data, ==(:MEDV), !=(:CHAS); rng=123);\n\nscitype(y)\nschema(X)\n\nSince MLJFlux models do not handle ordered factors, we'll treat :RAD as Continuous:\n\nX = coerce(X, :RAD=>Continuous)\n\nSplitting off a test set:\n\n(X, Xtest), (y, ytest) = partition((X, y), 0.7, multi=true);\n\nNext, we can define a builder, making use of a convenience macro to do so. In the following @builder call, n_in is a proxy for the number input features (which will be known at fit! time) and rng is a proxy for a RNG (which will be passed from the rng field of model defined below). We also have the parameter n_out which is the number of output features. 
As we are doing single target regression, the value passed will always be 1, but the builder we define will also work for MultitargetNeuralNetworkRegressor.\n\nbuilder = MLJFlux.@builder begin\n init=Flux.glorot_uniform(rng)\n Chain(\n Dense(n_in, 64, relu, init=init),\n Dense(64, 32, relu, init=init),\n Dense(32, n_out, init=init),\n )\nend\n\nInstantiating a model:\n\nNeuralNetworkRegressor = @load NeuralNetworkRegressor pkg=MLJFlux\nmodel = NeuralNetworkRegressor(\n builder=builder,\n rng=123,\n epochs=20\n)\n\nWe arrange for standardization of the target by wrapping our model in TransformedTargetModel, and standardization of the features by inserting the wrapped model in a pipeline:\n\npipe = Standardizer |> TransformedTargetModel(model, transformer=Standardizer)\n\nIf we fit with a high verbosity (>1), we will see the losses during training. We can also see the losses in the output of report(mach).\n\nmach = machine(pipe, X, y)\nfit!(mach, verbosity=2)\n\n# first element initial loss, 2:end per epoch training losses\nreport(mach).transformed_target_model_deterministic.model.training_losses\n\nExperimenting with learning rate\n\nWe can visually compare how the learning rate affects the predictions:\n\nusing Plots\n\nrates = [5e-5, 1e-4, 0.005, 0.001, 0.05]\nplt=plot()\n\nforeach(rates) do η\n pipe.transformed_target_model_deterministic.model.optimiser = Optimisers.Adam(η)\n fit!(mach, force=true, verbosity=0)\n losses =\n report(mach).transformed_target_model_deterministic.model.training_losses[3:end]\n plot!(1:length(losses), losses, label=η)\nend\n\nplt\n\npipe.transformed_target_model_deterministic.model.optimiser = Optimisers.Adam(0.0001)\n\nWith the learning rate fixed, we compute a CV estimate of the performance (using all data bound to mach) and compare this with performance on the test set:\n\n# CV estimate, based on `(X, y)`:\nevaluate!(mach, resampling=CV(nfolds=5), measure=l2)\n\n# loss for `(Xtest, ytest)`:\nfit!(mach) # train on `(X, y)`\nyhat = predict(mach, Xtest)\nl2(yhat, ytest)\n\nThese losses, for the pipeline model, refer to the target on the original, unstandardized, scale.\n\nFor implementing stopping criteria and other iteration controls, refer to examples linked from the MLJFlux documentation.\n\nSee also MultitargetNeuralNetworkRegressor\n\n\n\n\n\n","category":"type"},{"location":"common_workflows/comparison/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/comparison/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Jupyter notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/comparison/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/comparison/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.Linear","category":"page"},{"location":"interface/Builders/#MLJFlux.Linear","page":"Builders","title":"MLJFlux.Linear","text":"Linear(; σ=Flux.relu)\n\nMLJFlux builder that constructs a fully 
connected two layer network with activation function σ. The number of input and output nodes is determined from the data. Weights are initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.\n\n\n\n\n\n","category":"type"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.Short","category":"page"},{"location":"interface/Builders/#MLJFlux.Short","page":"Builders","title":"MLJFlux.Short","text":"Short(; n_hidden=0, dropout=0.5, σ=Flux.sigmoid)\n\nMLJFlux builder that constructs a full-connected three-layer network using n_hidden nodes in the hidden layer and the specified dropout (defaulting to 0.5). An activation function σ is applied between the hidden and final layers. If n_hidden=0 (the default) then n_hidden is the geometric mean of the number of input and output nodes. The number of input and output nodes is determined from the data.\n\nEach layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.\n\n\n\n\n\n","category":"type"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.MLP","category":"page"},{"location":"interface/Builders/#MLJFlux.MLP","page":"Builders","title":"MLJFlux.MLP","text":"MLP(; hidden=(100,), σ=Flux.relu)\n\nMLJFlux builder that constructs a Multi-layer perceptron network. The ith element of hidden represents the number of neurons in the ith hidden layer. An activation function σ is applied between each layer.\n\nEach layer is initialized using Flux.glorot_uniform(rng), where rng is inferred from the rng field of the MLJFlux model.\n\n\n\n\n\n","category":"type"},{"location":"interface/Builders/","page":"Builders","title":"Builders","text":"MLJFlux.@builder","category":"page"},{"location":"interface/Builders/#MLJFlux.@builder","page":"Builders","title":"MLJFlux.@builder","text":"@builder neural_net\n\nCreates a builder for neural_net. 
The variables rng, n_in, n_out and n_channels can be used to create builders for any random number generator rng, input and output sizes n_in and n_out and number of input channels n_channels.\n\nExamples\n\njulia> import MLJFlux: @builder;\n\njulia> nn = NeuralNetworkRegressor(builder = @builder(Chain(Dense(n_in, 64, relu),\n Dense(64, 32, relu),\n Dense(32, n_out))));\n\njulia> conv_builder = @builder begin\n front = Chain(Conv((3, 3), n_channels => 16), Flux.flatten)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n Chain(front, Dense(d, n_out));\n end\n\njulia> conv_nn = NeuralNetworkRegressor(builder = conv_builder);\n\n\n\n\n\n","category":"macro"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Neural-Architecture-Search-with-MLJFlux","page":"Neural Architecture Search","title":"Neural Architecture Search with MLJFlux","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"This demonstration is available as a Jupyter notebook or julia script here.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Neural Architecture Search (NAS) is an instance of hyperparameter tuning concerned with tuning model hyperparameters defining the architecture itself. Although it's typically performed with sophisticated search algorithms for efficiency, in this example we will be using a simple random search.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Basic-Imports","page":"Neural Architecture Search","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"using MLJ # Has MLJFlux models\nusing Flux # For more flexibility\nusing RDatasets: RDatasets # Dataset source\nusing DataFrames # To view tuning results in a table\nimport Optimisers # native Flux.jl optimisers no longer supported","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Loading-and-Splitting-the-Data","page":"Neural Architecture Search","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng = 123);\nX = Float32.(X); # To be compatible with type of network network parameters\nfirst(X, 5)","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Instantiating-the-model","page":"Neural Architecture Search","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Now let's construct our model. 
This follows a similar setup to the one followed in the Quick Start.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg = \"MLJFlux\"\nclf = NeuralNetworkClassifier(\n builder = MLJFlux.MLP(; hidden = (1, 1, 1), σ = Flux.relu),\n optimiser = Optimisers.Adam(0.01),\n batch_size = 8,\n epochs = 10,\n rng = 42,\n)","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Generating-Network-Architectures","page":"Neural Architecture Search","title":"Generating Network Architectures","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"We know that the MLP builder takes a tuple of the form (z_1, z_2, …, z_k) to define a network with k hidden layers, where the ith layer has z_i neurons. We will proceed by defining a function that can generate all possible networks with a specific number of hidden layers, a minimum and maximum number of neurons per layer and increments to consider for the number of neurons.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"function generate_networks(\n ;min_neurons::Int,\n max_neurons::Int,\n neuron_step::Int,\n num_layers::Int,\n )\n # Define the range of neurons\n neuron_range = min_neurons:neuron_step:max_neurons\n\n # Empty list to store the network configurations\n networks = Vector{Tuple{Vararg{Int, num_layers}}}()\n\n # Recursive helper function to generate all combinations of tuples\n function generate_tuple(current_layers, remaining_layers)\n if remaining_layers > 0\n for n in neuron_range\n # current_layers =[] then current_layers=[(min_neurons)],\n # [(min_neurons+neuron_step)], [(min_neurons+2*neuron_step)],...\n # for each of these we call generate_tuple again which appends\n # the n combinations for each one of them\n generate_tuple(vcat(current_layers, [n]), remaining_layers - 1)\n end\n else\n # in the base case, no more layers to \"recurse on\"\n # and we just append the current_layers as a tuple\n push!(networks, tuple(current_layers...))\n end\n end\n\n # Generate networks for the given number of layers\n generate_tuple([], num_layers)\n\n return networks\nend","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Now let's generate an array of all possible neural networks with three hidden layers and number of neurons per layer ∈ [1,64] with a step of 4.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"networks_space =\n generate_networks(\n min_neurons = 1,\n max_neurons = 64,\n neuron_step = 4,\n num_layers = 3,\n )\n\nnetworks_space[1:5]","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Wrapping-the-Model-for-Tuning","page":"Neural Architecture Search","title":"Wrapping the Model for Tuning","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Let's use this array to define the range of hyperparameters and pass it along with the model to the TunedModel 
constructor.","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"r1 = range(clf, :(builder.hidden), values = networks_space)\n\ntuned_clf = TunedModel(\n model = clf,\n tuning = RandomSearch(),\n resampling = CV(nfolds = 4, rng = 42),\n range = [r1],\n measure = cross_entropy,\n n = 100, # searching over 100 random samples are enough\n);\nnothing #hide","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Performing-the-Search","page":"Neural Architecture Search","title":"Performing the Search","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Similar to the last workflow example, all we need now is to fit our model and the search will take place automatically:","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"mach = machine(tuned_clf, X, y);\nfit!(mach, verbosity = 0);\nfitted_params(mach).best_model","category":"page"},{"location":"common_workflows/architecture_search/notebook/#Analyzing-the-Search-Results","page":"Neural Architecture Search","title":"Analyzing the Search Results","text":"","category":"section"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"Let's analyze the search results by converting the history array to a dataframe and viewing it:","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"history = report(mach).history\nhistory_df = DataFrame(\n mlp = [x[:model].builder for x in history],\n measurement = [x[:measurement][1] for x in history],\n)\nfirst(sort!(history_df, [order(:measurement)]), 10)","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"","category":"page"},{"location":"common_workflows/architecture_search/notebook/","page":"Neural Architecture Search","title":"Neural Architecture Search","text":"This page was generated using Literate.jl.","category":"page"},{"location":"interface/Image Classification/","page":"Image Classification","title":"Image Classification","text":"MLJFlux.ImageClassifier","category":"page"},{"location":"interface/Image Classification/#MLJFlux.ImageClassifier","page":"Image Classification","title":"MLJFlux.ImageClassifier","text":"ImageClassifier\n\nA model type for constructing a image classifier, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nImageClassifier = @load ImageClassifier pkg=MLJFlux\n\nDo model = ImageClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ImageClassifier(builder=...).\n\nImageClassifier classifies images using a neural network adapted to the type of images provided (color or gray scale). Predictions are probabilistic. Users provide a recipe for constructing the network, based on properties of the image encountered, by specifying an appropriate builder. 
See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is any AbstractVector of images with ColorImage or GrayImage scitype; check the scitype with scitype(X) and refer to ScientificTypes.jl documentation on coercing typical image formats into an appropriate type.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass; check the scitype with scitype(y).\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder: An MLJFlux builder that constructs the neural network. The fallback builds a depth-16 VGG architecture adapted to the image size and number of target classes, with no batch normalization; see the Metalhead.jl documentation for details. See the example below for a user-specified builder. A convenience macro @builder is also available. See also finaliser below.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently MLJ measures are not supported as values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For Training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). 
Defaults to Flux.softmax.\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we use MLJFlux and a custom builder to classify the MNIST image dataset.\n\nusing MLJ\nusing Flux\nimport MLJFlux\nimport Optimisers\nimport MLJIteration # for `skip` control\n\nFirst we want to download the MNIST dataset, and unpack into images and labels:\n\nimport MLDatasets: MNIST\ndata = MNIST(split=:train)\nimages, labels = data.features, data.targets\n\nIn MLJ, integers cannot be used for encoding categorical data, so we must coerce them into the Multiclass scitype:\n\nlabels = coerce(labels, Multiclass);\n\nAbove images is a single array but MLJFlux requires the images to be a vector of individual image arrays:\n\nimages = coerce(images, GrayImage);\nimages[1]\n\nWe start by defining a suitable builder object. This is a recipe for building the neural network. Our builder will work for images of any (constant) size, whether they be color or black and white (ie, single or multi-channel). The architecture always consists of six alternating convolution and max-pool layers, and a final dense layer; the filter size and the number of channels after each convolution layer is customizable.\n\nimport MLJFlux\n\nstruct MyConvBuilder\n filter_size::Int\n channels1::Int\n channels2::Int\n channels3::Int\nend\n\nmake2d(x::AbstractArray) = reshape(x, :, size(x)[end])\n\nfunction MLJFlux.build(b::MyConvBuilder, rng, n_in, n_out, n_channels)\n k, c1, c2, c3 = b.filter_size, b.channels1, b.channels2, b.channels3\n mod(k, 2) == 1 || error(\"`filter_size` must be odd. \")\n p = div(k - 1, 2) # padding to preserve image size\n init = Flux.glorot_uniform(rng)\n front = Chain(\n Conv((k, k), n_channels => c1, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c1 => c2, pad=(p, p), relu, init=init),\n MaxPool((2, 2)),\n Conv((k, k), c2 => c3, pad=(p, p), relu, init=init),\n MaxPool((2 ,2)),\n make2d)\n d = Flux.outputsize(front, (n_in..., n_channels, 1)) |> first\n return Chain(front, Dense(d, n_out, init=init))\nend\n\nIt is important to note that in our build function, there is no final softmax. This is applied by default in all MLJFlux classifiers (override this using the finaliser hyperparameter).\n\nNow that our builder is defined, we can instantiate the actual MLJFlux model. If you have a GPU, you can substitute in acceleration=CUDALibs() below to speed up training.\n\nImageClassifier = @load ImageClassifier pkg=MLJFlux\nclf = ImageClassifier(builder=MyConvBuilder(3, 16, 32, 32),\n batch_size=50,\n epochs=10,\n rng=123)\n\nYou can add Flux options such as optimiser and loss in the snippet above. 
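For example, the constructor call above could be extended as follows (a hedged sketch reusing the names already defined in this example; the particular learning rate is illustrative only and not part of the original text):\n\nclf = ImageClassifier(\n    builder=MyConvBuilder(3, 16, 32, 32),\n    batch_size=50,\n    epochs=10,\n    rng=123,\n    optimiser=Optimisers.Adam(0.001),  # assumed illustrative learning rate\n    loss=Flux.crossentropy,            # the documented default, made explicit\n)\n\n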
Currently, loss must be a flux-compatible loss, and not an MLJ measure.\n\nNext, we can bind the model with the data in a machine, and train using the first 500 images:\n\nmach = machine(clf, images, labels);\nfit!(mach, rows=1:500, verbosity=2);\nreport(mach)\nchain = fitted_params(mach)\nFlux.params(chain)[2]\n\nWe can tack on 20 more epochs by modifying the epochs field, and iteratively fit some more:\n\nclf.epochs = clf.epochs + 20\nfit!(mach, rows=1:500, verbosity=2);\n\nWe can also make predictions and calculate an out-of-sample loss estimate, using any MLJ measure (loss/score):\n\npredicted_labels = predict(mach, rows=501:1000);\ncross_entropy(predicted_labels, labels[501:1000])\n\nThe preceding fit!/predict/evaluate workflow can alternatively be executed as follows:\n\nevaluate!(mach,\n resampling=Holdout(fraction_train=0.5),\n measure=cross_entropy,\n rows=1:1000,\n verbosity=0)\n\nSee also NeuralNetworkClassifier.\n\n\n\n\n\n","category":"type"},{"location":"interface/Summary/#Models","page":"Summary","title":"Models","text":"","category":"section"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"MLJFlux provides the model types below, for use with input features X and targets y of the scientific type indicated in the table below. The parameters n_in, n_out and n_channels refer to information passed to the builder, as described under Defining Custom Builders.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Model Type Prediction type scitype(X) <: _ scitype(y) <: _\nNeuralNetworkRegressor Deterministic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns AbstractVector{<:Continuous} (n_out = 1)\nMultitargetNeuralNetworkRegressor Deterministic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns <: Table(Continuous) with n_out columns\nNeuralNetworkClassifier Probabilistic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns AbstractVector{<:Finite} with n_out classes\nNeuralNetworkBinaryClassifier Probabilistic AbstractMatrix{Continuous} or Table(Continuous) with n_in columns AbstractVector{<:Finite{2}} (but n_out = 1)\nImageClassifier Probabilistic AbstractVector{<:Image{W,H}} with n_in = (W, H) AbstractVector{<:Finite} with n_out classes","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
What exactly is a \"model\"?","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"In MLJ, a model is a mutable struct storing hyper-parameters for some learning algorithm indicated by the model name, and that's all. In particular, an MLJ model does not store learned parameters.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"warning: Difference in Definition\nIn Flux the term \"model\" has another meaning. However, as all Flux \"models\" used in MLJFlux are Flux.Chain objects, we call them chains, and restrict use of \"model\" to models in the MLJ sense.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
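To illustrate (a minimal sketch, assuming NeuralNetworkClassifier has been loaded with @load as elsewhere in this manual, and that X and y are suitable training data):\n\nclf = NeuralNetworkClassifier(epochs=20) # an MLJ model: a struct of hyper-parameters, nothing more\nclf.epochs # 20\nmach = machine(clf, X, y) |> fit! # learned parameters live in the machine, not the model\nfitted_params(mach).chain # the trained Flux chain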
","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
Are observations rows or columns?","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"In MLJ the convention for two-dimensional data (tables and matrices) is rows=observations. For matrices Flux has the opposite convention. If your data is a matrix whose columns index the observations, the most efficient solution is to present the adjoint or transpose of that matrix to MLJFlux models; alternatively, permute the dimensions once with permutedims and present the result. If rows already index the observations, the matrix can be used as is.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
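For instance (a sketch, assuming model is some MLJFlux model and y is a compatible target):\n\nXmat = rand(Float32, 5, 100) # 5 features, 100 observations as columns (Flux convention)\nmach = machine(model, Xmat', y) # present the adjoint, so that rows index the observations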
","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Instructions for coercing common image formats into some AbstractVector{<:Image} are here.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
Fitting and warm restarts","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"MLJ machines cache state enabling the \"warm restart\" of model training, as demonstrated in the incremental training example. In the case of MLJFlux models, fit!(mach) will use a warm restart if:","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"only model.epochs has changed since the last call; or\nonly model.epochs or model.optimiser have changed since the last call and model.optimiser_changes_trigger_retraining == false (the default) (the \"state\" part of the optimiser is ignored in this comparison). This allows one to dynamically modify learning rates, for example.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Here model=mach.model is the associated MLJ model.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"The warm restart feature makes it possible to externally control iteration. See, for example, Early Stopping with MLJFlux and Using MLJ to classify the MNIST image dataset.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
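For example (a sketch, assuming mach = machine(model, X, y) has already been fitted once):\n\nmodel.epochs = model.epochs + 10 # only `epochs` has changed\nfit!(mach) # warm restart: trains 10 further epochs\n\nmodel.optimiser = Optimisers.Adam(0.002) # change the learning rate\nfit!(mach) # still a warm restart, since optimiser_changes_trigger_retraining is false by default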
","category":"page"},{"location":"interface/Summary/#Model-Hyperparameters.","page":"Summary","title":"Model Hyperparameters.","text":"","category":"section"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"All models share the following hyper-parameters. See individual model docstrings for a full list.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Hyper-parameter Description Default\nbuilder Default builder for models. MLJFlux.Linear(σ=Flux.relu) (regressors) or MLJFlux.Short(n_hidden=0, dropout=0.5, σ=Flux.σ) (classifiers)\noptimiser The optimiser to use for training. Optimiser.Adam()\nloss The loss function used for training. Flux.mse (regressors) and Flux.crossentropy (classifiers)\nn_epochs Number of epochs to train for. 10\nbatch_size The batch size for the data. 1\nlambda The regularization strength. Range = [0, ∞). 0\nalpha The L2/L1 mix of regularization. Range = [0, 1]. 0\nrng The random number generator (RNG) passed to builders, for weight initialization, for example. Can be any AbstractRNG or the seed (integer) for a Xoshirio that is reset on every cold restart of model (machine) training. GLOBAL_RNG\nacceleration Use CUDALibs() for training on GPU; default is CPU1(). CPU1()\noptimiser_changes_trigger_retraining True if fitting an associated machine should trigger retraining from scratch whenever the optimiser changes. false","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"The classifiers have an additional hyperparameter finaliser (default is Flux.softmax, or Flux.σ in the binary case) which is the operation applied to the unnormalized output of the final layer to obtain probabilities (outputs summing to one). It should return a vector of the same length as its input.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"note: Loss Functions\nCurrently, the loss function specified by loss=... is applied internally by Flux and needs to conform to the Flux API. You cannot, for example, supply one of MLJ's probabilistic loss functions, such as MLJ.cross_entropy to one of the classifier constructors.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"That said, you can only use MLJ loss functions or metrics in evaluation meta-algorithms (such as cross validation) and they will work even if the underlying model comes from MLJFlux.","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
More on accelerated training with GPUs","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"As in the table, when instantiating a model for training on a GPU, specify acceleration=CUDALibs(), as in","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"using MLJ\nImageClassifier = @load ImageClassifier\nmodel = ImageClassifier(epochs=10, acceleration=CUDALibs())\nmach = machine(model, X, y) |> fit!","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"In this example, the data X, y is copied onto the GPU under the hood on the call to fit! and cached for use in any warm restart (see above). The Flux chain used in training is always copied back to the CPU at the conclusion of fit!, and made available as fitted_params(mach).","category":"page"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"
","category":"page"},{"location":"interface/Summary/#Builders","page":"Summary","title":"Builders","text":"","category":"section"},{"location":"interface/Summary/","page":"Summary","title":"Summary","text":"Builder Description\nMLJFlux.MLP(hidden=(10,)) General multi-layer perceptron\nMLJFlux.Short(n_hidden=0, dropout=0.5, σ=sigmoid) Fully connected network with one hidden layer and dropout\nMLJFlux.Linear(σ=relu) Vanilla linear network with no hidden layers and activation function σ\nMLJFlux.@builder Macro for customized builders\n ","category":"page"},{"location":"common_workflows/incremental_training/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/incremental_training/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Juptyer notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/incremental_training/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/incremental_training/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"interface/Classification/","page":"Classification","title":"Classification","text":"MLJFlux.NeuralNetworkClassifier\nMLJFlux.NeuralNetworkBinaryClassifier","category":"page"},{"location":"interface/Classification/#MLJFlux.NeuralNetworkClassifier","page":"Classification","title":"MLJFlux.NeuralNetworkClassifier","text":"NeuralNetworkClassifier\n\nA model type for constructing a neural network classifier, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nDo model = NeuralNetworkClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkClassifier(builder=...).\n\nNeuralNetworkClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a Multiclass or OrderedFactor target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass or OrderedFactor; check the scitype with scitype(y)\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. See MLJFlux.jl documentation for examples of user-defined builders. 
See also finaliser below.\noptimiser::Optimisers.Adam(): An Optimisers.jl optimiser. The optimiser performs the updating of the weights of the network. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.crossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.crossentropy: Standard multiclass classification loss, also known as the log loss.\nFlux.logitcrossentropy: Mathematically equal to crossentropy, but numerically more stable than finalising the outputs with softmax and then calculating crossentropy. You will need to specify finaliser=identity to remove MLJFlux's default softmax finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently, MLJ measures are not supported as values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞). Note the history reports unpenalized losses.\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training. The default is Random.default_rng().\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().\nfinaliser=Flux.softmax: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.softmax.\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we build a classification model using the Iris dataset. 
This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.\n\nusing MLJ\nusing Flux\nimport RDatasets\nimport Optimisers\n\nFirst, we can load the data:\n\niris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123); # a vector and a table\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\nclf = NeuralNetworkClassifier()\n\nNext, we can train the model:\n\nmach = machine(clf, X, y)\nfit!(mach)\n\nWe can train the model in an incremental fashion, altering the learning rate as we go, provided optimizer_changes_trigger_retraining is false (the default). Here, we also change the number of (total) iterations:\n\nclf.optimiser = Optimisers.Adam(clf.optimiser.eta * 2)\nclf.epochs = clf.epochs + 5\n\nfit!(mach, verbosity=2) # trains 5 more epochs\n\nWe can inspect the mean training loss using the cross_entropy function:\n\ntraining_loss = cross_entropy(predict(mach, X), y)\n\nAnd we can access the Flux chain (model) using fitted_params:\n\nchain = fitted_params(mach).chain\n\nFinally, we can see how the out-of-sample performance changes over time, using MLJ's learning_curve function:\n\nr = range(clf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(clf, X, y,\n range=r,\n resampling=Holdout(fraction_train=0.7),\n measure=cross_entropy)\nusing Plots\nplot(curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross Entropy\")\n\n\nSee also ImageClassifier, NeuralNetworkBinaryClassifier.\n\n\n\n\n\n","category":"type"},{"location":"interface/Classification/#MLJFlux.NeuralNetworkBinaryClassifier","page":"Classification","title":"MLJFlux.NeuralNetworkBinaryClassifier","text":"NeuralNetworkBinaryClassifier\n\nA model type for constructing a neural network binary classifier, based on MLJFlux.jl, and implementing the MLJ model interface.\n\nFrom MLJ, the type can be imported using\n\nNeuralNetworkBinaryClassifier = @load NeuralNetworkBinaryClassifier pkg=MLJFlux\n\nDo model = NeuralNetworkBinaryClassifier() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in NeuralNetworkBinaryClassifier(builder=...).\n\nNeuralNetworkBinaryClassifier is for training a data-dependent Flux.jl neural network for making probabilistic predictions of a binary (Multiclass{2} or OrderedFactor{2}) target, given a table of Continuous features. Users provide a recipe for constructing the network, based on properties of the data that is encountered, by specifying an appropriate builder. See MLJFlux documentation for more on builders.\n\nTraining data\n\nIn MLJ or MLJBase, bind an instance model to data with\n\nmach = machine(model, X, y)\n\nHere:\n\nX is either a Matrix or any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). If X is a Matrix, it is assumed to have columns corresponding to features and rows corresponding to observations.\ny is the target, which can be any AbstractVector whose element scitype is Multiclass{2} or OrderedFactor{2}; check the scitype with scitype(y)\n\nTrain the machine with fit!(mach, rows=...).\n\nHyper-parameters\n\nbuilder=MLJFlux.Short(): An MLJFlux builder that constructs a neural network. Possible builders include: MLJFlux.Linear, MLJFlux.Short, and MLJFlux.MLP. 
See MLJFlux.jl documentation for examples of user-defined builders. See also finaliser below.\noptimiser::Flux.Adam(): A Flux.Optimise optimiser. The optimiser performs the updating of the weights of the network. For further reference, see the Flux optimiser documentation. To choose a learning rate (the update rate of the optimizer), a good rule of thumb is to start out at 10e-3, and tune using powers of 10 between 1 and 1e-7.\nloss=Flux.binarycrossentropy: The loss function which the network will optimize. Should be a function which can be called in the form loss(yhat, y). Possible loss functions are listed in the Flux loss function documentation. For a classification task, the most natural loss functions are:\nFlux.binarycrossentropy: Standard binary classification loss, also known as the log loss.\nFlux.logitbinarycrossentropy: Mathematically equal to binarycrossentropy, but numerically more stable than finalising the outputs with σ and then calculating binarycrossentropy. You will need to specify finaliser=identity to remove MLJFlux's default sigmoid finaliser, and understand that the output of predict is then unnormalized (no longer probabilistic).\nFlux.tversky_loss: Used with imbalanced data to give more weight to false negatives.\nFlux.binary_focal_loss: Used with highly imbalanced data. Weights harder examples more than easier examples.\nCurrently, MLJ measures are not supported as values of loss.\nepochs::Int=10: The duration of training, in epochs. Typically, one epoch represents one pass through the complete training dataset.\nbatch_size::Int=1: the batch size to be used for training, representing the number of samples per update of the network weights. Typically, batch size is between 8 and 512. Increasing batch size may accelerate training if acceleration=CUDALibs() and a GPU is available.\nlambda::Float64=0: The strength of the weight regularization penalty. Can be any value in the range [0, ∞).\nalpha::Float64=0: The L2/L1 mix of regularization, in the range [0, 1]. A value of 0 represents L2 regularization, and a value of 1 represents L1 regularization.\nrng::Union{AbstractRNG, Int64}: The random number generator or seed used during training.\noptimizer_changes_trigger_retraining::Bool=false: Defines what happens when re-fitting a machine if the associated optimiser has changed. If true, the associated machine will retrain from scratch on fit! call, otherwise it will not.\nacceleration::AbstractResource=CPU1(): Defines on what hardware training is done. For training on GPU, use CUDALibs().\nfinaliser=Flux.σ: The final activation function of the neural network (applied after the network defined by builder). Defaults to Flux.σ.\n\nOperations\n\npredict(mach, Xnew): return predictions of the target given new features Xnew, which should have the same scitype as X above. Predictions are probabilistic but uncalibrated.\npredict_mode(mach, Xnew): Return the modes of the probabilistic predictions returned above.\n\nFitted parameters\n\nThe fields of fitted_params(mach) are:\n\nchain: The trained \"chain\" (Flux.jl model), namely the series of layers, functions, and activations which make up the neural network. This includes the final layer specified by finaliser (eg, softmax).\n\nReport\n\nThe fields of report(mach) are:\n\ntraining_losses: A vector of training losses (penalised if lambda != 0) in historical order, of length epochs + 1. The first element is the pre-training loss.\n\nExamples\n\nIn this example we build a classification model using the Iris dataset. 
This is a very basic example, using a default builder and no standardization. For a more advanced illustration, see NeuralNetworkRegressor or ImageClassifier, and examples in the MLJFlux.jl documentation.\n\nusing MLJ, Flux\nimport Optimisers\nimport RDatasets\n\nFirst, we can load the data:\n\nmtcars = RDatasets.dataset(\"datasets\", \"mtcars\");\ny, X = unpack(mtcars, ==(:VS), in([:MPG, :Cyl, :Disp, :HP, :WT, :QSec]));\n\nNote that y is a vector and X a table.\n\ny = categorical(y) # classifier takes categorical input\nX_f32 = Float32.(X) # To match floating point type of the neural network layers\nNeuralNetworkBinaryClassifier = @load NeuralNetworkBinaryClassifier pkg=MLJFlux\nbclf = NeuralNetworkBinaryClassifier()\n\nNext, we can train the model:\n\nmach = machine(bclf, X_f32, y)\nfit!(mach)\n\nWe can train the model in an incremental fashion, altering the learning rate as we go, provided optimizer_changes_trigger_retraining is false (the default). Here, we also change the number of (total) iterations:\n\njulia> bclf.optimiser\nAdam(0.001, (0.9, 0.999), 1.0e-8)\n\nbclf.optimiser = Optimisers.Adam(eta = bclf.optimiser.eta * 2)\nbclf.epochs = bclf.epochs + 5\n\nfit!(mach, verbosity=2) # trains 5 more epochs\n\nWe can inspect the mean training loss using the cross_entropy function:\n\ntraining_loss = cross_entropy(predict(mach, X_f32), y)\n\nAnd we can access the Flux chain (model) using fitted_params:\n\nchain = fitted_params(mach).chain\n\nFinally, we can see how the out-of-sample performance changes over time, using MLJ's learning_curve function:\n\nr = range(bclf, :epochs, lower=1, upper=200, scale=:log10)\ncurve = learning_curve(\n bclf,\n X_f32,\n y,\n range=r,\n resampling=Holdout(fraction_train=0.7),\n measure=cross_entropy,\n)\nusing Plots\nplot(\n curve.parameter_values,\n curve.measurements,\n xlab=curve.parameter_name,\n xscale=curve.parameter_scale,\n ylab = \"Cross Entropy\",\n)\n\n\nSee also ImageClassifier.\n\n\n\n\n\n","category":"type"},{"location":"common_workflows/architecture_search/README/#Contents","page":"Contents","title":"Contents","text":"","category":"section"},{"location":"common_workflows/architecture_search/README/","page":"Contents","title":"Contents","text":"file description\nnotebook.ipynb Jupyter notebook (executed)\nnotebook.unexecuted.ipynb Jupyter notebook (unexecuted)\nnotebook.md static markdown (included in MLJFlux.jl docs)\nnotebook.jl executable Julia script annotated with comments\ngenerate.jl maintainers only: execute to generate first 3 from 4th","category":"page"},{"location":"common_workflows/architecture_search/README/#Important","page":"Contents","title":"Important","text":"","category":"section"},{"location":"common_workflows/architecture_search/README/","page":"Contents","title":"Contents","text":"Scripts or notebooks in this folder cannot be reliably executed without the accompanying Manifest.toml and Project.toml files.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"EditURL = \"notebook.jl\"","category":"page"},{"location":"common_workflows/live_training/notebook/#Live-Training-with-MLJFlux","page":"Live Training","title":"Live Training with MLJFlux","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"This demonstration is available as a Jupyter notebook or Julia script here.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live 
Training","title":"Live Training","text":"Julia version is assumed to be 1.10.*","category":"page"},{"location":"common_workflows/live_training/notebook/#Basic-Imports","page":"Live Training","title":"Basic Imports","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"using MLJ\nusing Flux\nimport RDatasets\nimport Optimisers","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"using Plots","category":"page"},{"location":"common_workflows/live_training/notebook/#Loading-and-Splitting-the-Data","page":"Live Training","title":"Loading and Splitting the Data","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"iris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), rng=123);\nX = Float32.(X); # To be compatible with the type of the network parameters\nnothing #hide","category":"page"},{"location":"common_workflows/live_training/notebook/#Instantiating-the-model","page":"Live Training","title":"Instantiating the model","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"Now let's construct our model. This follows a setup similar to the one in the Quick Start.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux\n\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=50,\n rng=42,\n)","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"Now let's wrap this in an iterated model. 
We will use a callback that makes a plot for validation losses each iteration.","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"stop_conditions = [\n Step(1), # Repeatedly train for one iteration\n NumberLimit(100), # Don't train for more than 100 iterations\n]\n\nvalidation_losses = []\ngr(reuse=true) # use the same window for plots\nfunction plot_loss(loss)\n push!(validation_losses, loss)\n display(plot(validation_losses, label=\"validation loss\", xlim=(1, 100)))\n sleep(.01) # to catch up with the plots while they are being generated\nend\n\ncallbacks = [ WithLossDo(plot_loss),]\n\niterated_model = IteratedModel(\n model=clf,\n resampling=Holdout(),\n measures=log_loss,\n iteration_parameter=:(epochs),\n controls=vcat(stop_conditions, callbacks),\n retrain=true,\n)","category":"page"},{"location":"common_workflows/live_training/notebook/#Live-Training","page":"Live Training","title":"Live Training","text":"","category":"section"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"Simply fitting the model is all we need","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"mach = machine(iterated_model, X, y)\nfit!(mach, force=true)","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"","category":"page"},{"location":"common_workflows/live_training/notebook/","page":"Live Training","title":"Live Training","text":"This page was generated using Literate.jl.","category":"page"},{"location":"#MLJFlux.jl","page":"Introduction","title":"MLJFlux.jl","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"A Julia package integrating deep learning Flux models with MLJ.","category":"page"},{"location":"#Objectives","page":"Introduction","title":"Objectives","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"Provide a user-friendly and high-level interface to fundamental Flux deep learning models while still being extensible by supporting custom models written with Flux\nMake building deep learning models more convenient to users already familiar with the MLJ workflow\nMake it easier to apply machine learning techniques provided by MLJ, including: out-of-sample performance evaluation, hyper-parameter optimization, iteration control, and more, to deep learning models","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"note: MLJFlux Scope\nMLJFlux support is focused on fundamental deep learning models for common supervised learning tasks. Sophisticated architectures and approaches, such as online learning, reinforcement learning, and adversarial networks, are currently outside its scope. Also, MLJFlux is limited to tasks where all (batches of) training data fits into memory.","category":"page"},{"location":"#Installation","page":"Introduction","title":"Installation","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"import Pkg\nPkg.activate(\"my_environment\", shared=true)\nPkg.add([\"MLJ\", \"MLJFlux\", \"Optimisers\", \"Flux\"])","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"You only need Flux if you need to build a custom architecture, or experiment with different loss or activation functions. 
Since MLJFlux 0.5, you must use optimisers from Optimisers.jl, as native Flux.jl optimisers are no longer supported. ","category":"page"},{"location":"#Quick-Start","page":"Introduction","title":"Quick Start","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"For the following demo, you will need to additionally run Pkg.add(\"RDatasets\").","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"using MLJ, Flux, MLJFlux\nimport RDatasets\nimport Optimisers\n\n# 1. Load Data\niris = RDatasets.dataset(\"datasets\", \"iris\");\ny, X = unpack(iris, ==(:Species), colname -> true, rng=123);\n\n# 2. Load and instantiate model\nNeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=\"MLJFlux\"\nclf = NeuralNetworkClassifier(\n builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),\n optimiser=Optimisers.Adam(0.01),\n batch_size=8,\n epochs=100, \n acceleration=CUDALibs() # For GPU support\n )\n\n# 3. Wrap it in a machine \nmach = machine(clf, X, y)\n\n# 4. Evaluate the model\ncv=CV(nfolds=5)\nevaluate!(mach, resampling=cv, measure=accuracy) ","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"As you can see, we are able to use MLJ meta-functionality (i.e., cross-validation) with a Flux deep learning model. All arguments provided have defaults.","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Notice that we are also able to define the neural network in a high-level fashion by only specifying the number of neurons in each hidden layer and the activation function. Meanwhile, MLJFlux is able to infer the input and output layers as well as use a suitable default for the loss function and output activation given the classification task. Notice as well that we did not need to manually implement a training or prediction loop.","category":"page"},{"location":"#Basic-idea:-\"builders\"-for-data-dependent-architecture","page":"Introduction","title":"Basic idea: \"builders\" for data-dependent architecture","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"As in the example above, any MLJFlux model has a builder hyperparameter, an object encoding instructions for creating a neural network given the data that the model eventually sees (e.g., the number of classes in a classification problem). While each MLJFlux model has a simple default builder, users may need to define custom builders to get optimal results (see Defining Custom Builders), and this will require familiarity with the Flux API for defining a neural network chain.","category":"page"},{"location":"#Flux-or-MLJFlux?","page":"Introduction","title":"Flux or MLJFlux?","text":"","category":"section"},{"location":"","page":"Introduction","title":"Introduction","text":"Flux is a deep learning framework in Julia that comes with everything you need to build deep learning models (i.e., GPU support, automatic differentiation, layers, activations, losses, optimizers, etc.). MLJFlux wraps models built with Flux, providing a higher-level interface for building and training such models. 
More importantly, it empowers Flux models by extending their support to many common machine learning workflows that are possible via MLJ, such as:","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Estimating performance of your model using a holdout set or other resampling strategy (e.g., cross-validation) as measured by one or more metrics (e.g., loss functions) that may not have been used in training\nOptimizing hyper-parameters such as a regularization parameter (e.g., dropout) or the width/height/nchannels of a convolution layer\nComposing with other models, such as introducing data pre-processing steps (e.g., missing data imputation) into a pipeline. It might make sense to include non-deep learning models in this pipeline. Other kinds of model composition could include blending predictions of a deep learner with some other kind of model (as in “model stacking”). Models composed with MLJ can also be tuned as a single unit.\nControlling iteration by adding an early stopping criterion based on an out-of-sample estimate of the loss, dynamically changing the learning rate (eg, cyclic learning rates), periodically saving snapshots of the model, or generating live plots of sample weights to judge training progress (as in TensorBoard)","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Comparing your model with non-deep-learning models","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"A comparable project, FastAI/FluxTraining, also provides a high-level interface for interacting with Flux models and supports a set of features that may overlap with (but not include all of) those supported by MLJFlux.","category":"page"},{"location":"","page":"Introduction","title":"Introduction","text":"Many of the features mentioned above are showcased in the workflow examples that you can access from the sidebar.","category":"page"},{"location":"interface/Custom Builders/#Defining-Custom-Builders","page":"Custom Builders","title":"Defining Custom Builders","text":"","category":"section"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"Following is an example defining a new builder for creating a simple fully-connected neural network with two hidden layers, with n1 nodes in the first hidden layer, and n2 nodes in the second, for use in any of the first three models in Table 1. 
The definition includes one mutable struct and one method:","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"mutable struct MyBuilder <: MLJFlux.Builder\n\tn1 :: Int\n\tn2 :: Int\nend\n\nfunction MLJFlux.build(nn::MyBuilder, rng, n_in, n_out)\n\tinit = Flux.glorot_uniform(rng)\n return Chain(\n Dense(n_in, nn.n1, init=init),\n Dense(nn.n1, nn.n2, init=init),\n Dense(nn.n2, n_out, init=init),\n )\nend","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"Note here that n_in and n_out depend on the size of the data (see Table 1).","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"For a concrete image classification example, see Using MLJ to classify the MNIST image dataset.","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"More generally, defining a new builder means defining a new struct sub-typing MLJFlux.Builder and defining a new MLJFlux.build method with one of these signatures:","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"MLJFlux.build(builder::MyBuilder, rng, n_in, n_out)\nMLJFlux.build(builder::MyBuilder, rng, n_in, n_out, n_channels) # for use with `ImageClassifier`","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"This method must return a Flux.Chain instance, chain, subject to the following conditions:","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"chain(x) must make sense:\nfor any x <: Array{<:AbstractFloat, 2} of size (n_in, batch_size) where batch_size is any integer (for all models except ImageClassifier); or\nfor any x <: Array{<:Float32, 4} of size (W, H, n_channels, batch_size), where (W, H) = n_in, n_channels is 1 or 3, and batch_size is any integer (for use with ImageClassifier)\nThe object returned by chain(x) must be an AbstractFloat vector of length n_out.","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"Alternatively, use MLJFlux.@builder(neural_net) to automatically create a builder for any valid Flux chain expression neural_net, where the symbols n_in, n_out, n_channels and rng can appear literally, with the interpretations explained above. For example,","category":"page"},{"location":"interface/Custom Builders/","page":"Custom Builders","title":"Custom Builders","text":"builder = MLJFlux.@builder Chain(Dense(n_in, 128), Dense(128, n_out, tanh))","category":"page"}] }