
expand on skl_nn.MLPClassifier() syntax #55

Open · wants to merge 1 commit into gh-pages
4 changes: 3 additions & 1 deletion _episodes/07-neural-networks.md
@@ -183,7 +183,9 @@ This is instead of writing a loop ourselves to divide every pixel by 255. Althou

Now we need to initialise a neural network. Scikit-Learn has a dedicated module for this (`sklearn.neural_network`), and its `MLPClassifier` class handles multi-layer perceptrons. This class takes a few parameters, including the size of the hidden layer, the maximum number of training iterations we're going to allow, the exact algorithm to use, whether or not we'd like verbose output about what the training is doing, and the initial state of the random number generator.

In this example we specify a multi-layer perceptron with 50 hidden nodes, we allow a maximum of 50 iterations to train it, we turn on verbose output to see what's happening, and initialise the random state to 1 so that we always get the same behaviour.
In scikit-learn's `MLPClassifier`, the `hidden_layer_sizes` parameter specifies the number and size of the hidden layers in the neural network. For example, `hidden_layer_sizes=(50,)` creates a single hidden layer with 50 neurons, while `(100, 50)` creates two hidden layers with 100 and 50 neurons, respectively. The trailing comma matters when there is a single hidden layer: in Python, `(50)` is simply the integer 50, not a one-element tuple, so `(50,)` is the form that makes the intent unambiguous (recent scikit-learn versions happen to tolerate a bare integer, but the documentation specifies an array-like such as a tuple, and the tuple form generalises naturally to multiple layers). The example `MLPClassifier(hidden_layer_sizes=(50,), max_iter=50, verbose=1, random_state=1)` builds a neural network with one hidden layer containing 50 neurons, trains for at most 50 iterations, logs training progress, and ensures reproducibility with `random_state=1`.
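
For instance, the two forms side by side (a minimal sketch; the variable names are just illustrative, and the other parameters are the ones described above):

~~~
import sklearn.neural_network as skl_nn

# One hidden layer of 50 neurons: the trailing comma makes (50,) a tuple
mlp_single = skl_nn.MLPClassifier(hidden_layer_sizes=(50,), max_iter=50,
                                  verbose=1, random_state=1)

# Two hidden layers, of 100 and 50 neurons respectively
mlp_double = skl_nn.MLPClassifier(hidden_layer_sizes=(100, 50), max_iter=50,
                                  verbose=1, random_state=1)
~~~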

The `max_iter` parameter in `MLPClassifier` caps the length of training, and for the stochastic solvers (`sgd` and `adam`) the scikit-learn documentation notes that it counts epochs, i.e. how many times each data point is used, rather than individual gradient steps. Within each epoch the solver processes the data in small random subsets (mini-batches), so a single epoch involves many weight updates.
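
As a rough back-of-the-envelope check (the 70,000-sample figure comes from the discussion below, and the `batch_size='auto'` rule from the scikit-learn documentation):

~~~
# Epochs vs. gradient steps under the 'auto' batch rule
n_samples = 70_000                     # dataset size mentioned in the thread below
batch_size = min(200, n_samples)       # the 'auto' rule -> 200
steps_per_epoch = (n_samples + batch_size - 1) // batch_size  # ceil(70000/200) = 350
total_updates = steps_per_epoch * 50   # with max_iter=50 epochs: 17,500 gradient steps
print(steps_per_epoch, total_updates)
~~~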
Member:

In this example is the data batched? Given its small size, aren't we reading all of it into memory and processing it with each iteration?

Collaborator (Author):

Yes, I think the data is batched here. The data has 70,000 samples, and here's the note on `MLPClassifier`'s batch size param: https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html

`batch_size`, default='auto'. When set to 'auto', `batch_size=min(200, n_samples)`.

It might actually be helpful to explicitly add a batch_size argument so the process is clearer for learners.
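
Something like the following, perhaps (a sketch of that suggestion, not a committed change; it assumes the episode's `skl_nn` import, and the value 200 simply mirrors what 'auto' would pick for this dataset):

~~~
# Hypothetical tweak: spell out the mini-batch size explicitly for learners
mlp = skl_nn.MLPClassifier(hidden_layer_sizes=(50,), max_iter=50,
                           batch_size=200,  # same value 'auto' would choose here
                           verbose=1, random_state=1)
~~~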


~~~
import sklearn.neural_network as skl_nn
~~~