Experiments with CMA-ES and neural networks
The branch https://github.com/beniz/libcmaes/tree/nn_ex (nn_ex) contains code for experimenting with artificial neural networks (ANN) and CMA-ES.
At the moment, there's built-in support for two public datasets from machine learning competitions:
- MNIST for handwritten digit recognition, http://www.kaggle.com/c/digit-recognizer
- Higgs boson challenge, https://www.kaggle.com/c/higgs-boson
The goal of these experiments is to assess the scaling of CMA-ES algorithms and the coupling of CMA-ES with gradient information (the so-called 'solution injection' procedure). The code for both experiments lies in the examples/ directory. From these results, it may be possible to research various improvements to CMA-ES and related algorithms, along with improvements for scaling beyond current capabilities.
At this stage, these experiments cannot be expected to be competitive with state-of-the-art solutions.
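For reference, solution injection in libcmaes works by passing a gradient function alongside the objective when calling cmaes(). The following is a minimal sketch on a toy sphere function, following the library's documented API; it is not the nn_mnist code, where back-propagation plays the role of the analytic gradient:

```cpp
// Minimal sketch: sep-active-CMA-ES with one injected gradient-based solution
// per step, on a toy sphere objective (not the nn_mnist code).
#include "cmaes.h"
#include <iostream>
#include <vector>

using namespace libcmaes;

// objective: f(x) = sum_i x_i^2
FitFunc fsphere = [](const double *x, const int N)
{
  double val = 0.0;
  for (int i = 0; i < N; i++)
    val += x[i] * x[i];
  return val;
};

// analytic gradient: df/dx_i = 2 x_i
GradFunc grad_fsphere = [](const double *x, const int N)
{
  dVec grad(N);
  for (int i = 0; i < N; i++)
    grad(i) = 2.0 * x[i];
  return grad;
};

int main()
{
  const int dim = 100;                       // toy dimension for illustration
  std::vector<double> x0(dim, 1.0);
  CMAParameters<> cmaparams(x0, 0.1 /*sigma*/);
  cmaparams.set_algo(sepaCMAES);             // separable active CMA-ES
  cmaparams.set_max_iter(200);
  CMASolutions cmasols = cmaes<>(fsphere, cmaparams,
                                 CMAStrategy<CovarianceUpdate>::_defaultPFunc,
                                 grad_fsphere); // gradient-based solution injection
  std::cout << "best f-value: " << cmasols.best_candidate().get_fvalue() << std::endl;
  return cmasols.run_status();
}
```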
In order to proceed with either experiment, here are the steps to follow:
git checkout nn_ex
./configure
make
cd examples/
Note: the experiments below are deliberately limited to a small number of optimization steps in order to yield fast, reproducible examples.
In order to learn and test over the MNIST dataset, you first need to download the data, either from the Kaggle website or from http://juban.free.fr/stuff/datasets/mnist/train.tar.gz, and unzip/untar it into the repository.
To train a single-layer ANN with 100 hidden neurons over 9000 examples, testing it on 1000:
./nn_mnist -fdata train.csv -n 10000 -maxsolveiter 200 --with_gradient -testp 10 -hlayer 100
This uses 200 steps of sep-active-CMA-ES in 79510-D (784×100 + 100 weights and biases for the hidden layer, plus 100×10 + 10 for the output layer) and injects one gradient-based solution at each step. The gradient is computed with standard back-propagation. That the back-propagation is bug-free can be checked with:
./nn_mnist -fdata train.csv -check_grad
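A gradient check of this kind typically compares the analytic back-propagation gradient against central finite differences of the loss. The sketch below illustrates the idea with generic f and grad_f placeholders; the names are illustrative and not the nn_mnist internals:

```cpp
// Generic finite-difference gradient check (illustrative, not the nn_mnist code):
// compares an analytic gradient grad_f against central differences of f.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <iostream>
#include <vector>

bool check_gradient(const std::function<double(const std::vector<double>&)> &f,
                    const std::function<std::vector<double>(const std::vector<double>&)> &grad_f,
                    std::vector<double> w,
                    const double eps = 1e-6, const double tol = 1e-4)
{
  const std::vector<double> g = grad_f(w);
  for (std::size_t i = 0; i < w.size(); ++i)
  {
    const double orig = w[i];
    w[i] = orig + eps; const double fp = f(w);
    w[i] = orig - eps; const double fm = f(w);
    w[i] = orig;
    const double num = (fp - fm) / (2.0 * eps);  // central difference estimate
    if (std::fabs(num - g[i]) > tol * std::max(1.0, std::fabs(num)))
    {
      std::cerr << "gradient mismatch at dimension " << i
                << ": numerical=" << num << " analytic=" << g[i] << std::endl;
      return false;
    }
  }
  return true;
}
```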
The training run should yield output similar to:
INFO - iter=200 / evals=7400 / f-value=0.879109 / sigma=0.0974141 / last_iter=13466
status: 9
cmat:
813 0 12 16 3 21 15 7 9 3
2 928 14 4 1 2 4 4 34 0
11 18 762 31 26 11 34 12 21 11
30 4 35 693 2 42 6 12 60 16
3 6 8 14 655 6 24 15 23 126
67 32 6 78 10 505 20 18 65 13
33 6 37 11 25 20 744 3 7 6
19 12 27 14 23 8 5 740 15 72
16 45 34 41 5 76 4 3 593 32
5 11 6 33 76 14 4 71 34 647
training set:
precision=0.783659 / recall=0.782891
accuracy=0.786667
f1=0.783275
cmat:
87 0 1 0 0 3 0 1 0 0
0 96 1 0 0 1 0 0 4 0
1 1 81 9 1 3 2 5 4 1
2 0 4 90 0 3 4 0 5 1
2 5 1 0 61 0 2 1 6 9
8 4 1 8 2 57 3 4 4 1
1 1 2 1 2 4 98 1 0 1
0 3 3 1 1 1 0 88 2 5
1 2 7 8 0 7 1 1 70 4
0 1 1 4 4 1 0 4 3 76
testing set:
precision=0.803087 / recall=0.801385
accuracy=0.804
f1=0.802235
Several other options are available, such as plotting the CMA-ES output, choosing between sigmoid and tanh (default) units, and modifying sigma and lambda; they are listed with:
./nn_mnist --help
For the Higgs boson challenge, get the data either from Kaggle or from http://juban.free.fr/stuff/datasets/higgs/training.zip and unzip it into the repository.
To train a single-layer ANN with 30 hidden neurons over 9000 examples, testing it on 1000:
./nn_higgs -fdata training.csv -n 10000 -maxsolveiter 500 -testp 10 -hlayer 30
This uses 500 steps of sep-active-CMA-ES to directly optimize the AMS (approximate median significance) objective function in 992-D (30×30 + 30 weights and biases for the hidden layer over the 30 input features, plus 30×2 + 2 for the output layer), without any gradient information. Since CMA-ES minimizes, the algorithm maximizes the AMS by minimizing its negation.
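For reference, the challenge defines AMS = sqrt(2 * ((s + b + b_r) * ln(1 + s / (b + b_r)) - s)) with a regularization term b_r = 10, where s and b are the weighted sums of selected signal and background events. A minimal sketch of using it as a minimization objective (negated), not the nn_higgs code itself:

```cpp
// Minimal sketch of the AMS metric from the Higgs boson challenge, negated so
// that a minimizer such as CMA-ES effectively maximizes it.
#include <cmath>

// s: weighted sum of selected signal events (true positives)
// b: weighted sum of selected background events (false positives)
double ams(const double s, const double b)
{
  const double breg = 10.0;  // regularization term from the challenge definition
  return std::sqrt(2.0 * ((s + b + breg) * std::log(1.0 + s / (b + breg)) - s));
}

// value handed to the optimizer: minimizing -AMS maximizes AMS
double neg_ams(const double s, const double b)
{
  return -ams(s, b);
}
```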
The training run should report something like:
**** training set:
precision=0.714672 / recall=0.705301 / accuracy=0.745444 / f1=0.709956
max ams=6.18563 / ams=0.400734
**** testing set:
precision=0.70864 / recall=0.703793 / accuracy=0.741 / f1=0.706208
max ams=0.775471 / ams=0.125018
For the run above, the sep-active-CMA-ES internals can be plotted. Note that the AMS is being maximized, so the f-value increases over time.
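Plotting in libcmaes is usually done by asking a run to dump per-iteration data (f-value, sigma, eigenvalues, ...) to a file via the library's set_fplot parameter, and then plotting that file with the scripts shipped in the repository. A short sketch on a toy objective; this shows the main library's mechanism, not a documented nn_higgs flag, and the output filename is illustrative:

```cpp
// Sketch: dump per-iteration CMA-ES data to a file for later plotting
// (toy quadratic objective; "higgs_run.dat" is an illustrative filename).
#include "cmaes.h"
#include <vector>

using namespace libcmaes;

int main()
{
  FitFunc fquad = [](const double *x, const int N)
  {
    double val = 0.0;
    for (int i = 0; i < N; i++)
      val += x[i] * x[i];
    return val;
  };
  std::vector<double> x0(30, 1.0);
  CMAParameters<> cmaparams(x0, 0.1 /*sigma*/);
  cmaparams.set_algo(sepaCMAES);
  cmaparams.set_fplot("higgs_run.dat"); // per-iteration data written here
  cmaes<>(fquad, cmaparams);
  return 0;
}
```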