
experiments with CMA-ES and neural networks


This is a set of ongoing experiments with ANNs and CMA-ES.

The branch https://github.com/beniz/libcmaes/tree/nn_ex (nn_ex) contains code for experimenting with artificial neural networks (ANN) and CMA-ES.

At the moment, there is built-in support for two public datasets from machine learning competitions: the MNIST handwritten digits and the Higgs Boson Machine Learning Challenge data, both from Kaggle and detailed below.

The goal of these experiments is to assess the scaling of CMA-ES algorithms, and the coupling of CMA-ES with gradient information (the so-called 'solution injection' procedure). The code for both experiments lives in the examples/ directory. From these results, it might be possible to research various improvements to CMA-ES and related algorithms, along with improvements for scaling beyond current capabilities.
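
As a point of reference for the 'solution injection' idea, here is a minimal sketch of gradient injection with libcmaes on the plain sphere function. It assumes the optional gradient argument that cmaes() accepts after the progress function and the sepaCMAES algorithm flag; the nn_ex networks and their back-propagation are not reproduced here, and the dimension is kept small for illustration.

```cpp
// Minimal sketch: sep-active-CMA-ES with an injected gradient-based solution,
// on the sphere function (a stand-in for an ANN objective).
#include "cmaes.h"
#include <vector>

using namespace libcmaes;

int main()
{
  const int dim = 100;                        // toy dimension, not 79510
  FitFunc fsphere = [](const double *x, const int N)
  {
    double val = 0.0;
    for (int i = 0; i < N; ++i)
      val += x[i] * x[i];
    return val;
  };
  // Analytical gradient of the sphere function; in the ANN experiments this
  // role is played by back-propagation.
  GradFunc grad_fsphere = [](const double *x, const int N)
  {
    dVec grad(N);
    for (int i = 0; i < N; ++i)
      grad(i) = 2.0 * x[i];
    return grad;
  };
  std::vector<double> x0(dim, 1.0);           // initial parameter vector
  CMAParameters<> cmaparams(x0, 0.1);         // sigma = 0.1
  cmaparams.set_algo(sepaCMAES);              // separable active CMA-ES
  cmaparams.set_max_iter(200);
  CMASolutions cmasols =
      cmaes<>(fsphere, cmaparams,
              CMAStrategy<CovarianceUpdate>::_defaultPFunc, grad_fsphere);
  return cmasols.run_status();
}
```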

At this stage, these experiments cannot be expected to be competitive with state-of-the-art solutions.

In order to proceed with either experiment, here are the steps to follow:

git checkout nn_ex
./configure
make
cd examples/

Note: the experiments below are deliberately limited to a small number of optimization steps in order to yield fast, reproducible examples.

MNIST

In order to learn and test over the MNIST dataset, you first need to download the data, either from the Kaggle website or from http://juban.free.fr/stuff/datasets/mnist/train.tar.gz, and unzip/untar it into the repository.

To train a single-layer ANN with 100 hidden neurons on 9000 examples, testing it on 1000:

./nn_mnist -fdata train.csv -n 10000 -maxsolveiter 200 --with_gradient -testp 10 -hlayer 100

This runs 200 steps of sep-active-CMA-ES in 79510-D (784×100 + 100 hidden-layer weights and biases plus 100×10 + 10 output-layer weights and biases for a 784-100-10 network over the 28×28 images) and injects one gradient-based solution at each step; a toy sketch of this flat parameterization is given after the sample output below. The gradient is computed with standard back-propagation. The back-propagation code can be checked for bugs with:

./nn_mnist -fdata train.csv -check_grad
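
The sketch below is not the nn_ex implementation; it only illustrates the usual finite-difference check that such an option performs, comparing an analytical gradient against a central-difference estimate on a placeholder objective.

```cpp
// Illustration of a finite-difference gradient check (not the nn_ex code):
// compares an analytical gradient against a central-difference estimate.
#include <cmath>
#include <functional>
#include <iostream>
#include <vector>

bool check_gradient(const std::function<double(const std::vector<double>&)> &f,
                    const std::function<std::vector<double>(const std::vector<double>&)> &grad,
                    std::vector<double> x,
                    double eps = 1e-5, double tol = 1e-6)
{
  const std::vector<double> g = grad(x);
  for (size_t i = 0; i < x.size(); ++i)
  {
    const double xi = x[i];
    x[i] = xi + eps; const double fp = f(x);
    x[i] = xi - eps; const double fm = f(x);
    x[i] = xi;
    const double g_num = (fp - fm) / (2.0 * eps);   // central difference
    if (std::fabs(g_num - g[i]) > tol * (1.0 + std::fabs(g_num)))
    {
      std::cerr << "gradient mismatch at dim " << i
                << ": analytical=" << g[i] << " numerical=" << g_num << std::endl;
      return false;
    }
  }
  return true;
}

int main()
{
  auto f = [](const std::vector<double> &x) {        // placeholder objective (sphere)
    double s = 0.0; for (double v : x) s += v * v; return s;
  };
  auto g = [](const std::vector<double> &x) {        // its analytical gradient
    std::vector<double> gr(x.size());
    for (size_t i = 0; i < x.size(); ++i) gr[i] = 2.0 * x[i];
    return gr;
  };
  std::cout << (check_gradient(f, g, {0.5, -1.2, 3.0}) ? "gradient ok" : "gradient wrong")
            << std::endl;
}
```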

The training run above should yield something similar to:

INFO - iter=200 / evals=7400 / f-value=0.879109 / sigma=0.0974141 / last_iter=13466
status: 9
cmat:
813   0  12  16   3  21  15   7   9   3
  2 928  14   4   1   2   4   4  34   0
 11  18 762  31  26  11  34  12  21  11
 30   4  35 693   2  42   6  12  60  16
  3   6   8  14 655   6  24  15  23 126
 67  32   6  78  10 505  20  18  65  13
 33   6  37  11  25  20 744   3   7   6
 19  12  27  14  23   8   5 740  15  72
 16  45  34  41   5  76   4   3 593  32
  5  11   6  33  76  14   4  71  34 647
training set:
precision=0.783659 / recall=0.782891
accuracy=0.786667
f1=0.783275
cmat:
87  0  1  0  0  3  0  1  0  0
 0 96  1  0  0  1  0  0  4  0
 1  1 81  9  1  3  2  5  4  1
 2  0  4 90  0  3  4  0  5  1
 2  5  1  0 61  0  2  1  6  9
 8  4  1  8  2 57  3  4  4  1
 1  1  2  1  2  4 98  1  0  1
 0  3  3  1  1  1  0 88  2  5
 1  2  7  8  0  7  1  1 70  4
 0  1  1  4  4  1  0  4  3 76
testing set:
precision=0.803087 / recall=0.801385
accuracy=0.804
f1=0.802235
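
To make the black-box setup above concrete, here is a toy sketch (an illustration with an assumed weight layout, not the repository's code) of how a 784-100-10 network's weights and biases can be packed into the flat 79510-dimensional vector that sep-active-CMA-ES searches; a fitness function then runs this forward pass over the training examples and returns the loss, and back-propagation over the same packing supplies the injected gradient solution.

```cpp
// Toy sketch (not the nn_ex code): packing a 784-100-10 network into the
// single flat vector that CMA-ES optimizes.
#include <cassert>
#include <cmath>
#include <vector>

struct Net {
  static const int in = 784, hid = 100, out = 10;
  static int dim() { return in * hid + hid + hid * out + out; }  // = 79510

  // Forward pass reading weights directly from the flat CMA-ES candidate x.
  static std::vector<double> forward(const std::vector<double> &x,
                                     const std::vector<double> &pixels)
  {
    assert((int)x.size() == dim() && (int)pixels.size() == in);
    int k = 0;
    std::vector<double> h(hid), o(out);
    for (int j = 0; j < hid; ++j) {            // hidden layer, tanh units
      double s = 0.0;
      for (int i = 0; i < in; ++i) s += x[k++] * pixels[i];
      h[j] = std::tanh(s + x[in * hid + j]);   // hidden bias block
    }
    k = in * hid + hid;                        // start of output weights
    for (int c = 0; c < out; ++c) {            // output layer, raw scores
      double s = 0.0;
      for (int j = 0; j < hid; ++j) s += x[k++] * h[j];
      o[c] = s + x[in * hid + hid + hid * out + c];  // output bias block
    }
    return o;                                  // per-class scores
  }
};
```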

Several other options are available, from plotting the CMA-ES output to choosing between sigmoid and tanh (default) units, or modifying sigma and lambda; they are listed with:

./nn_mnist --help

Higgs Challenge

Get the data either from Kaggle or from http://juban.free.fr/stuff/datasets/higgs/training.zip and unzip it into the repository.

To train a single-layer ANN with 30 hidden neurons on 9000 examples, testing it on 1000:

./nn_higgs -fdata training.csv -n 10000 -maxsolveiter 500 -testp 10 -hlayer 30

This uses 500 steps of sep-active-CMA-ES to directly optimize the AMS objective function in 992-D (30×30 + 30 hidden-layer weights and biases plus 30×2 + 2 output-layer weights and biases for a 30-30-2 network over the 30 input features), without any gradient information. Note that the algorithm maximizes the AMS by minimizing its negation instead.
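
For reference, the challenge defines AMS = sqrt(2 ((s + b + b_r) ln(1 + s / (b + b_r)) - s)) with a regularization term b_r = 10, where s and b are the weighted counts of selected signal and background events. The sketch below shows such a negated-AMS objective; the selection and weighting logic is a simplified assumption, not the nn_ex code.

```cpp
// Sketch of a negated-AMS objective: CMA-ES minimizes -AMS, which maximizes AMS.
#include <cmath>
#include <vector>

// s: weighted sum of selected signal events (true positives)
// b: weighted sum of selected background events (false positives)
double ams(double s, double b)
{
  const double breg = 10.0;                    // regularization term from the challenge
  return std::sqrt(2.0 * ((s + b + breg) * std::log(1.0 + s / (b + breg)) - s));
}

// Negated AMS over a set of events, given per-event selection decisions.
double neg_ams(const std::vector<bool> &selected,   // classifier says "signal"
               const std::vector<bool> &is_signal,  // ground-truth label
               const std::vector<double> &weight)   // per-event weight
{
  double s = 0.0, b = 0.0;
  for (size_t i = 0; i < selected.size(); ++i)
    if (selected[i])
      (is_signal[i] ? s : b) += weight[i];
  return -ams(s, b);                           // minimize this to maximize AMS
}
```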

This should report something like:

**** training set:
precision=0.714672 / recall=0.705301 / accuracy=0.745444 / f1=0.709956
max ams=6.18563 / ams=0.400734

**** testing set:
precision=0.70864 / recall=0.703793 / accuracy=0.741 / f1=0.706208
max ams=0.775471 / ams=0.125018

For the run above, the sep-active-CMA-ES internals can be plotted. Note that the AMS is being maximized, so the f-value increases over time.