This repository has been archived by the owner on Jul 1, 2024. It is now read-only.

Commit

Add MXNet Backend (#59)
* Adding MXNet backend template. Adding all basic Variable and Tensor operations (#1)

* add activation functions

* add activation functions

* fix some legacy

* fix some legacy

* cross entropy

* cross entropy

* fix name scoping introduced in 2.0

* fix name scoping introduced in 2.0

* Add dropout, l2_normalization, random_normal/uniform/binomial (#2)

* remove the logic for hacking RNN

* remove the logic for hacking RNN

* add pooling with utils

* add pooling with utils

* minor

* lint and name scope fix

* fix access protected var

* fix add neighbor, removed __eq__ in KerasSymbol

* fix eval function, unittest for placeholder and variable

* add unittests

* fix bug

* fix bug

* fix

* add some temporary fixes in mxnet backend. undo change to the pytest.ini

* mxnet_backend graph fix, layer support  (#3)

* add activation functions

* fix some legacy

* cross entropy

* fix name scoping introduced in 2.0

* Add dropout, l2_normalization, random_normal/uniform/binomial (#2)

* remove the logic for hacking RNN

* add pooling with utils

* add activation functions

* fix some legacy

* cross entropy

* fix name scoping introduced in 2.0

* remove the logic for hacking RNN

* add pooling with utils

* minor

* lint and name scope fix

* fix access protected var

* fix add neighbor, removed __eq__ in KerasSymbol

* fix eval function, unittest for placeholder and variable

* add unittests

* fix bug

* fix bug

* fix

* add some temporary fixes in mxnet backend. undo change to the pytest.ini

* Keras function not working is a known issue, add skip in the test

* fix random_uniform/constant

* fix legacy randomize methods

* Fix MXNet backend operator bugs. Enabled Keras backend tests

* add bias

* Add Amazon copyrights to License (#6)

* fix

* fix

* fix backend for mlp

* fix context management, add optimizers

* minor change

* undo changes on example

* fix eval

* minor cleanup

* fix some property usage

* fixing AlphaDropout, not finished yet

* add mx model instantiate

* modify training model construction logic, fix some tests, fix reshape layer

* minor fix

* fix bias_add

* more fix on Dense and bias_add

* In progress commit

* fix comment

* small fix

* remove pytest.skip in conv3d, though it still fails with the Theano backend in my workspace

* Add conv2d and in_topk operator for mxnet backend (#11)

* Skip BatchDot tests for Theano backend. (#12)

* BatchDot, Basic Batchnorm, Fix BiasAdd, Fix Conv2D, CodeCleanup (#14)

* Fix Conv2d shape issues and enable Conv2D UTs

* Remove redundant mxnet only unit tests

* Adding batch_dot, remove deconv, code comments and cleanup

* Remove buggy conv1d implementation

* Fix CR comments. Fix lint check issues

* Move mxnet specific code from keras engine to mxnet_backend. (#15)

* Move MXNet optimizers from keras optimizers to mxnet backend (#16)

* Fix bug in reshape. Minor rename to avoid local conflicts

* Bug fixes and enable/skip all Keras tests for mxnet backend (#21)

* test results - 374 passed, 235 skipped in 114.44 seconds

* fix/skip keras tests - tests/integration_tests, tests/keras/applications

* fix/skip keras tests - tests/keras/engine/test_topology

* fix/skip keras tests - tests/keras/engine/test_training

* fix/skip keras tests - tests/keras/legacy/

* fix/skip keras tests - tests/keras/preprocessing

* fix/skip keras tests - tests/keras/utils/

* Fix CR comments

* Fix issues in zero_padding. Fix/Enable tests/layers/convolutional_test

* Add momentum to batchnorm. Enable/skip tests in layers/core, local, merge, noise, normalization

* Skip RNN tests in keras/tests/layers/recurrent_test, wrappers_test

* Fix bug in spatial padding, enable/skip tests in loss,optimizers,callback,loss_weighting, model_saving

* Fix mxnet backend multi-gpu training (#31)

Fixes a bug so the MXNet backend can use multiple GPUs.

* Fix performance issue - Batchnormalization, Conv operator (#35)

* Fix default axis for batchnorm layer for channels_first data_format

* Performance improvement by avoiding kernel transpose in conv operation for channels_first format

* Fix model - architecture, weights and both, load and save. (#36)

* Prepare initial version of mxnet related documentation in keras (#38)

* Skip failing unit tests for unsupported functionality in mxnet backend

* Fix pep tests reported by CI

* Use pytest module skip, revert kernel_shape logic

* remove data_format param from bias_add API

* Allow Predict() without compile for mxnet backend and enable tests.

contributor - roywei@

* Fix bug - mxnet backend should not override keras config data_format to channels_first. Only warn of low performance

* Conv3d() operator implementation for Keras2.0 using MXNet backend (#40)

* conv3d implementation for keras2.0 as MXNet backend

* conv3d implementation/testing for keras2.0 using MXNet backend

* keeping -n option in pytest.ini file

* fixed comments given by Sandeep

* Add Conv1D support for MXNet backend (#44)

* Add Conv1D support for MXNet backend

* Fix CR comments

* Conv2d transpose (#47)

* add conv2d_transpose

* conv2d transpose for both channels, enabled test case

* add detailed comments and examples, fix style issue

* enable test case in topology

* Enable performance optimization for conv operators with MXNet backend. Make MXNet default backend with this branch (#48)

* Fix conv kernel shape bug for TF backend. (#50)

* Add support for keras multi_gpu_model() API with MXNet backend (#49)

* Add support for keras multi_gpu_model() API with MXNet backend. Autoset GPU0 context on GPU machine

* Fix typo

* Add SAME padding mode support for pooling operator. (#51)

* Add rnn() operator for MXNet backend with unrolling and masking feature (#46)

* Adding rnn() operator in Keras 2.0 with MXNet as backend, supporting unroll=True and masking=True/False, and enabled relevant test cases. Also modified a couple of operators.

* Modified comments

* Added comments to a method

* Enable categorical crossentropy testcases and made minor changes

* Modified message

* nit

* Added detail description of handling variable length input in RNN

* Skip conv2d_transpose and conv3d_transpose test-case for MXNet backend and minor changes in rnn()

* Adamax and NAdam optimizer for MXNet backend (#54)

* Add Adamax optimizer for MXNet backend

* Fix lr and adamax params

* Add Nadam optimizer for mxnet backend

* Add Conv3d transpose (#52)

* conv3d transpose, enabled test case

* update kernel shape

* replace conv2d_transpose and conv3d_transpose with convnd_transpose

* update value errors with MXNet Backend info, fix typo

* add check for conv3d transpose only supports gpu with cudnn

* update context check

* disable conv3d transpose test

* fix typo in comment

* Rebase to latest Keras - April 3, 2018

* Add build badges

* Fix multi_gpu API bug for CPU. Fix PEP. (#64)

* Fix multi_gpu API bug for CPU. Fix PEP.

* fix embedding layer bug (#61)

* fix embedding bug

* addressed comments, enabled more test cases

* add keras test

* reduce line length

* fix style, add blank lines

* Benchmark (#55)

* add conv2d_transpose

* conv2d transpose for both channels, enabled test case

* add detailed comments and examples, fix style issue

* add benchmark scripts for resnet and imagenet data

* combine scripts

* fix args

* fix num of gpus

* update log

* multi_gpu_model only support tf

* add benchmark scripts for synthetic data

* update read me and scripts

* add mxnet training result table

* update on readme

* add cifar10 dataset and enable various resnet layers

* fix compile for mxnet multiple gpu

* update callbacks

* update synthetic data script, add credits

* undo new line

* update readme, addressed pr comments

* update readme

* benchmark scripts style fix (#66)

* style fix

* remove unused import, fix line too long

* addressed pr comments

* Added keras util API for conversion of data tensor from channels_last to channels_first using MXNet backend (#65)

* Added keras util API for conversion of data tensor from channels_last to channels_first using MXNet backend

* Modified comments

* Addressed review comments and made the API more generic across backends

* Removed shape check

* Modified comments

* Added edge cases

* moved helper method as nested

* Added RNN benchmark scripts (#69)

* Added RNN benchmark scripts

* Fixed new line in bash script

* Removed different backend code and modified comments

* Removed spacing

* Automated the wikiText2 download script

* Added dataset_util functionality to have more flexible code

* Added minor comments

* modified minor comments

* Fixed the multi-gpu context (#68)

* Update benchmark result (#70)

* update benchmark result

* update result

* simplify folder structure

* add image result

* add note

* add note
sandeep-krishnamurthy authored and roywei committed Apr 20, 2018
1 parent d673afd commit 19ce757
Showing 69 changed files with 7,166 additions and 153 deletions.
15 changes: 13 additions & 2 deletions .travis.yml
@@ -21,6 +21,10 @@ matrix:
env: KERAS_BACKEND=cntk PYTHONWARNINGS=ignore
- python: 3.6
env: KERAS_BACKEND=cntk PYTHONWARNINGS=ignore
- python: 2.7
env: KERAS_BACKEND=mxnet PYTHONWARNINGS=ignore
- python: 3.6
env: KERAS_BACKEND=mxnet PYTHONWARNINGS=ignore
install:
# code below is taken from http://conda.pydata.org/docs/travis.html
# We do this conditionally because it saves us some downloading if the
@@ -38,7 +42,7 @@ install:
# Useful for debugging any issues with conda
- conda info -a

- conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION pytest pandas
- conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION nose scipy matplotlib pandas pytest h5py
- source activate test-environment
- pip install --only-binary=numpy,scipy numpy nose scipy matplotlib h5py theano
- conda install mkl mkl-service
@@ -57,7 +61,11 @@ install:

# install TensorFlow (CPU version).
- pip install tensorflow


# install Apache MXNet (CPU version).
- pip install mxnet
- pip install --upgrade numpy

# install cntk
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
pip install https://cntk.ai/PythonWheel/CPU-Only/cntk-2.3.1-cp27-cp27mu-linux_x86_64.whl;
@@ -78,6 +86,9 @@ install:
- if [[ "$KERAS_BACKEND" != "cntk" ]]; then
echo ' keras/backend/cntk_backend.py' >> .coveragerc;
fi
- if [[ "$KERAS_BACKEND" != "mxnet" ]]; then
echo ' keras/backend/mxnet_backend.py' >> .coveragerc;
fi

# detect whether core files are changed or not
- export CORE_CHANGED=False;
8 changes: 8 additions & 0 deletions LICENSE
@@ -12,6 +12,14 @@ All contributions by Microsoft:
Copyright (c) 2017 - 2018, Microsoft, Inc.
All rights reserved.

All contributions by Amazon:
Copyright (c) 2017 Amazon.com, Inc. or its affiliates
All rights reserved.

All other contributions:
Copyright (c) 2015 - 2018, the respective contributors.
All rights reserved.
10 changes: 7 additions & 3 deletions README.md
@@ -2,12 +2,15 @@

![Keras logo](https://s3.amazonaws.com/keras.io/img/keras-logo-2018-large-1200.png)

[![Build Status](https://travis-ci.org/keras-team/keras.svg?branch=master)](https://travis-ci.org/keras-team/keras)
| ubuntu/python-2.7 | ubuntu/python-3.5 |
|---------|---------|
| ![Python3 Build Status](https://codebuild.us-east-1.amazonaws.com/badges?uuid=eyJlbmNyeXB0ZWREYXRhIjoidHBzRFVlMG5SMGFQRTVzMUhxejNIK2dZRU1kb3p2c0JIbTVObDZtdDgxYThYdjRCZlg0RGF1eCsrSUtGQmgwYkFkZzJaT1BrdHpqcVJqcWE2aSt6QmRnPSIsIml2UGFyYW1ldGVyU3BlYyI6IklPMmRORld4TDYrdWNrWDciLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D&branch=master) | ![Python2 Build Status](https://codebuild.us-east-1.amazonaws.com/badges?uuid=eyJlbmNyeXB0ZWREYXRhIjoibHFOTlladW1VK050SFBST1N0UUtNOGdOV24vM25hVUJDQVVKNitvSFpXTFZ4RzlvUXppdHU4RytRR3hLdk1nSDd2VHlTSlZ5ZTlCUC9GdWdscHZRRFBNPSIsIml2UGFyYW1ldGVyU3BlYyI6IjZrQksycy9aWWV5QXh1MkoiLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D&branch=master) |

[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/keras-team/keras/blob/master/LICENSE)

## You have just found Keras.

Keras is a high-level neural networks API, written in Python and capable of running on top of [TensorFlow](https://github.com/tensorflow/tensorflow), [CNTK](https://github.com/Microsoft/cntk), or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*
Keras is a high-level neural networks API, written in Python and capable of running on top of [TensorFlow](https://github.com/tensorflow/tensorflow), [CNTK](https://github.com/Microsoft/cntk), [Apache MXNet](https://github.com/apache/incubator-mxnet/), or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*

Use Keras if you need a deep learning library that:

@@ -117,6 +120,7 @@ Before installing Keras, please install one of its backend engines: TensorFlow,
- [TensorFlow installation instructions](https://www.tensorflow.org/install/).
- [Theano installation instructions](http://deeplearning.net/software/theano/install.html#install).
- [CNTK installation instructions](https://docs.microsoft.com/en-us/cognitive-toolkit/setup-cntk-on-your-machine).
- [MXNet installation instructions](http://mxnet.incubator.apache.org/install/index.html).

You may also consider installing the following **optional dependencies**:

@@ -155,7 +159,7 @@ sudo python setup.py install
------------------


## Using a different backend than TensorFlow
## Switching from TensorFlow to CNTK, MXNet or Theano

By default, Keras will use TensorFlow as its tensor manipulation library. [Follow these instructions](https://keras.io/backend/) to configure the Keras backend.

131 changes: 131 additions & 0 deletions benchmark/README.md
@@ -0,0 +1,131 @@
# Keras Benchmarks

## Overview
The benchmark module aims to provide a performance comparison of different Keras backends using various models and
datasets on CPU, single-GPU, and multi-GPU machines.
Currently supported backends: TensorFlow, Apache MXNet.

## Setup
To install the MXNet backend, refer to
[Installation](https://github.com/awslabs/keras-apache-mxnet/wiki/Installation#1-install-keras-with-apache-mxnet-backend).

To switch between backends, refer to
[configure Keras backend](https://github.com/awslabs/keras-apache-mxnet/wiki/Installation#2-configure-keras-backend).
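
For a single run, one quick way to select the MXNet backend before Keras is imported is the `KERAS_BACKEND`
environment variable (the same mechanism used in this repository's `.travis.yml`); a minimal sketch:

```python
import os

# Must be set before Keras is imported; otherwise the backend named in
# ~/.keras/keras.json is used.
os.environ['KERAS_BACKEND'] = 'mxnet'

import keras
print(keras.backend.backend())  # expected to print 'mxnet'
```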

## CNN Benchmarks
We provide benchmark scripts to run on the CIFAR-10, ImageNet, and synthetic (randomly generated) datasets.

### CIFAR-10 Dataset
The [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset has 60,000 32x32 color images in 10 classes.
The [training scripts](https://github.com/awslabs/keras-apache-mxnet/blob/master/benchmark/image-classification/benchmark_resnet.py)
automatically download the dataset; you need to provide the dataset name, the ResNet version
(1 or 2), the number of layers (20, 56, or 110), and the number of GPUs to use.

Example Usage:

`python benchmark_resnet.py --dataset cifar10 --version 1 --layers 56 --gpus 4`


### ImageNet Dataset
First, download the ImageNet dataset from [here](http://image-net.org/download); there are 1.4 million images in total
across 1,000 classes, with each class in a subfolder. In this script, each image is resized to 256x256.

Since the ImageNet dataset is too large to fit into memory, there are two training modes for such data:
[`train_on_batch`](https://keras.io/models/sequential/#train_on_batch) and
[`fit_generator`](https://keras.io/models/sequential/#fit_generator);
we recommend `train_on_batch` since it is more efficient on multiple GPUs.
(Refer to the [Keras documentation](https://keras.io/getting-started/faq/#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory)
and Keras issues [#9502](https://github.com/keras-team/keras/issues/9502),
[#9204](https://github.com/keras-team/keras/issues/9204), and [#9647](https://github.com/keras-team/keras/issues/9647).)
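
A minimal sketch of the two modes with a toy model and generator (the ResNet/ImageNet specifics are omitted,
and all names below are illustrative, not the exact benchmark code):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy stand-ins for the benchmark's ResNet model and ImageNet data pipeline.
model = Sequential([Dense(10, activation='softmax', input_shape=(100,))])
model.compile(loss='categorical_crossentropy', optimizer='sgd')

def batch_generator(batch_size=32):
    while True:  # Keras generators are expected to loop forever
        x = np.random.random((batch_size, 100)).astype('float32')
        y = np.eye(10)[np.random.randint(0, 10, batch_size)].astype('float32')
        yield x, y

# Mode 1: train_on_batch -- the caller owns the loop and feeds one batch at a time.
gen = batch_generator()
for step in range(100):
    loss = model.train_on_batch(*next(gen))

# Mode 2: fit_generator -- Keras owns the loop and pulls batches from the generator.
model.fit_generator(batch_generator(), steps_per_epoch=100, epochs=1)
```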

Compared to CIFAR-10, you need to provide additional parameters: the training mode and the path to the ImageNet dataset.

Example usage:

`python benchmark_resnet.py --dataset imagenet --version 1 --layers 56 --gpus 4 --train_mode train_on_batch --data_path home/ubuntu/imagenet/train/`

### Synthetic Dataset
We used benchmark scripts from the official
[TensorFlow Benchmark](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks/scripts/keras_benchmarks)
repo and modified them slightly for our use case.

Run the shell script directly to launch the benchmark, providing one of the configurations in config.json and whether
you want to benchmark inference speed (True or False).

Example Usage:

`sh run_<backend-type>_backend.sh gpu_config False`

### CNN Benchmark Results
Here we list MXNet backend training speed on CIFAR-10, ImageNet, and synthetic data using the
ResNet50V1 model on CPU and on 1, 4, and 8 GPUs, using AWS instances.
Hardware specifications of the instances can be found [here](https://aws.amazon.com/ec2/instance-types/).
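
For the multi-GPU rows, Keras provides the `multi_gpu_model()` API, which this commit enables for the MXNet
backend; a minimal sketch of replicating a model across GPUs (the ResNet constructor, data format, and random
batch below are placeholders rather than the exact benchmark code, and the snippet assumes a machine with at
least 4 GPUs):

```python
import numpy as np
from keras.applications.resnet50 import ResNet50
from keras.utils import multi_gpu_model

# Assumes image_data_format == 'channels_first' (as in the MXNet rows below);
# with the default 'channels_last' the input shape would be (256, 256, 3).
model = ResNet50(weights=None, input_shape=(3, 256, 256), classes=1000)

parallel_model = multi_gpu_model(model, gpus=4)  # replicate the model across 4 GPUs
parallel_model.compile(loss='categorical_crossentropy', optimizer='sgd')

# Illustrative random batch; the real benchmarks stream CIFAR-10/ImageNet data.
x = np.random.random((128, 3, 256, 256)).astype('float32')
y = np.random.random((128, 1000)).astype('float32')
parallel_model.train_on_batch(x, y)
```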

For more detailed benchmark results, please refer to [CNN results](https://github.com/awslabs/keras-apache-mxnet/tree/keras2_mxnet_backend/benchmark/benchmark_result/CNN_result.md).

|||
| ------ | ------ |
| Keras Version | 2.1.5 |
| MXNet Version | 1.1.0 |
| Data Format | Channel first |

| Instance | GPU used | Package | CIFAR-10 (images/s) | ImageNet (images/s) | Synthetic Data (images/s) |
| ------ | ------ | ------ | ------ | ------ | ------ |
| C5.18xLarge | 0 | mxnet-mkl | 87 | N/A | 9 |
| P3.8xLarge | 1 | mxnet-cu90 | N/A | 165 | 229 |
| P3.8xLarge | 4 | mxnet-cu90 | 1792 | 538 | 728 |
| P3.16xLarge | 8 | mxnet-cu90 | 1618 | 728 | 963 |

![MXNet backend training speed](https://github.com/roywei/keras/blob/benchmark_result/benchmark/benchmark_result/mxnet_backend_training_speed.png)

Note: the x-axis is the number of GPUs used; the y-axis is training speed (images/second).

## RNN Benchmarks

We provide benchmark scripts to run on the synthetic (randomly generated), Nietzsche, and WikiText-2 character-level datasets.

Run the shell script directly to launch the benchmark, providing one of the configurations in config.json and whether you want to benchmark inference speed (True or False).

Example Usage:

`sh run_<backend-type>_backend.sh gpu_config False`

### Synthetic Dataset

We used benchmark scripts from the official [TensorFlow Benchmark](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks/scripts/keras_benchmarks) repo and modified them slightly for our use case.

### Nietzsche Dataset

We used the official Keras LSTM example script [lstm_text_generation.py](https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py) and modified it slightly for our use case.

### WikiText-2 Dataset

We used the official WikiText-2 character-level dataset from this [link](https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset).

The `lstm_text_generation_wikitext2.py` script uses a dataset hosted on an S3 bucket at this [link](https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip) (the WikiText-2 raw character-level data).
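
For context, these character-level text-generation scripts train a small LSTM; a minimal sketch of the model
shape (the window length, vocabulary size, and layer width are illustrative, not the exact benchmark
configuration):

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

maxlen, n_chars = 40, 60  # illustrative character window and vocabulary size

model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, n_chars)))   # one-hot encoded character window in
model.add(Dense(n_chars, activation='softmax'))       # probability over the next character out
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
```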

### RNN Benchmark Results

Here we list results on the synthetic, Nietzsche, and WikiText-2 datasets using a Sequential model (LSTM) on AWS C5.xLarge (CPU) and P3.8xLarge (1 and 4 GPUs) instances with the MXNet backend. The batch size is 128. For more details about the instance configurations, please refer to [P3](https://aws.amazon.com/ec2/instance-types/p3/) and [C5](https://aws.amazon.com/ec2/instance-types/c5/).

| Instance | GPUs | Dataset | Time/Epoch <br />(lower is better) |
| ---------- | ---- | ---------- | ----------------------------------- |
| C5.xLarge | 0 | Synthetic | 91 sec - 2ms/step |
| P3.8xLarge | 1 | Synthetic | 13 sec - 264us/step |
| P3.8xLarge | 4 | Synthetic | 12 sec - 241us/step |
| C5.xLarge | 0 | Nietzsche | 352 sec - 2ms/step |
| P3.8xLarge | 1 | Nietzsche | 53 sec - 265us/step |
| P3.8xLarge | 4 | Nietzsche | 47 sec - 236us/step |
| C5.xLarge | 0 | WikiText-2 | 6410 sec - 2ms/step |
| P3.8xLarge | 1 | WikiText-2 | 882 sec - 264us/step |
| P3.8xLarge | 4 | WikiText-2 | 794 sec - 235us/step |



## Credits

Synthetic Data scripts modified from
[TensorFlow Benchmarks](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks)

## Reference
[1] [TensorFlow Benchmarks](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks)
Empty file added benchmark/__init__.py
Empty file.
92 changes: 92 additions & 0 deletions benchmark/benchmark_result/CNN_result.md
@@ -0,0 +1,92 @@
# Detailed CNN Benchmark Results
## CIFAR-10 Dataset
### Configuration
|||
|---|---|
| Data Set | [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) |
| Keras Version | 2.1.5 |
| TensorFlow Version | 1.7.0 |
| MXNet Version | 1.1.0 |
| Training Method | [`fit`](https://keras.io/models/model/#fit) |
| Training Scripts | [Simple CNN Script](https://github.com/awslabs/keras-apache-mxnet/blob/master/examples/CIFAR-10_cnn.py), [ResNet Script](https://github.com/awslabs/keras-apache-mxnet/blob/master/benchmark/image-classification/benchmark_resnet.py) |

### Results

| Instance Type | GPU used | Model | Backend | Package | Batch Size | Data Format | Speed (images/s) |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| C5.xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel last | 253 |
| C5.xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel first | 223 |
| C5.xLarge | 0 | Simple CNN | TensorFlow | tensorflow | 32 | channel last | 309 |
| C5.xLarge | 0 | Simple CNN | TensorFlow | tensorflow | 32 | channel first | 101 |
| C5.18xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel last | 845 |
| C5.18xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel first | 936 |
| C5.18xLarge | 0 | ResNet50V1 | TensorFlow | tensorflow | 32 | channel last | 59 |
| C5.18xLarge | 0 | ResNet50V1 | TensorFlow | tensorflow | 32 | channel first | 41 |
| C5.18xLarge | 0 | ResNet50V1 | MXNet | mxnet-mkl | 32 | channel last | 48 |
| C5.18xLarge | 0 | ResNet50V1 | MXNet | mxnet-mkl | 32 | channel first | 87 |
| P3.8xLarge | 4 | ResNet50V1 | TensorFlow | tensorflow-gpu | 128 | channel last | 1020 |
| P3.8xLarge | 4 | ResNet50V1 | MXNet | mxnet-cu90 | 128 | channel first | 1792 |
| P3.8xLarge | 8 | ResNet50V1 | TensorFlow | tensorflow-gpu | 256 | channel last | 962 |
| P3.16xLarge | 8 | ResNet50V1 | MXNet | mxnet-cu90 | 256 | channel first | 1618 |

## ImageNet Dataset

### Configuration
|||
|---|---|
| Data Set | [ImageNet](http://image-net.org) |
| Model | ResNet50V1|
| Keras Version | 2.1.3 |
| TensorFlow Version | 1.6.0rc1 |
| MXNet Version | 1.1.0 |
| Training Method | [`train_on_batch`](https://keras.io/models/sequential/#train_on_batch), [`fit_generator`](https://keras.io/models/sequential/#fit_generator) |
| Training Scripts | [ResNet Script](https://github.com/awslabs/keras-apache-mxnet/blob/master/benchmark/image-classification/benchmark_resnet.py) |

### Results

| Instance | GPU used | Backend | Package | Method | Batch Size | Data Format | Speed (images/s) |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | `train_on_batch` | 32 | channel last | 50 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | `train_on_batch` | 32 | channel first | 165 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | `train_on_batch` | 128 | channel last | 162 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | `train_on_batch` | 128 | channel first | 538 |
| P3.16xLarge | 8 | TensorFlow | tensorflow-gpu | `train_on_batch` | 256 | channel last | 212 |
| P3.16xLarge | 8 | MXNet | mxnet-cu90 | `train_on_batch` | 256 | channel first | 728 |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | `fit_generator` | 32 | channel last | 53 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | `fit_generator` | 32 | channel first | 73 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | `fit_generator` | 128 | channel last | 173 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | `fit_generator` | 128 | channel first | 197 |

## Synthetic Dataset

### Configuration
|||
|---|---|
| Data Set | Random 256x256 color images, 1000 classes |
| Model | ResNet50V1|
| Keras Version | 2.1.3 |
| TensorFlow Version | 1.6.0rc1 |
| MXNet Version | 1.1.0 |
| Training Method |[`fit`](https://keras.io/models/model/#fit) |
| Training Scripts | [ResNet Script](https://github.com/awslabs/keras-apache-mxnet/tree/keras2_mxnet_backend/benchmark/synthetic) |

### Results

| Instance | GPU used | Backend | Package | Batch Size | Data Format | Speed (images/s) |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| C5.18xLarge | 0 | TensorFlow | tensorflow | 32 | channel first | 4 |
| C5.18xLarge | 0 | MXNet | mxnet-mkl | 32 | channel first | 9 |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | 32 | channel first | 198 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | 32 | channel first | 229 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | 128 | channel first | 448 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | 128 | channel first | 728 |
| P3.16xLarge | 8 | TensorFlow | tensorflow-gpu | 256 | channel first | 346 |
| P3.16xLarge | 8 | MXNet | mxnet-cu90 | 256 | channel first | 963 |
| C5.18xLarge | 0 | TensorFlow | tensorflow | 32 | channel last | 4 |
| C5.18xLarge | 0 | MXNet | mxnet-mkl | 32 | channel last | 3 |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | 32 | channel last | 164 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | 32 | channel last | 18 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | 128 | channel last | 409 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | 128 | channel last | 73 |
| P3.16xLarge | 8 | TensorFlow | tensorflow-gpu | 256 | channel last | 164 |
| P3.16xLarge | 8 | MXNet | mxnet-cu90 | 256 | channel last | 18 |
Empty file added benchmark/scripts/__init__.py
Empty file.