diff --git a/benchmark/README.md b/benchmark/README.md
index f13689e3af9..2dc76aa38dd 100644
--- a/benchmark/README.md
+++ b/benchmark/README.md
@@ -94,50 +94,64 @@ Example Usage:
 We used benchmark scripts from [TensorFlow Benchmark](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks/scripts/keras_benchmarks) official repo, and modified slightly for our use case.
+Put `lstm_synthetic` as models parameter in `run__backend.sh`
+
 ### Nietzsche Dataset
 We have used an official Keras LSTM example scripts [lstm_text_generation.py](https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py), and modified slightly for our use case.
+Put `lstm_nietzsche` as models parameter in `run__backend.sh`
+
 ### WikiText-2 Dataset
 We have used an official WikiText-2 character level Dataset from this [link](https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset). The `lstm_text_generation.py` includes a dataset that is hosted on S3 bucket from this [link](https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip) (This is a WikiText-2 raw character level data).
+Put `lstm_wikitext2` as models parameter in `run__backend.sh`
+
 ### RNN Benchmark Results
-Here, we list the result on Synthetic, Nietzsche, and WikiText-2 dataset using Sequential model(LSTM) on Amazon AWS C5.18xLarge(CPU), C5.xLarge(CPU), and P3.8xLarge(1, 4 GPUs) instance with MXNet backend. Batch size is 128. For more details about the instance configuration, please refer [P3](https://aws.amazon.com/ec2/instance-types/p3/) and [C5](https://aws.amazon.com/ec2/instance-types/c5/).
+Here, we list the result on Synthetic, Nietzsche, and WikiText-2 dataset using Sequential model(LSTM) on Amazon AWS C5.18xLarge(CPU), C5.xLarge(CPU), and P3.8xLarge(1, 4 GPUs) instance with MXNet and TensorFlow backend. Batch size is 128 and set `unroll=True` in Recurrent layer. For more details about the instance configuration, please refer [P3](https://aws.amazon.com/ec2/instance-types/p3/) and [C5](https://aws.amazon.com/ec2/instance-types/c5/).
 For more detailed benchmark results, please refer to [RNN results.](benchmark_result/RNN_result.md)
 | Framework/Library | Version |
-| ----------------- | ------- |
+| :----------------- | :------- |
 | Keras | 2.1.5 |
 | MXNet | 1.1.0 |
+| TensorFlow | 1.7.0 |
 | CUDA | 9.0.176 |
+#### 1. Synthetic Dataset
+
+![lstm_Synthetic_128.png](benchmark_result/lstm_Synthetic_128.png)
+
+| Instance | GPUs | MXNet Backend Speed/Epoch | TensorFlow Backend Speed/Epoch |
+| :----------- | :---- | :------------------------------ | :----------------------------------- |
+| C5.18xLarge | 0 | 24s 485us/step | 14s 284us/step |
+| P3.8xLarge | 1 | 13s 261us/step | 12s 249us/step |
+| P3.8xLarge | 4 | 12s 240us/step | 21s 430us/step |
+
+#### 2. Nietzsche Dataset
+![lstm_Nietzsche_128.png](benchmark_result/lstm_Nietzsche_128.png)
-| Instance | GPUs | Data Set | Speed/Epoch (Lower is better) |
-| ----------- | ---- | ---------- | ----------------------------- |
-| C5.18xLarge | 0 | Synthetic | 24s 485us/step |
-| C5.xLarge | 0 | Synthetic | 93s 2ms/step |
-| P3.8xLarge | 1 | Synthetic | 13s 261us/step |
-| P3.8xLarge | 4 | Synthetic | 12s 240us/step |
-| | | | |
-| C5.18xLarge | 0 | Nietzsche | 78s 389us/step |
-| C5.xLarge | 0 | Nietzsche | 360s 2ms/step |
-| P3.8xLarge | 1 | Nietzsche | 52s 262us/step |
-| P3.8xLarge | 4 | Nietzsche | 47s 235us/step |
-| | | | |
-| C5.18xLarge | 0 | WikiText-2 | 1345s 398us/step |
-| C5.xLarge | 0 | WikiText-2 | 6417s 2ms/step |
-| P3.8xLarge | 1 | WikiText-2 | 868s 257us/step |
-| P3.8xLarge | 4 | WikiText-2 | 775s 229us/step |
+| Instance | GPUs | MXNet Backend Speed/Epoch | TensorFlow Backend Speed/Epoch |
+| :----------- | :---- | :------------------------------ | :----------------------------------- |
+| C5.18xLarge | 0 | 78s 389us/step | 55s 273us/step |
+| P3.8xLarge | 1 | 52s 262us/step | 51s 252us/step |
+| P3.8xLarge | 4 | 47s 235us/step | 87s 435us/step |
+#### 3. WikiText-2 Dataset
-![rnn_mxnet_dataset](benchmark_result/rnn_mxnet_dataset.png)
+![lstm_Wikitext2_128.png](benchmark_result/lstm_Wikitext2_128.png)
+| Instance | GPUs | MXNet Backend Speed/Epoch | TensorFlow Backend Speed/Epoch |
+| :----------- | :---- | :------------------------------ | :----------------------------------- |
+| C5.18xLarge | 0 | 1345s 398us/step | 875s 259us/step |
+| P3.8xLarge | 1 | 868s 257us/step | 817s 242us/step |
+| P3.8xLarge | 4 | 775s 229us/step | 1468s 434us/step |
 ## Credits
 Synthetic Data scripts modified from
diff --git a/benchmark/benchmark_result/MemoryConsumption.png b/benchmark/benchmark_result/MemoryConsumption.png
new file mode 100644
index 00000000000..99f11347025
Binary files /dev/null and b/benchmark/benchmark_result/MemoryConsumption.png differ
diff --git a/benchmark/benchmark_result/RNN_result.md b/benchmark/benchmark_result/RNN_result.md
index 8bc22f7d680..921545d9025 100644
--- a/benchmark/benchmark_result/RNN_result.md
+++ b/benchmark/benchmark_result/RNN_result.md
@@ -1,5 +1,10 @@
 # Detailed RNN Benchmark Results
+Below is the result of GPU memory usage while running LSTM model on Synthetic, Nietzsche, and WikiText-2 character level dataset.
+
+![MemoryConsumption.png](MemoryConsumption.png)
+
+Note: All the data for performance diagram shown below is taken from the cell having `unroll Type=True`
 ## Synthetic Dataset
 ### Configuration
@@ -17,6 +22,11 @@
 ### Results
+| | |
+| :------------------------------------------------------- | :----------------------------------------------------------- |
+| ![lstm_Synthetic_32.png](lstm_Synthetic_32.png) | ![lstm_Synthetic_128.png](lstm_Synthetic_128.png) |
+
+
 | Instance | GPUs | Backend | Batch size | Data Set | Training Method | Speed/Epoch (Lower is better) | Unroll Type | No. of samples | Memory(MiB) |
 | ----------- | ---- | ---------- | ---------- | --------- | ---------------- | ----------------------------- | ----------- | -------------- | ----------- |
 | C5.18xLarge | 0 | MXNet | 32 | Synthetic | fit() | 50s 1ms/step | TRUE | 50000 | 0 |
@@ -65,6 +75,11 @@
 ### Results
+| | |
+| :------------------------------------------------------- | :----------------------------------------------------------- |
+| ![lstm_Nietzsche_32.png](lstm_Nietzsche_32.png) | ![lstm_Nietzsche_128.png](lstm_Nietzsche_128.png) |
+
+
 | Instance | GPUs | Backend | Batch size | Data Set | Training Method | Speed/Epoch (Lower is better) | Unroll Type | No. of samples | Memory(MiB) |
 | ----------- | ---- | ---------- | ---------- | --------- | ---------------- | ----------------------------- | ----------- | -------------- | ----------- |
 | C5.18xLarge | 0 | MXNet | 32 | Nietzsche | fit() | 226s 1ms/step | TRUE | 200285 | 0 |
@@ -113,6 +128,11 @@
 ### Results
+| | |
+| :------------------------------------------------------- | :----------------------------------------------------------- |
+| ![lstm_Wikitext2_32.png](lstm_Wikitext2_32.png) | ![lstm_Wikitext2_128.png](lstm_Wikitext2_128.png) |
+
+
 | Instance | GPUs | Backend | Batch size | Data Set | Training Method | Speed/Epoch (Lower is better) | Unroll Type | No. of samples | Memory(MiB) |
 | ----------- | ---- | ---------- | ---------- | ---------- | ---------------- | ----------------------------- | ----------- | -------------- | ----------- |
 | C5.18xLarge | 0 | MXNet | 32 | WikiText-2 | fit() | 3530s 1ms/step | TRUE | 1562175 | 0 |
diff --git a/benchmark/benchmark_result/lstm_Nietzsche_128.png b/benchmark/benchmark_result/lstm_Nietzsche_128.png
new file mode 100644
index 00000000000..38557278c5e
Binary files /dev/null and b/benchmark/benchmark_result/lstm_Nietzsche_128.png differ
diff --git a/benchmark/benchmark_result/lstm_Nietzsche_32.png b/benchmark/benchmark_result/lstm_Nietzsche_32.png
new file mode 100644
index 00000000000..3e5a3e50413
Binary files /dev/null and b/benchmark/benchmark_result/lstm_Nietzsche_32.png differ
diff --git a/benchmark/benchmark_result/lstm_Synthetic_128.png b/benchmark/benchmark_result/lstm_Synthetic_128.png
new file mode 100644
index 00000000000..d6fbbf553a9
Binary files /dev/null and b/benchmark/benchmark_result/lstm_Synthetic_128.png differ
diff --git a/benchmark/benchmark_result/lstm_Synthetic_32.png b/benchmark/benchmark_result/lstm_Synthetic_32.png
new file mode 100644
index 00000000000..e7da485f4a5
Binary files /dev/null and b/benchmark/benchmark_result/lstm_Synthetic_32.png differ
diff --git a/benchmark/benchmark_result/lstm_Wikitext2_128.png b/benchmark/benchmark_result/lstm_Wikitext2_128.png
new file mode 100644
index 00000000000..ffd517485b7
Binary files /dev/null and b/benchmark/benchmark_result/lstm_Wikitext2_128.png differ
diff --git a/benchmark/benchmark_result/lstm_Wikitext2_32.png b/benchmark/benchmark_result/lstm_Wikitext2_32.png
new file mode 100644
index 00000000000..1e9d9dbc417
Binary files /dev/null and b/benchmark/benchmark_result/lstm_Wikitext2_32.png differ
diff --git a/benchmark/benchmark_result/rnn_mxnet_dataset.png b/benchmark/benchmark_result/rnn_mxnet_dataset.png
deleted file mode 100644
index 3822339c1c5..00000000000
Binary files a/benchmark/benchmark_result/rnn_mxnet_dataset.png and /dev/null differ
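
Review note: the Speed/Epoch cells added to the README are raw Keras progress strings such as `24s 485us/step`, so comparing backends means reading the seconds by eye. The sketch below is not part of the patch. It shows one way to turn those cells into a TensorFlow/MXNet epoch-time ratio, using the Synthetic rows (batch size 128) copied from the table in this diff. The names `epoch_seconds`, `tf_over_mxnet`, and `SYNTHETIC` are hypothetical helpers introduced only for illustration.

```python
import re


def epoch_seconds(cell: str) -> int:
    """Return the whole seconds per epoch from a cell like '24s 485us/step'."""
    match = re.match(r"\s*(\d+)s\b", cell)
    if match is None:
        raise ValueError(f"unrecognized Speed/Epoch cell: {cell!r}")
    return int(match.group(1))


# (instance, GPUs) -> (MXNet cell, TensorFlow cell), copied from the
# "1. Synthetic Dataset" table added to benchmark/README.md in this diff.
SYNTHETIC = {
    ("C5.18xLarge", 0): ("24s 485us/step", "14s 284us/step"),
    ("P3.8xLarge", 1): ("13s 261us/step", "12s 249us/step"),
    ("P3.8xLarge", 4): ("12s 240us/step", "21s 430us/step"),
}


def tf_over_mxnet(table):
    """Map each (instance, GPUs) key to TensorFlow epoch time / MXNet epoch time."""
    return {
        key: epoch_seconds(tf_cell) / epoch_seconds(mx_cell)
        for key, (mx_cell, tf_cell) in table.items()
    }


if __name__ == "__main__":
    for (instance, gpus), ratio in tf_over_mxnet(SYNTHETIC).items():
        print(f"{instance} ({gpus} GPUs): TensorFlow/MXNet epoch time = {ratio:.2f}")
```

A ratio above 1 means the TensorFlow backend took longer per epoch than the MXNet backend for that row. The sketch uses only the whole-second figure, so rows with very short epochs carry coarse rounding.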