Added Detailed RNN results (#77)
* Modified RNN benchmark document

* Added minor comments

* Fixed broken image link
karan6181 authored and sandeep-krishnamurthy committed Jun 15, 2018
1 parent 8edb8bf commit bd97766
Showing 3 changed files with 53 additions and 19 deletions.
52 changes: 33 additions & 19 deletions benchmark/README.md
@@ -94,50 +94,64 @@ Example Usage:

We used the benchmark scripts from the official [TensorFlow Benchmark](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks/scripts/keras_benchmarks) repo and modified them slightly for our use case.

Set `lstm_synthetic` as the `models` parameter in `run_<backend-type>_backend.sh`.
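
For orientation, a minimal stand-in for such a synthetic run might look like the sketch below. This is not the keras_benchmarks script itself; the input shape, hidden size, and sample count are placeholder assumptions.

```python
import time

import numpy as np
from keras.layers import LSTM, Dense
from keras.models import Sequential

# Placeholder dimensions -- assumptions for illustration, not the
# values used by the keras_benchmarks scripts.
num_samples, timesteps, features, classes = 10000, 50, 32, 10

x = np.random.random((num_samples, timesteps, features))
y = np.random.randint(0, classes, size=(num_samples, 1))

model = Sequential()
# unroll=True matches the setting reported in the results below.
model.add(LSTM(128, input_shape=(timesteps, features), unroll=True))
model.add(Dense(classes, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop')

epochs = 4
start = time.time()
model.fit(x, y, batch_size=128, epochs=epochs)
print('Time per epoch: %.1fs' % ((time.time() - start) / epochs))
```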

### Nietzsche Dataset

We used the official Keras LSTM example script [lstm_text_generation.py](https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py) and modified it slightly for our use case.

Set `lstm_nietzsche` as the `models` parameter in `run_<backend-type>_backend.sh`.
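
Condensed, that example slices the corpus into fixed-length character windows, one-hot encodes them, and fits a small LSTM. A rough sketch along the lines of the upstream script (with `unroll=True` added, matching the setup described in the results below; exact hyperparameters may differ):

```python
import io

import numpy as np
from keras.layers import LSTM, Dense
from keras.models import Sequential
from keras.utils.data_utils import get_file

path = get_file('nietzsche.txt',
                origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
text = io.open(path, encoding='utf-8').read().lower()
chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}

maxlen, step = 40, 3  # window length and stride, as in the upstream example
sentences = [text[i: i + maxlen] for i in range(0, len(text) - maxlen, step)]
next_chars = [text[i + maxlen] for i in range(0, len(text) - maxlen, step)]

# One-hot encode: x is (samples, maxlen, vocab), y is (samples, vocab).
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars)), unroll=True))
model.add(Dense(len(chars), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.fit(x, y, batch_size=128, epochs=1)
```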

### WikiText-2 Dataset

We used the official WikiText-2 character-level dataset from this [link](https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset).

The `lstm_text_generation.py` script uses the dataset hosted on an S3 bucket at this [link](https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip) (the raw character-level WikiText-2 data).

Set `lstm_wikitext2` as the `models` parameter in `run_<backend-type>_backend.sh`.
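
A sketch of how that archive could be loaded for character-level training follows; the internal archive layout (`wikitext-2-raw/wiki.train.raw`) is an assumption, not verified against the benchmark scripts.

```python
import io
import zipfile

from keras.utils.data_utils import get_file

# Download the raw character-level archive referenced above.
path = get_file(
    'wikitext-2-raw-v1.zip',
    origin='https://s3.amazonaws.com/research.metamind.io/wikitext/'
           'wikitext-2-raw-v1.zip')

# Assumed archive layout: wikitext-2-raw/wiki.{train,valid,test}.raw
with zipfile.ZipFile(path) as archive:
    with archive.open('wikitext-2-raw/wiki.train.raw') as f:
        text = io.TextIOWrapper(f, encoding='utf-8').read()

print('Corpus length: %d characters' % len(text))
# Windowing and one-hot encoding then proceed exactly as in the
# Nietzsche sketch above.
```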

### RNN Benchmark Results

Here, we list the results on the Synthetic, Nietzsche, and WikiText-2 datasets using a Sequential (LSTM) model on Amazon AWS C5.18xLarge (CPU), C5.xLarge (CPU), and P3.8xLarge (1 and 4 GPU) instances with the MXNet and TensorFlow backends. The batch size is 128, and `unroll=True` is set on the recurrent layer. For more details about the instance configurations, please refer to [P3](https://aws.amazon.com/ec2/instance-types/p3/) and [C5](https://aws.amazon.com/ec2/instance-types/c5/).
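
The 4-GPU rows use data-parallel training. One way to express this in Keras 2.1.5 with the TensorFlow backend is `multi_gpu_model`, sketched below; whether these benchmarks used this utility or a backend-specific mechanism (keras-mxnet exposes its own multi-GPU path) is an assumption.

```python
from keras.layers import LSTM, Dense
from keras.models import Sequential
from keras.utils import multi_gpu_model

# Placeholder shapes -- assumptions for illustration only.
timesteps, features, classes = 40, 60, 60

model = Sequential()
model.add(LSTM(128, input_shape=(timesteps, features), unroll=True))
model.add(Dense(classes, activation='softmax'))

# Replicate the model across 4 GPUs; each batch of 128 is split into
# 4 sub-batches of 32, one per device, and gradients are merged on CPU.
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# parallel_model.fit(x, y, batch_size=128, epochs=4)
```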

For more detailed benchmark results, please refer to [RNN results.](benchmark_result/RNN_result.md)

| Framework/Library | Version |
| :----------------- | :------- |
| Keras | 2.1.5 |
| MXNet | 1.1.0 |
| TensorFlow | 1.7.0 |
| CUDA | 9.0.176 |

#### 1. Synthetic Dataset

![lstm_Synthetic_128.png](benchmark_result/lstm_Synthetic_128.png)

| Instance | GPUs | MXNet Backend<br />Speed/Epoch | TensorFlow Backend<br />Speed/Epoch |
| :----------- | :---- | :------------------------------ | :----------------------------------- |
| C5.18xLarge | 0 | 24s 485us/step | 14s 284us/step |
| P3.8xLarge | 1 | 13s 261us/step | 12s 249us/step |
| P3.8xLarge | 4 | 12s 240us/step | 21s 430us/step |

#### 2. Nietzsche Dataset

![lstm_Nietzsche_128.png](benchmark_result/lstm_Nietzsche_128.png)

| Instance | GPUs | MXNet Backend<br />Speed/Epoch | TensorFlow Backend<br />Speed/Epoch |
| :----------- | :---- | :------------------------------ | :----------------------------------- |
| C5.18xLarge | 0 | 78s 389us/step | 55s 273us/step |
| P3.8xLarge | 1 | 52s 262us/step | 51s 252us/step |
| P3.8xLarge | 4 | 47s 235us/step | 87s 435us/step |

#### 3. WikiText-2 Dataset

![lstm_Wikitext2_128.png](benchmark_result/lstm_Wikitext2_128.png)

| Instance | GPUs | MXNet Backend<br />Speed/Epoch | TensorFlow Backend<br />Speed/Epoch |
| :----------- | :---- | :------------------------------ | :----------------------------------- |
| C5.18xLarge | 0 | 1345s 398us/step | 875s 259us/step |
| P3.8xLarge | 1 | 868s 257us/step | 817s 242us/step |
| P3.8xLarge | 4 | 775s 229us/step | 1468s 434us/step |

## Credits

Synthetic Data scripts modified from …
20 changes: 20 additions & 0 deletions benchmark/benchmark_result/RNN_result.md
@@ -1,5 +1,10 @@
# Detailed RNN Benchmark Results

Below are the GPU memory usage results from running the LSTM model on the Synthetic, Nietzsche, and WikiText-2 character-level datasets.

![MemoryConsumption.png](MemoryConsumption.png)

Note: All the data in the performance diagrams shown below is taken from the rows with `Unroll Type = TRUE`.
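
The Memory (MiB) columns in the tables below report GPU memory during training. One way such numbers can be sampled (assuming `nvidia-smi` is available on the instance; the exact collection method is not recorded here):

```python
import subprocess

def gpu_memory_used_mib():
    """Return used memory per GPU in MiB, as reported by nvidia-smi."""
    out = subprocess.check_output([
        'nvidia-smi',
        '--query-gpu=memory.used',
        '--format=csv,noheader,nounits',
    ]).decode()
    return [int(line) for line in out.strip().splitlines()]

# e.g. [1848, 0, 0, 0] on a 4-GPU instance with one GPU in use
print(gpu_memory_used_mib())
```
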
## Synthetic Dataset

### Configuration
@@ -17,6 +22,11 @@

### Results

| Batch size 32 | Batch size 128 |
| :------------------------------------------------------- | :----------------------------------------------------------- |
| ![lstm_Synthetic_32.png](lstm_Synthetic_32.png) | ![lstm_Synthetic_128.png](lstm_Synthetic_128.png) |


| Instance | GPUs | Backend | Batch size | Data Set | Training Method | Speed/Epoch (Lower is better) | Unroll Type | No. of samples | Memory(MiB) |
| ----------- | ---- | ---------- | ---------- | --------- | ---------------- | ----------------------------- | ----------- | -------------- | ----------- |
| C5.18xLarge | 0 | MXNet | 32 | Synthetic | fit() | 50s 1ms/step | TRUE | 50000 | 0 |
@@ -65,6 +75,11 @@

### Results

| Batch size 32 | Batch size 128 |
| :------------------------------------------------------- | :----------------------------------------------------------- |
| ![lstm_Nietzsche_32.png](lstm_Nietzsche_32.png) | ![lstm_Nietzsche_128.png](lstm_Nietzsche_128.png) |


| Instance | GPUs | Backend | Batch size | Data Set | Training Method | Speed/Epoch (Lower is better) | Unroll Type | No. of samples | Memory(MiB) |
| ----------- | ---- | ---------- | ---------- | --------- | ---------------- | ----------------------------- | ----------- | -------------- | ----------- |
| C5.18xLarge | 0 | MXNet | 32 | Nietzsche | fit() | 226s 1ms/step | TRUE | 200285 | 0 |
@@ -113,6 +128,11 @@

### Results

| Batch size 32 | Batch size 128 |
| :------------------------------------------------------- | :----------------------------------------------------------- |
| ![lstm_Wikitext2_32.png](lstm_Wikitext2_32.png) | ![lstm_Wikitext2_128.png](lstm_Wikitext2_128.png) |


| Instance | GPUs | Backend | Batch size | Data Set | Training Method | Speed/Epoch (Lower is better) | Unroll Type | No. of samples | Memory(MiB) |
| ----------- | ---- | ---------- | ---------- | ---------- | ---------------- | ----------------------------- | ----------- | -------------- | ----------- |
| C5.18xLarge | 0 | MXNet | 32 | WikiText-2 | fit() | 3530s 1ms/step | TRUE | 1562175 | 0 |
Binary file removed benchmark/benchmark_result/rnn_mxnet_dataset.png