
Commit

Merge pull request #934 from vince62s/fix-doc
Fix doc, add changelog, bump version
vince62s authored Aug 31, 2018
2 parents e60a54f + 7974ec9 commit 6db7ec1
Showing 4 changed files with 64 additions and 8 deletions.
42 changes: 42 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,42 @@

**Notes on versioning**


## [Unreleased]

### New features

### Fixes and improvements


## [0.2.1](https://github.com/OpenNMT/OpenNMT-py/tree/v0.2.1) (2018-08-31)

### Fixes and improvements

* First compatibility steps with PyTorch 0.4.1 (non-breaking)
* Fix TranslationServer (when multiple requests try to load the same model at the same time)
* Fix StopIteration error (Python 3.7)

### New features
* Ensemble at inference (thanks @Waino)

## [0.2](https://github.com/OpenNMT/OpenNMT-py/tree/v0.2) (2018-08-28)

### Improvements

* Compatibility fixes with PyTorch 0.4 / Torchtext 0.3
* Multi-GPU based on Torch Distributed
* Average Attention Network (AAN) for the Transformer (thanks @francoishernandez)
* New fast beam search (see -fast in translate.py) (thanks @guillaumekln)
* Sparse attention / sparsemax (thanks to @bpopeters)
* Refactoring of many parts of the code base:
- change from -epoch to -train_steps / -valid_steps (see opts.py)
    - reorganization of the training logic: train => train_multi / train_single => trainer
* Many fixes and improvements in the translation server (thanks @pltrdy @francoishernandez)
* Fix BPTT

## [0.1](https://github.com/OpenNMT/OpenNMT-py/tree/v0.1) (2018-06-08)

### First and Last Release using PyTorch 0.3.x


7 changes: 5 additions & 2 deletions README.md
@@ -41,17 +41,17 @@ The following OpenNMT features are implemented:
- [Data preprocessing](http://opennmt.net/OpenNMT-py/options/preprocess.html)
- [Inference (translation) with batching and beam search](http://opennmt.net/OpenNMT-py/options/translate.html)
- [Multiple source and target RNN (lstm/gru) types and attention (dotprod/mlp) types](http://opennmt.net/OpenNMT-py/options/train.html#model-encoder-decoder)
- - [TensorBoard/Crayon logging](http://opennmt.net/OpenNMT-py/options/train.html#logging)
+ - [TensorBoard](http://opennmt.net/OpenNMT-py/options/train.html#logging)
- [Source word features](http://opennmt.net/OpenNMT-py/options/train.html#model-embeddings)
- [Pretrained Embeddings](http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-pretrained-embeddings-e-g-glove)
- [Copy and Coverage Attention](http://opennmt.net/OpenNMT-py/options/train.html#model-attention)
- [Image-to-text processing](http://opennmt.net/OpenNMT-py/im2text.html)
- [Speech-to-text processing](http://opennmt.net/OpenNMT-py/speech2text.html)
- ["Attention is all you need"](http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model)
- [Multi-GPU](http://opennmt.net/OpenNMT-py/FAQ.html#do-you-support-multi-gpu)
- Inference-time loss functions.

Beta Features (committed):
- multi-GPU
- Structured attention
- [Conv2Conv convolution model]
- SRU (the "RNNs faster than CNN" paper)
@@ -131,6 +131,9 @@ http://opennmt.net/Models-py/

## Citation

[OpenNMT: Neural Machine Translation Toolkit](https://arxiv.org/pdf/1805.11462)


[OpenNMT technical report](https://doi.org/10.18653/v1/P17-4012)

```
21 changes: 16 additions & 5 deletions docs/source/FAQ.md
@@ -70,13 +70,13 @@ setup. We have confirmed the following command can replicate their WMT results.

```
python train.py -data /tmp/de2/data -save_model /tmp/extra -gpuid 1 \
- -layers 6 -rnn_size 512 -word_vec_size 512 \
+ -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8 \
-encoder_type transformer -decoder_type transformer -position_encoding \
- -train_steps 100000 -max_generator_batches 32 -dropout 0.1 \
- -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 4 \
+ -train_steps 200000 -max_generator_batches 2 -dropout 0.1 \
+ -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
-optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 2 \
-max_grad_norm 0 -param_init 0 -param_init_glorot \
- -label_smoothing 0.1
+ -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 -gpuid 0 1 2 3
```

Here is what each parameter means:
@@ -87,9 +87,20 @@ Here is what each parameter means:
* `batch_type tokens`, `normalization tokens`, `accum_count 4`: batch and normalize based on the number of tokens rather than sentences, and accumulate gradients over four batches before updating.
* `label_smoothing 0.1`: use label smoothing loss.

* `gpuid 0 1 2 3`, `accum_count 2`: use 4 GPUs and accumulate gradients over 2 batches before updating parameters; this emulates training on 8 GPUs (see the arithmetic sketch below).
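
As a quick back-of-the-envelope check (a sketch; it assumes `-batch_size` counts tokens per GPU per batch):

```
# effective tokens per parameter update = batch_size x n_gpus x accum_count
echo $((4096 * 4 * 2))   # 32768 tokens, the same as 8 GPUs x 4096 tokens each
```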


## Do you support multi-gpu?

- Currently our system does not support multi-gpu. It will be coming soon.
+ Yes! First make sure you export `CUDA_VISIBLE_DEVICES=0,1,2,3`, then use `-gpuid 0 1 2 3`.
+ If you want to use GPUs 1 and 3 as numbered by your OS, export `CUDA_VISIBLE_DEVICES=1,3` and then use `-gpuid 0 1`.
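
For example, to train on the first four GPUs (a minimal sketch; the data and model paths reuse the placeholders from the command above):

```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python train.py -data /tmp/de2/data -save_model /tmp/extra -gpuid 0 1 2 3
```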

## How can I ensemble models at inference?

You can specify several models on the translate.py command line: `-model model1_seed1 model2_seed2`.
Bear in mind that your models must share the same target vocabulary.
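
For example (a minimal sketch; the checkpoint and file names are placeholders):

```
python translate.py -model model1_seed1.pt model2_seed2.pt \
    -src test.txt -output pred.txt
```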


2 changes: 1 addition & 1 deletion setup.py
@@ -4,7 +4,7 @@

setup(name='OpenNMT-py',
description='A python implementation of OpenNMT',
- version='0.2',
+ version='0.2.1',

packages=['onmt', 'onmt.encoders', 'onmt.modules', 'onmt.tests',
'onmt.translate', 'onmt.decoders', 'onmt.inputters',
