I'm trying to reproduce the results of "Attention is All You Need" on the WMT'14 EN-DE dataset.
I have followed the discussion in #637, but my OpenNMT version is 2.0.1.
This link tells me I need to set the sequence length to 100. However, training throws an exception: "Sequence is 12131 but PositionalEncoding is limited to 5000. See max_len argument."
I have checked the code, and it seems the sequence length is still too long.
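As far as I understand, the sinusoidal position table is precomputed once up to a fixed max_len (5000 by default), and the exception is raised as soon as a batch is longer than that table. A rough sketch of that behaviour (illustrative only, not the exact OpenNMT-py code; SinusoidalPositionalEncoding is just a name I made up here):

```python
import math
import torch
import torch.nn as nn

class SinusoidalPositionalEncoding(nn.Module):
    """Precomputes a sin/cos position table up to max_len positions."""

    def __init__(self, dim, max_len=5000):
        super().__init__()
        pe = torch.zeros(max_len, dim)
        position = torch.arange(0, max_len).unsqueeze(1).float()
        div_term = torch.exp(torch.arange(0, dim, 2).float()
                             * -(math.log(10000.0) / dim))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(1))  # (max_len, 1, dim)

    def forward(self, emb):
        # emb: (seq_len, batch, dim) -- token embeddings for one batch
        seq_len = emb.size(0)
        if seq_len > self.pe.size(0):
            # The situation I hit: a 12131-token example vs. max_len=5000.
            raise ValueError(
                f"Sequence is {seq_len} but PositionalEncoding is limited to "
                f"{self.pe.size(0)}. See max_len argument."
            )
        return emb + self.pe[:seq_len]
```

So if a 12131-token example reaches the model, the length limit I set apparently never removed it.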
My configuration is as follows:
# Where the samples will be written
save_data: wmt_ende_sp/transformer
# Where the vocab(s) will be written
src_vocab: wmt_ende_sp/transformer.vocab.src
tgt_vocab: wmt_ende_sp/transformer.vocab.tgt
# Prevent overwriting existing files in the folder
overwrite: True

src_seq_length: 100
tgt_seq_length: 100
share_vocab: True

# Corpus opts:
data:
    corpus:
        path_src: wmt_ende_sp/train.en
        path_tgt: wmt_ende_sp/train.de
    valid:
        path_src: wmt_ende_sp/valid.en
        path_tgt: wmt_ende_sp/valid.de

# Vocabulary files that were just created
src_vocab: wmt_ende_sp/transformer.vocab
tgt_vocab: wmt_ende_sp/transformer.vocab
src_vocab_size: 32000
tgt_vocab_size: 32000

# Training
save_model: wmt_ende_sp/tf.model
save_checkpoint_steps: 10000
valid_steps: 10000
train_steps: 200000

# Batching
batch_type: "tokens"
batch_size: 4096
max_generator_batches: 2
accum_count: [4]
accum_steps: [0]

# Optimization
optim: "adam"
learning_rate: 2
warmup_steps: 8000
decay_method: "noam"
adam_beta2: 0.998
max_grad_norm: 0
label_smoothing: 0.1
param_init: 0
param_init_glorot: True
normalization: "tokens"

# Model
encoder_type: transformer
decoder_type: transformer
position_encoding: True
layers: 6
heads: 8
rnn_size: 512
word_vec_size: 512
transformer_ff: 2048
dropout: [0.1]
share_embeddings: True

# Train on two GPUs
world_size: 2
gpu_ranks: [0, 1]
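As a side note, to see how long the longest training examples actually are, a quick count of whitespace-separated tokens per line (paths taken from the config above) is enough:

```python
# Rough sanity check: length (in whitespace-separated pieces) of the longest
# line in each training file referenced by the config above.
for path in ["wmt_ende_sp/train.en", "wmt_ende_sp/train.de"]:
    longest = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            longest = max(longest, len(line.split()))
    print(path, "longest line:", longest, "tokens")
```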
I also tried setting the parameter -src_seq_length_trunc to 100, and training now completes successfully.
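For what it's worth, my mental model of the two options is roughly the following (a plain-Python sketch of what I expect them to do, not the actual OpenNMT-py code): filtering should drop a too-long example entirely, while truncation keeps it but cuts it down to the limit.

```python
def filter_too_long(src_tokens, tgt_tokens, src_seq_length=100, tgt_seq_length=100):
    """What I expect src_seq_length / tgt_seq_length to do: skip the example."""
    if len(src_tokens) > src_seq_length or len(tgt_tokens) > tgt_seq_length:
        return None  # example is dropped from training
    return src_tokens, tgt_tokens


def truncate_src(src_tokens, src_seq_length_trunc=100):
    """What src_seq_length_trunc appears to do: keep the example, shortened."""
    return src_tokens[:src_seq_length_trunc]
```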
Does that mean the -src_seq_length parameter is not working?