The -src_seq_length parameter is not working? #2040

Closed
chijianlei opened this issue Apr 15, 2021 · 1 comment

@chijianlei

I'm trying to reproduce the results of "Attention is All You Need" on the WMT'14 EN-DE dataset.
I have followed the discussion in #637, but my OpenNMT version is 2.0.1.
This link tells me I need to set the sequence length to 100. However, the training process throws an exception: "Sequence is 12131 but PositionalEncoding is limited to 5000. See max_len argument."
I have checked the code, and it seems that the sequence length is still too long?
My configuration is as follows:

# Where the samples will be written
save_data: wmt_ende_sp/transformer

# Where the vocab(s) will be written
src_vocab: wmt_ende_sp/transformer.vocab.src
tgt_vocab: wmt_ende_sp/transformer.vocab.tgt

# Prevent overwriting existing files in the folder
overwrite: True
src_seq_length: 100
tgt_seq_length: 100
share_vocab: True

# Corpus opts:
data:
    corpus:
        path_src: wmt_ende_sp/train.en
        path_tgt: wmt_ende_sp/train.de
    valid:
        path_src: wmt_ende_sp/valid.en
        path_tgt: wmt_ende_sp/valid.de

# Vocabulary files that were just created
src_vocab: wmt_ende_sp/transformer.vocab
tgt_vocab: wmt_ende_sp/transformer.vocab
src_vocab_size: 32000
tgt_vocab_size: 32000

# Training
save_model: wmt_ende_sp/tf.model
save_checkpoint_steps: 10000
valid_steps: 10000
train_steps: 200000

# Batching
batch_type: "tokens"
batch_size: 4096
max_generator_batches: 2
accum_count: [4]
accum_steps: [0]

# Optimization
optim: "adam"
learning_rate: 2
warmup_steps: 8000
decay_method: "noam"
adam_beta2: 0.998
max_grad_norm: 0
label_smoothing: 0.1
param_init: 0
param_init_glorot: True
normalization: "tokens"

# Model
encoder_type: transformer
decoder_type: transformer
position_encoding: True
layers: 6
heads: 8
rnn_size: 512
word_vec_size: 512
transformer_ff: 2048
dropout: [0.1]
share_embeddings: True

# Train on a single GPU
world_size: 2
gpu_ranks: [0, 1]

I also tried setting the -src_seq_length_trunc parameter to 100, and training now completes successfully.
Does that mean the -src_seq_length parameter is not working?

@francoishernandez
Member

These flags are used by the filtertoolong transform: https://opennmt.net/OpenNMT-py/FAQ.html#filter-examples-by-length
You need to add this transform to your config (like here, for instance) for it to be enabled.
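
For illustration, a minimal sketch of how this could look with the config posted above, assuming the transform is enabled globally (per the FAQ it can also be listed per-corpus under each dataset in the data block):

# Filter out examples longer than src_seq_length / tgt_seq_length at training time
transforms: [filtertoolong]
src_seq_length: 100
tgt_seq_length: 100

# Paths reused from the config posted in this issue
data:
    corpus:
        path_src: wmt_ende_sp/train.en
        path_tgt: wmt_ende_sp/train.de
    valid:
        path_src: wmt_ende_sp/valid.en
        path_tgt: wmt_ende_sp/valid.de

Without a transform listed, src_seq_length / tgt_seq_length are simply never applied, which matches the behaviour described in the question.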
