
question about the bleu results #285

Closed

wingsyuan opened this issue Dec 5, 2018 · 1 comment

@wingsyuan
Hi, I used OpenNMT-tf and OpenNMT-py to train en-zh models on the same training dataset and translated the same test dataset with both, but when I evaluate the two outputs the scores are very different:

The BLEU for OpenNMT-py:
perl ../opennmt-py/tools/multi-bleu.perl untest-test-muti.zh.token.bpe < untest-test-muti.zh.pred.txt
BLEU = 20.42, 59.7/30.6/18.6/12.3 (BP=0.803, ratio=0.820, hyp_len=226100, ref_len=275682)

The BLEU for OpenNMT-tf:
perl ../opennmt-py/tools/multi-bleu.perl untest-test-muti.zh.token.bpe < untest-test-muti.en2zh
BLEU = 27.00, 61.7/34.6/22.5/15.8 (BP=0.915, ratio=0.919, hyp_len=253250, ref_len=275682)
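
As a side note on the numbers above: multi-bleu.perl applies a brevity penalty BP = exp(1 - ref_len/hyp_len) when the hypothesis is shorter than the reference, so the length ratios already explain part of the gap. A quick check of the reported values, just to confirm the arithmetic:

python -c "import math; print(math.exp(1 - 275682.0/226100))"   # ~0.803, the reported OpenNMT-py BP
python -c "import math; print(math.exp(1 - 275682.0/253250))"   # ~0.915, the reported OpenNMT-tf BP

The OpenNMT-py output is noticeably shorter (ratio 0.820 vs 0.919), which points at the decoding length settings as much as at training.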

Why? Am I missing something? And how can I tune OpenNMT-py to get a similar result? Thanks very much!

Process using opennmt-py:
python ../OpenNMT-py-master/preprocess.py \
    -train_src ../data/bpe/untest-src-muti.en.token.bpe \
    -train_tgt ../data/bpe/untest-tgt-muti.zh.token.bpe \
    -valid_src ../data/bpe/untest-vali-muti.en.token.bpe \
    -valid_tgt ../data/bpe/untest-vali-muti.zh.token.bpe \
    -save_data data/data

python train.py -data ../data/data -save_model ../test2/model \
    -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8 \
    -encoder_type transformer -decoder_type transformer -position_encoding \
    -train_steps 30000 -max_generator_batches 2 -dropout 0.1 \
    -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
    -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 \
    -learning_rate 2 -max_grad_norm 0 -param_init 0 -param_init_glorot \
    -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 \
    -world_size 4 -gpu_ranks 0 1 2 3 &
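
One difference that is easy to overlook when comparing the two toolkits is the number of updates: the command above stops at 30000 steps, while the reference Transformer recipes for OpenNMT-py are usually run much longer. If that turns out to matter here, training can be resumed from the last checkpoint with -train_from; a rough sketch, where the 200000-step figure is only illustrative and the architecture options are assumed to be restored from the checkpoint rather than re-specified:

python train.py -data ../data/data -save_model ../test2/model \
    -train_from ../test2/model_step_30000.pt -train_steps 200000 \
    -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
    -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 \
    -learning_rate 2 -max_grad_norm 0 -label_smoothing 0.1 \
    -valid_steps 10000 -save_checkpoint_steps 10000 \
    -world_size 4 -gpu_ranks 0 1 2 3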

python ../opennmt-py/translate.py -model model_step_30000.pt \
    -src untest-test-muti.en.token.bpe \
    -output untest-test-muti.zh.pred.txt \
    -replace_unk -verbose -gpu 4
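
Given the short hypotheses (BP=0.803), the decoding options are also worth checking: OpenNMT-tf's --auto_config enables a length-penalized beam search for the Transformer, while the translate.py call above uses the defaults. A hedged sketch of a closer match, using OpenNMT-py's translate options from that era; the beam size of 4 and alpha of 0.6 are assumptions about what auto_config typically sets, not values taken from this issue:

python ../opennmt-py/translate.py -model model_step_30000.pt \
    -src untest-test-muti.en.token.bpe \
    -output untest-test-muti.zh.pred.txt \
    -beam_size 4 -length_penalty wu -alpha 0.6 \
    -replace_unk -verbose -gpu 4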

Process using opennmt-tf:

onmt-main train_and_eval --model_type Transformer --config data.yml --auto_config --session_config gpuconfig --num_gpus 4

onmt-main infer --auto_config --config data.yml --features_file ../data/bpe/untest-test-muti.en.token.bpe >untest-test-muti.en2zh
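
For completeness, the data.yml referenced above is not shown in the issue. A minimal sketch of what it might contain for an OpenNMT-tf v1 run of this kind, reusing the file paths from the preprocess command; the model directory and vocabulary file names are assumptions:

cat > data.yml <<'EOF'
model_dir: run/

data:
  train_features_file: ../data/bpe/untest-src-muti.en.token.bpe
  train_labels_file: ../data/bpe/untest-tgt-muti.zh.token.bpe
  eval_features_file: ../data/bpe/untest-vali-muti.en.token.bpe
  eval_labels_file: ../data/bpe/untest-vali-muti.zh.token.bpe
  source_words_vocabulary: ../data/bpe/src-vocab.txt
  target_words_vocabulary: ../data/bpe/tgt-vocab.txt
EOF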

@guillaumekln (Contributor)

Let's keep the discussion in one location:

OpenNMT/OpenNMT-py#1093
