
question about the bleu results #285

Closed

wingsyuan opened this issue Dec 5, 2018 · 1 comment

@wingsyuan
Hi, I used OpenNMT-tf and OpenNMT-py to train en-zh models on the same training dataset and translated the same test dataset with both, but when I evaluate the two outputs the scores are very different:

The BLEU for OpenNMT-py:
perl ../opennmt-py/tools/multi-bleu.perl untest-test-muti.zh.token.bpe < untest-test-muti.zh.pred.txt
BLEU = 20.42, 59.7/30.6/18.6/12.3 (BP=0.803, ratio=0.820, hyp_len=226100, ref_len=275682)

The BLEU for OpenNMT-tf:
perl ../opennmt-py/tools/multi-bleu.perl untest-test-muti.zh.token.bpe < untest-test-muti.en2zh
BLEU = 27.00, 61.7/34.6/22.5/15.8 (BP=0.915, ratio=0.919, hyp_len=253250, ref_len=275682)
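
As a side note on the numbers above: multi-bleu.perl applies a brevity penalty BP = exp(1 - ref_len/hyp_len) when the hypothesis is shorter than the reference, so the length ratios already explain part of the gap. A quick check of the reported values, just to confirm the arithmetic:

python -c "import math; print(math.exp(1 - 275682.0/226100))"   # ~0.803, the reported OpenNMT-py BP
python -c "import math; print(math.exp(1 - 275682.0/253250))"   # ~0.915, the reported OpenNMT-tf BP

The OpenNMT-py output is noticeably shorter (ratio 0.820 vs 0.919), which points at the decoding length settings as much as at training.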

Why? Am I missing something? And how can I tune OpenNMT-py to get a similar result? Thanks very much!

Process using opennmt-py:
python ../OpenNMT-py-master/preprocess.py \
    -train_src ../data/bpe/untest-src-muti.en.token.bpe \
    -train_tgt ../data/bpe/untest-tgt-muti.zh.token.bpe \
    -valid_src ../data/bpe/untest-vali-muti.en.token.bpe \
    -valid_tgt ../data/bpe/untest-vali-muti.zh.token.bpe \
    -save_data data/data

python train.py -data ../data/data -save_model ../test2/model \
    -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8 \
    -encoder_type transformer -decoder_type transformer -position_encoding \
    -train_steps 30000 -max_generator_batches 2 -dropout 0.1 \
    -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
    -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 \
    -learning_rate 2 -max_grad_norm 0 -param_init 0 -param_init_glorot \
    -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 \
    -world_size 4 -gpu_ranks 0 1 2 3 &
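
One difference that is easy to overlook when comparing the two toolkits is the number of updates: the command above stops at 30000 steps, while the reference Transformer recipes for OpenNMT-py are usually run much longer. If that turns out to matter here, training can be resumed from the last checkpoint with -train_from; a rough sketch, where the 200000-step figure is only illustrative and the architecture options are assumed to be restored from the checkpoint rather than re-specified:

python train.py -data ../data/data -save_model ../test2/model \
    -train_from ../test2/model_step_30000.pt -train_steps 200000 \
    -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
    -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 \
    -learning_rate 2 -max_grad_norm 0 -label_smoothing 0.1 \
    -valid_steps 10000 -save_checkpoint_steps 10000 \
    -world_size 4 -gpu_ranks 0 1 2 3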

python ../opennmt-py/translate.py -model model_step_30000.pt \
    -src untest-test-muti.en.token.bpe \
    -output untest-test-muti.zh.pred.txt \
    -replace_unk -verbose -gpu 4
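
Given the short hypotheses (BP=0.803), the decoding options are also worth checking: OpenNMT-tf's --auto_config enables a length-penalized beam search for the Transformer, while the translate.py call above uses the defaults. A hedged sketch of a closer match, using OpenNMT-py's translate options from that era; the beam size of 4 and alpha of 0.6 are assumptions about what auto_config typically sets, not values taken from this issue:

python ../opennmt-py/translate.py -model model_step_30000.pt \
    -src untest-test-muti.en.token.bpe \
    -output untest-test-muti.zh.pred.txt \
    -beam_size 4 -length_penalty wu -alpha 0.6 \
    -replace_unk -verbose -gpu 4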

Process using opennmt-tf:

onmt-main train_and_eval --model_type Transformer --config data.yml --auto_config --session_config gpuconfig --num_gpus 4

onmt-main infer --auto_config --config data.yml --features_file ../data/bpe/untest-test-muti.en.token.bpe >untest-test-muti.en2zh
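
For completeness, the data.yml referenced above is not shown in the issue. A minimal sketch of what it might contain for an OpenNMT-tf v1 run of this kind, reusing the file paths from the preprocess command; the model directory and vocabulary file names are assumptions:

cat > data.yml <<'EOF'
model_dir: run/

data:
  train_features_file: ../data/bpe/untest-src-muti.en.token.bpe
  train_labels_file: ../data/bpe/untest-tgt-muti.zh.token.bpe
  eval_features_file: ../data/bpe/untest-vali-muti.en.token.bpe
  eval_labels_file: ../data/bpe/untest-vali-muti.zh.token.bpe
  source_words_vocabulary: ../data/bpe/src-vocab.txt
  target_words_vocabulary: ../data/bpe/tgt-vocab.txt
EOF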

@guillaumekln (Contributor)

Let's keep the discussion in one location:

OpenNMT/OpenNMT-py#1093
