This repository has been archived by the owner on Jul 7, 2023. It is now read-only.

tensor2tensor versions & t2t-decoder #598

Closed

jh-hello opened this issue Feb 19, 2018 · 3 comments

jh-hello commented Feb 19, 2018

Hi,
Different tensor2tensor versions produce different decoding results. Is this expected, or is there something I missed?

I trained a transformer_base_single_gpu model using TensorFlow 1.4.1 and tensor2tensor 1.4.2,
and got results like this:
2018-01-31 06:40:12,950 STDOUT INFO:tensorflow:Inference results INPUT: Goodbye world
2018-01-31 06:40:12,950 STDOUT INFO:tensorflow:Inference results OUTPUT: Die Welt Goodbye
2018-01-31 06:40:12,950 STDOUT INFO:tensorflow:Inference results INPUT: Hello world
2018-01-31 06:40:12,951 STDOUT INFO:tensorflow:Inference results OUTPUT: Hallo
This looks good, and I can reproduce the same result when decoding with tensor2tensor 1.4.2.

Using tensor2tensor 1.5.2, the same transformer_base_single_gpu checkpoint produces this result:
INFO:tensorflow:Inference results INPUT: Goodbye world
INFO:tensorflow:Inference results OUTPUT: Sch\xf6ne Welt der Welt
INFO:tensorflow:Inference results INPUT: Hello world
INFO:tensorflow:Inference results OUTPUT: nutzte Welt Welt

  • Sch\xf6ne => Schöne

Other settings and command:
export PROBLEM=translate_ende_wmt8k
export MODEL=transformer
export HPARAMS=transformer_base_single_gpu
export BEAM_SIZE=4
export ALPHA=0.6

t2t-decoder \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
  --decode_from_file=$DECODE_FILE

martinpopel (Contributor) commented

It is not clear whether you re-trained the model with 1.5.2 or used the old checkpoint trained with 1.4.2. In other words, it would be interesting to know whether this is a training-related issue or a decoding-related issue (or a cross-version checkpoint compatibility issue).
In the first case some differences should be expected, but it would be interesting to report whether BLEU on a reasonably large test set changed significantly (as e.g. in #529 for 1.2.9 vs. 1.3.0; other differences were reported for 1.4.1 vs. 1.4.2).
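
A rough sketch of how that comparison could be done with the t2t-bleu script (assuming it is available in both installs; the translation and reference file names below are just placeholders):

# Score the decodes produced by each T2T version against the same reference file.
t2t-bleu --translation=decoded_by_t2t_1.4.2.de --reference=newstest_ref.de
t2t-bleu --translation=decoded_by_t2t_1.5.2.de --reference=newstest_ref.de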

For completeness, could you also report the TF version you used with T2T 1.5.2 (I guess it is TF 1.5)?

jh-hello (Author) commented Feb 20, 2018

The only difference is the T2T version used for decoding (1.4.2 vs. 1.5.2); all other settings are the same.
I trained the model once with T2T 1.4.2 and TF 1.4.1 and used that same checkpoint for decoding (there is no model re-trained with T2T 1.5.2).

In summary,
Training with T2T 1.4.2, TF 1.4.1
Decoding with T2T 1.4.2 and 1.5.2, TF 1.4.1
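
For reference, this is roughly how the two decoders are run side by side from the same checkpoint (a sketch: the virtualenv paths and decoded-output file names are placeholders, and --decode_to_file is assumed to be supported by both versions):

# One isolated environment per T2T version, same TF in both.
virtualenv ~/envs/t2t-1.4.2 && ~/envs/t2t-1.4.2/bin/pip install tensorflow==1.4.1 tensor2tensor==1.4.2
virtualenv ~/envs/t2t-1.5.2 && ~/envs/t2t-1.5.2/bin/pip install tensorflow==1.4.1 tensor2tensor==1.5.2

# Same checkpoint, same flags; only the T2T version differs.
for V in 1.4.2 1.5.2; do
  ~/envs/t2t-$V/bin/t2t-decoder \
    --data_dir=$DATA_DIR \
    --problems=$PROBLEM \
    --model=$MODEL \
    --hparams_set=$HPARAMS \
    --output_dir=$TRAIN_DIR \
    --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
    --decode_from_file=$DECODE_FILE \
    --decode_to_file=decoded_by_t2t_$V.de
done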

rsepassi (Contributor) commented May 4, 2018

It may be that some parameters or functionality in the decode codepath changed between versions.
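
One way to check that (a sketch; it assumes tensor2tensor.utils.decoding.decode_hparams and HParams.to_json are present in both installed versions) is to dump the default decode hparams under each install and diff the two outputs:

# Run under each T2T install (e.g. inside each virtualenv) and diff the results.
python -c "from tensor2tensor.utils import decoding; print(decoding.decode_hparams('beam_size=4,alpha=0.6').to_json())"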
