This repository has been archived by the owner on Jul 7, 2023. It is now read-only.

tensor2tensor versions & t2t-decoder #598

Closed

jh-hello opened this issue Feb 19, 2018 · 3 comments

jh-hello commented Feb 19, 2018

Hi,
Different tensor2tensor versions produce different decoding results. Is this expected, or is there something I missed?

I trained a transformer_base_single_gpu model using TensorFlow 1.4.1 and tensor2tensor 1.4.2,
and got results like this:
2018-01-31 06:40:12,950 STDOUT INFO:tensorflow:Inference results INPUT: Goodbye world
2018-01-31 06:40:12,950 STDOUT INFO:tensorflow:Inference results OUTPUT: Die Welt Goodbye
2018-01-31 06:40:12,950 STDOUT INFO:tensorflow:Inference results INPUT: Hello world
2018-01-31 06:40:12,951 STDOUT INFO:tensorflow:Inference results OUTPUT: Hallo
This looks good, and I can reproduce the same result when decoding with tensor2tensor 1.4.2.

Using tensor2tensor 1.5.2, the same transformer_base_single_gpu checkpoint produces this result:
INFO:tensorflow:Inference results INPUT: Goodbye world
INFO:tensorflow:Inference results OUTPUT: Sch\xf6ne Welt der Welt
INFO:tensorflow:Inference results INPUT: Hello world
INFO:tensorflow:Inference results OUTPUT: nutzte Welt Welt

  • Sch\xf6ne => Schöne

Other settings and command:
export PROBLEM=translate_ende_wmt8k
export MODEL=transformer
export HPARAMS=transformer_base_single_gpu
export BEAM_SIZE=4
export ALPHA=0.6

t2t-decoder \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
  --decode_from_file=$DECODE_FILE

martinpopel (Contributor) commented

It is not clear whether you re-trained the model with 1.5.2 or used the old checkpoint trained with 1.4.2. In other words, it would be interesting to know whether this is a training-related issue or a decoding-related issue (or a cross-version checkpoint compatibility issue).
In the first case some differences should be expected, but it would be interesting to report whether BLEU on a reasonably large test set changed significantly (as e.g. in #529 for 1.2.9 vs. 1.3.0; other differences were reported for 1.4.1 vs. 1.4.2).
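
A rough sketch of how that comparison could be done with the t2t-bleu script (assuming it is available in both installs; the translation and reference file names below are just placeholders):

# Score the decodes produced by each T2T version against the same reference file.
t2t-bleu --translation=decoded_by_t2t_1.4.2.de --reference=newstest_ref.de
t2t-bleu --translation=decoded_by_t2t_1.5.2.de --reference=newstest_ref.de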

For completeness, could you also report the TF version you used with T2T 1.5.2 (I guess it is TF 1.5)?

jh-hello (Author) commented Feb 20, 2018

The only difference is the T2T version used for decoding (1.4.2 vs. 1.5.2); all other settings are the same.
I trained the model once with T2T 1.4.2 and TF 1.4.1 and used that same checkpoint for decoding (there is no model re-trained with T2T 1.5.2).

In summary,
Training with T2T 1.4.2, TF 1.4.1
Decoding with T2T 1.4.2 and 1.5.2, TF 1.4.1
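
For reference, this is roughly how the two decoders are run side by side from the same checkpoint (a sketch: the virtualenv paths and decoded-output file names are placeholders, and --decode_to_file is assumed to be supported by both versions):

# One isolated environment per T2T version, same TF in both.
virtualenv ~/envs/t2t-1.4.2 && ~/envs/t2t-1.4.2/bin/pip install tensorflow==1.4.1 tensor2tensor==1.4.2
virtualenv ~/envs/t2t-1.5.2 && ~/envs/t2t-1.5.2/bin/pip install tensorflow==1.4.1 tensor2tensor==1.5.2

# Same checkpoint, same flags; only the T2T version differs.
for V in 1.4.2 1.5.2; do
  ~/envs/t2t-$V/bin/t2t-decoder \
    --data_dir=$DATA_DIR \
    --problems=$PROBLEM \
    --model=$MODEL \
    --hparams_set=$HPARAMS \
    --output_dir=$TRAIN_DIR \
    --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
    --decode_from_file=$DECODE_FILE \
    --decode_to_file=decoded_by_t2t_$V.de
done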

rsepassi (Contributor) commented May 4, 2018

It may be that some parameters or functionality in the decode codepath changed between versions.
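
One way to check that (a sketch; it assumes tensor2tensor.utils.decoding.decode_hparams and HParams.to_json are present in both installed versions) is to dump the default decode hparams under each install and diff the two outputs:

# Run under each T2T install (e.g. inside each virtualenv) and diff the results.
python -c "from tensor2tensor.utils import decoding; print(decoding.decode_hparams('beam_size=4,alpha=0.6').to_json())"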
