Multilingual IWSLT tst2017 test set is broken #15

stephanpeitz · 2020-01-24T21:28:44Z

Hi,

just realised that the preprocessed data provided under https://github.com/quanpn90/NMTGMinor/tree/master/recipes/multilingual-translation are not correct.

In particular, the tst2017 test sets contain ~3k-4k lines while they should contain ~1.1k.
Furthermore, the references are not correct, e.g. the first ~2k lines of tst2017.en-de.bpe.de are in English rather than German.

My guess is that you accidentally mixed in tst2010.
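A quick way to check this kind of contamination is to measure how many lines of one test file also occur verbatim in another. The sketch below is only an illustration: the helper names and the file paths in the usage comment are hypothetical, and it assumes both files were preprocessed the same way (same BPE), so that identical sentences are identical strings.

```python
from pathlib import Path


def read_lines(path):
    """Return the lines of a UTF-8 text file without trailing newlines."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def shared_line_fraction(test_lines, other_lines):
    """Fraction of lines in test_lines that also occur verbatim in other_lines."""
    if not test_lines:
        return 0.0
    other = set(other_lines)
    return sum(line in other for line in test_lines) / len(test_lines)


# Hypothetical usage (the tst2010 file name is an assumption, not from the repo):
# tst2017 = read_lines("tst2017.en-de.bpe.de")
# tst2010 = read_lines("tst2010.en-de.bpe.de")
# print(len(tst2017), shared_line_fraction(tst2017, tst2010))
```

A line count far above ~1.1k together with a large shared fraction would support the guess that another test set was mixed in.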

You might want to fix that, otherwise people could assume you computed your BLEU scores on these incorrect test sets.

Cheers,
Stephan

quanpn90 · 2020-01-24T21:32:06Z

Thank you so much for reminding me. I will try to fix it asap, especially since I recently went back to this dataset for some fun things. Best regards, Quan

stephanpeitz · 2020-02-04T21:45:53Z

Hi Quan,

have you been able to fix it?

Cheers,
Stephan

quanpn90 · 2020-02-05T18:10:39Z

Hi Stephan, thank you for reminding me.

I think I made a terrible mistake with these test sets due to an error in preprocessing. Basically, the tst2017 test set was duplicated (twice), so the BLEU scores in the paper are possibly not correct.
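For anyone who wants to check a file for this kind of duplication, here is a minimal sketch. It assumes the duplication is a verbatim repeat of the whole file (one block concatenated with itself), which may not be exactly what happened here; the function name and the file path in the usage comment are hypothetical.

```python
def repetition_factor(lines):
    """Return the largest k such that lines is one block repeated k times (1 if none)."""
    n = len(lines)
    for k in range(n, 1, -1):
        if n % k == 0:
            block = n // k
            # Every line must match the corresponding line of the first block.
            if all(lines[i] == lines[i % block] for i in range(n)):
                return k
    return 1


# Hypothetical usage:
# with open("tst2017.en-de.bpe.de", encoding="utf-8") as f:
#     lines = f.read().splitlines()
# k = repetition_factor(lines)
# deduplicated = lines[: len(lines) // k]
```

If the factor comes back as 1 although the file is too long, the extra lines are not a plain repeat and the file needs to be rebuilt from the original test set instead.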

If you don't mind, I will run the translation again, since the models are still here, and put the correct results here.

Sorry for this mistake.
Quan
