ictnlp/BoN-NAT

Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation

PyTorch implementation of the models described in the paper Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation.

Dependencies

Python

  • Python 3.6
  • PyTorch >= 0.4
  • Numpy
  • NLTK
  • torchtext 0.2.1
  • torchvision
  • revtok
  • multiset
  • ipdb
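
Before launching a multi-day training run, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository: the `REQUIRED` list and the `missing_modules` helper are hypothetical names, and the import names are assumed from the dependency list (`torch` being the import name of PyTorch). It does not check versions.

```python
import importlib.util

# Import names assumed from the dependency list; adjust if your setup differs.
REQUIRED = ["torch", "numpy", "nltk", "torchtext", "torchvision",
            "revtok", "multiset", "ipdb"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Anything reported as missing would need to be installed before running the scripts below.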

Related code

Downloading Datasets

The original translation corpora can be downloaded from (IWSLT'16 En-De, WMT'16 En-Ro, WMT'14 En-De). We recommend downloading the preprocessed corpora released in dl4mt-nonauto. Set the correct data path in the data_path() function in data.py before running the code.

BoN-Joint

Combine the BoN objective and the cross-entropy loss to train NAT from scratch. This process usually takes about 5 days.

$ sh joint_wmt.sh
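
To give a feel for what a bag-of-n-grams difference measures, here is a minimal sketch for two already-decoded token lists. It is an illustration only and is not the repository's training code: as far as we understand the paper, the actual objective compares the reference against the model's expected n-gram counts so that it stays differentiable, whereas this sketch handles discrete sentences only. The names `bag_of_ngrams`, `bon_l1_distance`, and `joint_loss`, as well as the interpolation weight `alpha`, are hypothetical.

```python
from collections import Counter


def bag_of_ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bon_l1_distance(hyp, ref, n=2):
    """L1 distance between the n-gram count vectors of two token lists."""
    hyp_bag = bag_of_ngrams(hyp, n)
    ref_bag = bag_of_ngrams(ref, n)
    # A Counter returns 0 for n-grams it has not seen, so the union covers both sides.
    return sum(abs(hyp_bag[g] - ref_bag[g]) for g in hyp_bag.keys() | ref_bag.keys())


def joint_loss(ce_loss, bon_loss, alpha):
    """Assumed linear interpolation of the two losses; alpha is a hypothetical weight."""
    return alpha * ce_loss + (1.0 - alpha) * bon_loss


hyp = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(bon_l1_distance(hyp, ref, n=2))  # 4: two bigrams unique to each sentence
```

Because the distance is computed over counts rather than positions, it penalizes missing or repeated n-grams anywhere in the sentence, which the token-level cross-entropy loss does not capture directly.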

Take a checkpoint and train the length prediction model. This process usually takes about 1 day.

$ sh tune_wmt.sh

Decode the test set. This process usually takes about 20 seconds.

$ sh decode_wmt.sh

BoN-FT

First, train a NAT model using the cross-entropy loss. This process usually takes about 5 days.

$ sh mle_wmt.sh

Then, take a pre-trained checkpoint and finetune the NAT model using the BoN objective. This process usually takes about 3 hours.

$ sh bontune_wmt.sh

Take a finetuned checkpoint and train the length prediction model. This process usually takes about 1 day.

$ sh tune_wmt.sh

Decode the test set. This process usually takes about 20 seconds.

$ sh decode_wmt.sh

Reinforce-NAT

We also implement Reinforce-NAT (lines 1294-1390), described in the paper Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation. See RSI-NAT for usage.

Citation

If you find the resources in this repository useful, please consider citing:

@article{Shao:19,
  author    = {Chenze Shao and Jinchao Zhang and Yang Feng and Fandong Meng and Jie Zhou},
  title     = {Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation},
  year      = {2019},
  journal   = {arXiv preprint arXiv:1911.09320},
}


License

BSD-3-Clause (see LICENSE and LICENSE_nyu).
