163M pretrain

Pretrained models trained on the 163M-sentence data presented in "Various Errors Improve Neural Grammatical Error Correction".

  • bpe.model / bpe.vocab

    • SentencePiece model and vocabulary files (sentencepiece 0.1.95)
  • {1..5}.pt (1.pt, 2.pt, 3.pt, 4.pt, 5.pt)

    • pretrained fairseq (0.10.2) model checkpoints
    • each trained on 163M sentences for 20 epochs
  • data-bin/dict.{src,trg}.txt

    • vocabulary files for the fairseq models
  • notes

    • Normalize inputs with reguligilo (https://github.com/nymwa/reguligilo) before applying BPE.
    • Use the -a option, i.e. run reguligilo -a.
    • Full pipeline: input -> reguligilo (normalization) -> BPE -> encoder-decoder -> remove BPE -> malreguligilo (denormalization) -> output. A sketch of this pipeline is given after the score table below.
  • scores (beam = 12, lenpen = 0.6)

    model      BEA19   CoNLL-14   JFLEG
    single 0   59.62   55.05      58.09
    single 1   59.71   55.29      58.01
    single 2   60.59   56.22      58.09
    single 3   59.76   55.35      58.56
    single 4   59.82   55.17      58.39
    average    59.90   55.42      58.23
    ensemble   60.89   55.73      58.37
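
The pipeline from the notes can be wired together end to end. Below is a minimal Python sketch, not a script shipped with this repository: it assumes reguligilo -a and malreguligilo behave as stdin-to-stdout command-line filters (their exact interface is documented in the reguligilo repository, not here), uses the sentencepiece Python API for the BPE step, and drives fairseq-interactive with the beam and length-penalty settings from the score table. Joining the five checkpoint paths with ":" is how fairseq forms the ensemble row of the table.

```python
# Sketch of: input -> reguligilo -> BPE -> encoder-decoder
#            -> remove BPE -> malreguligilo -> output
# Assumptions: reguligilo/malreguligilo act as stdin -> stdout filters,
# and fairseq-interactive (fairseq 0.10.2) is on PATH.
import subprocess
import sentencepiece as spm

def pipe(cmd, lines):
    """Run a stdin -> stdout command-line filter over a list of sentences."""
    out = subprocess.run(cmd, input='\n'.join(lines) + '\n',
                         capture_output=True, text=True, check=True)
    return out.stdout.splitlines()

# 1. Normalize with reguligilo, using the -a option as the notes require.
sources = pipe(['reguligilo', '-a'], ['He go to school yesterday .'])

# 2. Apply BPE with the distributed sentencepiece (0.1.95) model.
sp = spm.SentencePieceProcessor(model_file='bpe.model')
bpe_lines = [' '.join(sp.encode(s, out_type=str)) for s in sources]

# 3. Decode with fairseq; ':'-joined checkpoints form the 5-model ensemble,
#    beam=12 / lenpen=0.6 match the score table, and --remove-bpe
#    sentencepiece undoes the BPE segmentation in the D- output lines.
gen = pipe(['fairseq-interactive', 'data-bin',
            '--path', '1.pt:2.pt:3.pt:4.pt:5.pt',
            '--source-lang', 'src', '--target-lang', 'trg',
            '--beam', '12', '--lenpen', '0.6',
            '--remove-bpe', 'sentencepiece'],
           bpe_lines)
hyps = [line.split('\t')[2] for line in gen if line.startswith('D-')]

# 4. Denormalize with malreguligilo to obtain the corrected sentences.
for corrected in pipe(['malreguligilo'], hyps):
    print(corrected)
```

For the single-model rows of the table, pass one checkpoint (e.g. --path 1.pt) instead of the colon-joined list.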
