Mowgli is a lightweight neural machine translation framework based on the Transformer network.
- Python version >= 3.8
- To install mowgli and develop locally:
# download code
git clone git@github.com:ltl-uva/mowgli.git
cd mowgli/
# create a virtual environment, for example using conda
conda create --name mowgli python==3.8
conda activate mowgli
# install
pip install --editable ./
Configuration is done through YAML files. See the configuration folder for a list of all options.
- Before training, data needs to be pre-processed (e.g. using Moses) and a vocabulary needs to be created. See `scripts/build_vocab.py` for details on vocabulary creation.
- Training is done by pointing to a YAML file: `python -m mowgli train configs/${YOUR_CONFIG}.yaml`
- Inference is done by pointing to a YAML file: `python -m mowgli test configs/${YOUR_CONFIG}.yaml`
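
To give a rough idea of what the vocabulary step in the list above involves, here is a minimal sketch of frequency-based vocabulary construction over already tokenized text. It is not the implementation in scripts/build_vocab.py: the function name `build_vocab`, the special tokens, and the `max_size` and `min_freq` parameters are all assumptions made for illustration, and the actual script may use different tokens, options, and output format.

```python
from collections import Counter
from typing import Iterable, List

# Hypothetical special tokens; the ones Mowgli actually uses may differ.
SPECIALS = ["<pad>", "<unk>", "<s>", "</s>"]


def build_vocab(lines: Iterable[str], max_size: int = 32000, min_freq: int = 1) -> List[str]:
    """Return special tokens followed by the most frequent whitespace-separated tokens."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    # Sort by descending frequency, then alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    tokens = [tok for tok, freq in ranked if freq >= min_freq]
    # Reserve room for the special tokens within the size budget.
    return SPECIALS + tokens[: max(0, max_size - len(SPECIALS))]


# Toy corpus standing in for pre-processed (tokenized) training data.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
vocab = build_vocab(corpus, max_size=8)
```

In this sketch, tokens that fall outside the size budget or below `min_freq` are simply left out of the list; a real pipeline would typically map them to an unknown-word token at lookup time.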
Mowgli is developed by David Stap (University of Amsterdam).
We take inspiration from other sequence-to-sequence frameworks such as Tensor2Tensor, fairseq, OpenNMT and JoeyNMT.
If you use mowgli, please cite the following paper:
@inproceedings{stap-etal-2023-viewing,
title = "Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens",
author = "Stap, David and
Niculae, Vlad and
Monz, Christof",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.findings-emnlp.998",
pages = "14973--14987",
}