Uses the NLLB model released by Meta to translate the Alpaca dataset.
Make sure you have a working installation of PyTorch.
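A quick way to confirm that PyTorch is importable before going further is a short check like the one below. This is only a sketch: `pytorch_status` is a hypothetical helper name, not something shipped with this repo. It reports whether `torch` can be found, which version is installed, and whether a CUDA device is visible.

```python
import importlib.util


def pytorch_status():
    """Report whether PyTorch is importable, its version, and CUDA availability."""
    # find_spec returns None when the package is not installed,
    # so the check does not crash on a machine without PyTorch.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "cuda": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda": torch.cuda.is_available(),
    }


print(pytorch_status())
```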
-
git clone https://github.com/KhmerAILab/nllb-alpaca-dataset-translation
-
cd nllb-alpaca-dataset-translation
-
pip install huggingface_hub transformers pandas numpy tqdm accelerate bitsandbytes
-
Set the dataset you want to translate at line 15 of index.py.
-
Run python3 index.py
-
Profit!
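For a rough idea of what translating an Alpaca record with NLLB can look like, here is a minimal sketch using the transformers API. It is not the code in index.py. The function names (`translate_record`, `make_nllb_translator`), the checkpoint `facebook/nllb-200-distilled-600M`, and the target language code `khm_Khmr` are all assumptions chosen for illustration, and the model is loaded in full precision rather than through bitsandbytes. The record fields `instruction`, `input`, and `output` follow the standard Alpaca format.

```python
from typing import Callable, Dict, List

# Field names of a standard Alpaca record.
ALPACA_FIELDS = ("instruction", "input", "output")


def translate_record(record: Dict[str, str],
                     translate: Callable[[List[str]], List[str]]) -> Dict[str, str]:
    """Return a copy of `record` with its non-empty Alpaca fields translated."""
    # Skip missing or empty fields (Alpaca's "input" is often an empty string).
    keys = [k for k in ALPACA_FIELDS if record.get(k)]
    translated = translate([record[k] for k in keys]) if keys else []
    out = dict(record)
    out.update(zip(keys, translated))
    return out


def make_nllb_translator(model_name: str = "facebook/nllb-200-distilled-600M",  # assumed checkpoint
                         src_lang: str = "eng_Latn",
                         tgt_lang: str = "khm_Khmr",  # assumed target language
                         max_length: int = 512) -> Callable[[List[str]], List[str]]:
    """Build a batch translation function backed by an NLLB checkpoint."""
    # Imported lazily so translate_record can be used without loading a model.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).eval()
    # NLLB selects the output language by forcing its code as the first token.
    tgt_id = tokenizer.convert_tokens_to_ids(tgt_lang)

    def translate(texts: List[str]) -> List[str]:
        batch = tokenizer(texts, return_tensors="pt", padding=True,
                          truncation=True, max_length=max_length).to(device)
        with torch.no_grad():
            generated = model.generate(**batch, forced_bos_token_id=tgt_id,
                                       max_length=max_length)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    return translate
```

Calling `translate_record(record, make_nllb_translator())` downloads the checkpoint on first use and returns the record with its text fields translated. Keeping the model behind a plain `translate` callable also makes it easy to swap in a stub when testing the record handling.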