Efficient LLM

Introduction

Training and evaluation scripts for Large Language Models (LLMs) in an efficient way. You can train models with billions of parameters with few resources.

Installation

Please install conda (miniconda recommended) as the environment manager.

Once conda is installed, you can create and activate the environment using the following commands.

conda env create -f efficient-llm.yml
conda activate efficient-llm

Trained models

Model checkpoints can be found in this repository: https://github.com/AntoineBlanot/zero-nlp

Reported scores are accuracy on the validation set.

| Model name | Size | MNLI (m) | MNLI (mm) | SNLI | SciTail |
|---|---|---|---|---|---|
| 3way-nli-mixture | 5B | 0.923 | 0.922 | 0.942 | 0.966 |
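The scores above are plain classification accuracy on each validation set. As a minimal sketch of what that metric computes (the predictions and labels here are illustrative, not taken from an actual evaluation run):

```python
# Minimal sketch of validation accuracy: the fraction of predictions
# that match the gold labels. Values below are illustrative only.
def accuracy(predictions, labels):
    """Fraction of predictions that equal the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)

# 3-way NLI label convention assumed here:
# 0 = entailment, 1 = neutral, 2 = contradiction
preds = [0, 1, 2, 0, 2]
golds = [0, 1, 2, 1, 2]
print(accuracy(preds, golds))  # 4 of 5 correct -> 0.8
```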

1. Training

To train your model efficiently, you can run a simple command and pass arguments through it, such as the config file, model path, and data path. You can also pass different hyperparameters such as the batch size, learning rate, weight decay, and many others.

You can check all the arguments supported right now in the args.py file.
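The authoritative argument list is in args.py; as a rough illustration only, the flags used in the example scripts could be parsed like this (flag names match the scripts, but the defaults and help strings here are assumptions, not the repository's actual values):

```python
import argparse

# Illustrative sketch of the kind of parser args.py might define.
# Only flags that appear in the example scripts are shown; the
# defaults are assumptions, not the repository's actual values.
def build_parser():
    parser = argparse.ArgumentParser(description="Efficient LLM training")
    parser.add_argument("--pretrained_model_name_or_path", type=str, required=True)
    parser.add_argument("--path", type=str, required=True, help="data path")
    parser.add_argument("--bs", type=int, default=32, help="batch size")
    parser.add_argument("--seq_length", type=int, default=128)
    parser.add_argument("--output_dir", type=str, default="exp/run")
    parser.add_argument("--lr", type=float, default=1e-4)
    parser.add_argument("--wd", type=float, default=0.0, help="weight decay")
    parser.add_argument("--max_steps", type=int, default=30178)
    parser.add_argument("--warmup_steps", type=int, default=3018)
    return parser

# Parse a sample command line to show how overrides work.
args = build_parser().parse_args(
    ["--pretrained_model_name_or_path", "model", "--path", "data", "--lr", "3e-5"]
)
print(args.lr)   # 3e-05 (overridden)
print(args.bs)   # 32 (default kept)
```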

Here is what a training script looks like:

CONFIG="config/single_gpu.yaml"
MODEL_PATH="AntoineBlanot/flan-t5-xxl-classif-3way"
DATA_PATH=...

RUN_NAME="3way-nli-mixture"

accelerate launch --config_file $CONFIG train.py \
  --pretrained_model_name_or_path $MODEL_PATH \
  --path $DATA_PATH --bs 32 --seq_length 128 \
  --output_dir "exp/$RUN_NAME" --lr 1e-4 --wd 0 \
  --max_steps 30178 --warmup_steps 3018 --log_steps 500 --eval_steps 5000 --save_steps 5000 \
  --wandb_project "efficient-llm" --wandb_name $RUN_NAME
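The config/single_gpu.yaml file passed to accelerate launch is a Hugging Face Accelerate config. A minimal single-GPU example (an illustration of the format, not the repository's actual file) might look like:

```yaml
# Hypothetical Accelerate config for a single-GPU run; the actual
# config/single_gpu.yaml in this repository may differ.
compute_environment: LOCAL_MACHINE
distributed_type: 'NO'
mixed_precision: bf16
num_processes: 1
use_cpu: false
```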

2. Evaluation

To evaluate your model, you run a similar command. The model path should point to the checkpoint you want to evaluate.

Here is what an evaluation script looks like:

CONFIG="config/single_gpu.yaml"
MODEL_PATH=".../checkpoint-..."
DATA_PATH=...

accelerate launch --config_file $CONFIG eval.py \
  --pretrained_model_name_or_path $MODEL_PATH \
  --path $DATA_PATH --bs 32 --seq_length 128

3. Inference

For zero-shot inference, please refer to another of my repositories: https://github.com/AntoineBlanot/zero-nlp

Simple inference is not implemented yet.
