llm-finetuning-2

Finetuning Mistral-7B using LoRA and DeepSpeed

In this demo, we finetune Mistral-7B using LoRA and DeepSpeed. We ran LoRA on two 80 GB A100 GPUs, and DeepSpeed on two, four, and eight 80 GB A100 GPUs.

To get started, first install Determined on your local machine:

pip install determined
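
The det CLI submits experiments to a Determined cluster, so it needs to know where your master is running. As a hedged example (assuming a master is already up and reachable on the default port 8080; the hostname below is a placeholder):

export DET_MASTER=http://<master-hostname>:8080  # point the CLI at your Determined master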

Then finetune with LoRA:

det e create lora.yaml . 

Or finetune with DeepSpeed:

det e create deepspeed.yaml . 

You can view the actual training code in finetune.py.

Configuration

Change configuration options in lora.yaml or deepspeed.yaml. Some important options are:

  • slots_per_trial: the number of GPUs to use.
  • dataset_subset: the difficulty subset to train on.
  • per_device_train_batch_size: the batch size per GPU.

The results in our blog post were obtained using per_device_train_batch_size: 1 and per_device_eval_batch_size: 4.

DeepSpeed configuration files are in the ds_configs folder.
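
For orientation, here is a minimal sketch of how these options are typically laid out in a Determined experiment config. The exact nesting in lora.yaml and deepspeed.yaml may differ, so treat this as illustrative rather than verbatim:

resources:
  slots_per_trial: 2              # number of GPUs to use
hyperparameters:
  dataset_subset: easy            # one of "easy", "medium", or "hard"
  per_device_train_batch_size: 1  # batch size per GPU
  per_device_eval_batch_size: 4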

Testing

Test your model's generation capabilities:

python inference.py --exp_id <exp_id> --dataset_subset <dataset_subset>

Where:

  • <exp_id> is the ID of your finetuning experiment in the Determined UI.
  • <dataset_subset> is one of "easy", "medium", or "hard".

If you're testing a LoRA model, then add --lora to the above command.

To use CPU instead of GPU, add --device cpu.
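
Putting these flags together, evaluating a LoRA checkpoint on CPU looks like:

python inference.py --exp_id <exp_id> --dataset_subset medium --lora --device cpu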

To test the pretrained model (not finetuned), leave out --exp_id. For example:

python inference.py --dataset_subset easy

Validating the tokenizer

Plot the distribution of dataset sample lengths, and see how many samples will be truncated by the tokenizer:

python validate_tokenizer.py
