Why is it recommended to set load_in_8bit: true for LoRA fine-tuning?
#1611
-
I started experimenting with LoRA fine-tuning. I have enough memory, yet the model always gets worse after LoRA fine-tuning, so I am wondering why this is the case. I saw a warning in the logs that it is recommended to quantize to 8-bit. Why is this recommended? Shouldn't the model lose performance through quantisation?
-
Hey, this reply is a bit late, but I hope I can clarify this for future readers. The reason is that with LoRA fine-tuning you are training a separate, small set of adapter weights while the base model stays frozen. Loading that frozen base model in 8-bit is recommended to save VRAM, which lets you use a larger batch size and train faster; the quantization only affects the frozen base weights, not the adapter weights you are actually training.
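As a rough illustration of the idea, independent of whatever training framework the warning came from, here is a minimal Hugging Face transformers/peft sketch; the model id, target modules, and LoRA hyperparameters are placeholders, not values from this thread:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the frozen base model in 8-bit to cut VRAM usage.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",           # placeholder model id
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# The LoRA adapters are a small extra set of weights trained in higher precision.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # placeholder target modules
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()        # only the adapter weights are trainable
```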
As for the model getting worse after fine-tuning: could this be an issue with the dataset?
Once merged, your model will be output as fp16.
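For reference, merging the trained adapter back into an fp16 copy of the base model can be sketched with peft's merge_and_unload; the model id and paths below are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload the base model in fp16 and fold the LoRA adapter into its weights.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",           # placeholder model id
    torch_dtype=torch.float16,
)
merged = PeftModel.from_pretrained(base_model, "path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("merged-fp16-model")
```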