How to implement weight decay towards the pre-trained model? #1942

sedol1339 · 2024-10-03T11:22:28Z

sedol1339
Oct 3, 2024

Hello, let me ask one question.

When using axolotl for supervised fine-tuning, how do I penalize the distance between the starting (pre-trained) weights and the current weights? This was shown to be effective in https://arxiv.org/abs/1706.03610

NanoCode012 · 2024-10-09T04:43:29Z

NanoCode012
Oct 9, 2024
Collaborator

Please see the weight_decay section of https://axolotl-ai-cloud.github.io/axolotl/docs/config.html

That value is passed to the Hugging Face Trainer: https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.weight_decay

3 replies

sedol1339 Oct 9, 2024
Author

It looks like this weight_decay pulls the weights towards zero, not towards the pre-trained model's weights.
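
To make the distinction concrete, the two regularizers can be written as loss penalties. This is a simplified sketch: the symbols $\lambda$ (penalty strength), $w$ (current weights) and $w_0$ (pre-trained weights) are notation introduced here, and optimizers such as AdamW apply weight decay as a decoupled update step rather than as a literal loss term, although the pull is still towards zero.

```latex
\Omega_{\text{zero}}(w) = \frac{\lambda}{2}\,\lVert w \rVert_2^2
\qquad\text{vs.}\qquad
\Omega_{\text{start}}(w) = \frac{\lambda}{2}\,\lVert w - w_0 \rVert_2^2
```

The gradients are $\lambda w$ and $\lambda (w - w_0)$ respectively, so the first shrinks every weight towards the origin while the second only penalizes drift away from the starting checkpoint.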

NanoCode012 Oct 10, 2024
Collaborator

Oh, I thought you were referring to the L2 weight decay from the paper.

If you would like to implement your own loss, you may want to override the loss function (in the forward pass) in the model you want to train.
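
As a rough illustration of what such a custom loss term could look like in plain PyTorch, here is a minimal sketch. The helper names `snapshot_weights` and `distance_penalty`, the toy `nn.Linear` model, and the value of `alpha` are all hypothetical and not part of axolotl or transformers.

```python
import torch
from torch import nn


def snapshot_weights(model: nn.Module) -> dict[str, torch.Tensor]:
    """Copy the trainable weights once, before training starts."""
    return {
        name: p.detach().clone()
        for name, p in model.named_parameters()
        if p.requires_grad
    }


def distance_penalty(model: nn.Module, start: dict[str, torch.Tensor]) -> torch.Tensor:
    """Sum of squared differences between current and starting weights."""
    terms = [
        (p - start[name]).pow(2).sum()
        for name, p in model.named_parameters()
        if name in start
    ]
    return torch.stack(terms).sum()


# Toy usage: a tiny model stands in for the pre-trained one (assumption).
model = nn.Linear(4, 2)
start = snapshot_weights(model)
alpha = 0.01  # hypothetical penalty strength

x, y = torch.randn(8, 4), torch.randn(8, 2)
task_loss = nn.functional.mse_loss(model(x), y)
loss = task_loss + 0.5 * alpha * distance_penalty(model, start)
loss.backward()
```

The snapshot keeps a second copy of every trainable parameter, so memory for those parameters roughly doubles. This sketch also assumes an unsharded, single-device model; sharded setups would need different handling.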

Otherwise, you may want to consider an RL method like DPO, which uses the base model as a reference when computing the reward.
https://huggingface.co/docs/trl/dpo_trainer

If you're interested in DPO, axolotl supports it as well; see the docs.

NanoCode012 Oct 31, 2024
Collaborator

Adding to the above, transformers has now updated its Trainer to accept a custom loss function, so you could just pass that instead of overriding the forward pass! huggingface/transformers#34198
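
A minimal sketch of a loss callable that could be passed this way, assuming the Trainer argument added by that change is `compute_loss_func` and that it is called as `(outputs, labels, num_items_in_batch=...)` with `outputs.logits` available. The factory name `make_loss_func` and the `alpha` parameter are hypothetical, and the thread does not cover how to wire this into an axolotl config.

```python
import torch
import torch.nn.functional as F
from torch import nn


def make_loss_func(model: nn.Module, alpha: float):
    """Causal-LM cross-entropy plus a pull towards the starting weights."""
    # Snapshot taken once, when the factory is called (right after loading).
    start = {
        n: p.detach().clone()
        for n, p in model.named_parameters()
        if p.requires_grad
    }

    def loss_func(outputs, labels, num_items_in_batch=None):
        # Next-token prediction: position t predicts token t + 1.
        logits = outputs.logits[..., :-1, :].float()
        targets = labels[..., 1:]
        flat_logits = logits.reshape(-1, logits.size(-1))
        flat_targets = targets.reshape(-1)
        if num_items_in_batch is None:
            ce = F.cross_entropy(flat_logits, flat_targets, ignore_index=-100)
        else:
            # Normalize by the token count of the whole accumulated batch.
            ce = F.cross_entropy(
                flat_logits, flat_targets, ignore_index=-100, reduction="sum"
            ) / num_items_in_batch
        penalty = sum(
            (p - start[n]).pow(2).sum()
            for n, p in model.named_parameters()
            if n in start
        )
        return ce + 0.5 * alpha * penalty

    return loss_func
```

Under those assumptions it would be used as `Trainer(model=model, ..., compute_loss_func=make_loss_func(model, alpha=0.01))`. Because the labels are handled by the custom function, the cross-entropy has to be computed here rather than read from `outputs.loss`. Parameter names and shapes may differ under FSDP or DeepSpeed sharding, so treat this as a single-device starting point.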

If it does help solve it for you, feel free to close this discussion.
