Trainer._load_from_checkpoint - support loading multiple Peft adapters #30505
Conversation
Nice @claralp, would you mind also adding some tests?
@kashif will try. I guess I can use a similar model setup as in test_trainer.py#L933. Another question: how is checkpoint loading with DeepSpeed supposed to work with this (trainer.py#L1847)?
Thanks for adding support for loading multiple adapters with load_from_checkpoint!
Very clean implementation @claralp!
> Guess I can use a similar model setup as here test_trainer.py#L933
Yeah, this makes sense, and I think you should be able to adapt the logic used for full training in `def test_can_resume_training(self)` (transformers/tests/trainer/test_trainer.py, line 1787 at 1e05671).
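For reference, a rough sketch of what such a multi-adapter resume test could look like (this is not the test added in the PR; the tiny model id, LoRA settings, and dummy dataset below are placeholder assumptions):

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

MODEL_ID = "hf-internal-testing/tiny-random-LlamaForCausalLM"  # placeholder tiny model
LORA = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")

def build_model():
    # A PEFT model holding two named LoRA adapters, mirroring the multi-adapter setup
    base = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model = get_peft_model(base, LORA, adapter_name="adapter_1")
    model.add_adapter("adapter_2", LORA)
    return model

# Tiny dummy dataset just to make the Trainer step; real tests use proper fixtures
data = Dataset.from_dict({"input_ids": [[1, 2, 3, 4]] * 16, "labels": [[1, 2, 3, 4]] * 16})

def make_args(max_steps, output_dir="/tmp/multi_adapter_ckpt"):
    return TrainingArguments(
        output_dir=output_dir,
        max_steps=max_steps,
        save_steps=2,
        per_device_train_batch_size=2,
        report_to=[],
    )

# Train a couple of steps so a checkpoint containing both adapters gets written
Trainer(model=build_model(), args=make_args(max_steps=2), train_dataset=data).train()

# Resume in a fresh model: with this PR, _load_from_checkpoint should restore
# adapter_1 and adapter_2 instead of only the active adapter
resumed = Trainer(model=build_model(), args=make_args(max_steps=4), train_dataset=data)
resumed.train(resume_from_checkpoint=True)
```

The adapter weights before and after resuming could then be compared to assert that both adapters were actually restored.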
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
(force-pushed from 15429ad to fe7044e)
@kashif @lewtun please check if this works now for you. Note: discovered another bug while implementing this.
(force-pushed from fe7044e to 1d6d481)
OK @claralp, let's link the PEFT issue here too then?
Thanks again! Seconded what @kashif said! If you could write a small reproducer for the bug and file it on PEFT, that would be really great 🙏
@kashif, @younesbelkada the issue cannot be reproduced anymore. The background:
@kashif @younesbelkada @lewtun is there any open point for me to fix now, or is this ready?
On my end this looks great, just waiting for a final review cc @ArthurZucker @LysandreJik
This looks good to me! cc @muellerzr if you're fine with the changes in Trainer, feel free to merge.
LG2M, thanks for also adding a test!
What does this PR do?
Since it is possible to have multiple Peft adapters in the same model, it should also be possible to resume training of such models from a checkpoint with transformers.Trainer.train(resume_from_checkpoint=True|"path").
No documentation changes because this fixes something that already exists; tested with DPO/KTO.
Fixes #30478
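To illustrate the idea behind the change (this is a sketch, not the exact Trainer code): a checkpoint written from a multi-adapter PEFT model contains one subfolder per named adapter, and each of them needs to be reloaded on resume rather than only the active one. Roughly:

```python
import os
from peft import PeftModel

def load_all_adapters(model: PeftModel, checkpoint_dir: str, active_adapter: str = "default") -> None:
    """Reload every adapter found in a checkpoint directory into `model`.

    Sketch only: assumes non-default adapters were saved by PeftModel.save_pretrained
    into subfolders named after the adapter, each containing an adapter_config.json.
    """
    for entry in os.listdir(checkpoint_dir):
        subdir = os.path.join(checkpoint_dir, entry)
        if os.path.isdir(subdir) and os.path.isfile(os.path.join(subdir, "adapter_config.json")):
            # load_adapter registers the weights under the subfolder's name
            model.load_adapter(subdir, adapter_name=entry)
    # restore the adapter that was active before the interruption
    model.set_adapter(active_adapter)
```

The actual change lives in Trainer._load_from_checkpoint; the helper above only sketches the multi-adapter loading loop.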
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@lewtun @kashif @younesbelkada