[PEFT] Fix PEFT multi adapters support #26407
Conversation
Thanks for the quick fix. We will extend the PEFT CI to catch integration issues earlier in the future.
Just a small correction: multiple active adapters were enabled in huggingface/peft#873; huggingface/peft#905 was for correctly handling `requires_grad`.
@@ -357,6 +373,15 @@ def get_adapter_state_dict(self, adapter_name: Optional[str] = None) -> dict:
        if adapter_name is None:
            adapter_name = self.active_adapter()

        if isinstance(adapter_name, list):
Is it possible for this method to be called with a list of str? Wouldn't this require the user to explicitly pass that argument? If not, I think this check is not necessary. If it is a valid argument, then the type annotation should also be adjusted.
Good point, I don't think we should allow users to call this method with a list of str. I refactored the logic of `active_adapter` a bit to extend it for multi-adapter inference - let me know what you think.
The documentation is not available anymore as the PR was closed or merged.
Thank you for the quick fix, LGTM!
src/transformers/modeling_utils.py (Outdated)

        active_adapter = self.active_adapter()

        if isinstance(active_adapter, list):
            if len(active_adapter) > 1:
                logger.warning(
                    "Multiple active adapters detected, will only consider the first active adapter. In order to save them all, please iteratively call `set_adapter()` on each"
                    " adapter name and save them one by one manually. "
                )
            active_adapter = active_adapter[0]

        current_peft_config = self.peft_config[active_adapter]
Should multi-adapter serialization be supported out of the box?
I would advocate educating users to manually save them one by one, as supporting this might require some important refactoring in `save_pretrained`.
We could also expose a new method in `PeftAdapterMixin`, `save_adapters`, to support that by iteratively calling `save_pretrained` on all adapters.
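A rough sketch of what such a `save_adapters` helper could look like. The class and method bodies below are toy stand-ins, not the actual transformers implementation; the real `save_pretrained` would serialize the active adapter's weights rather than record a tuple:

```python
import os


class PeftAdapterMixinSketch:
    """Toy stand-in for the real mixin; `save_adapters` is only a proposal sketch."""

    def __init__(self, adapter_names):
        # In the real mixin, `peft_config` maps adapter names to their PeftConfig.
        self.peft_config = {name: None for name in adapter_names}
        self.saved = []

    def set_adapter(self, adapter_name):
        self.active = adapter_name

    def save_pretrained(self, save_directory):
        # The real method would serialize the currently active adapter here.
        self.saved.append((self.active, save_directory))

    def save_adapters(self, save_directory):
        # Proposed helper: iterate over all loaded adapters and save each one
        # to its own subfolder by temporarily activating it.
        for name in self.peft_config:
            self.set_adapter(name)
            self.save_pretrained(os.path.join(save_directory, name))


model = PeftAdapterMixinSketch(["default", "adapter-2"])
model.save_adapters("out")
print(model.saved)
```

This mirrors the "iterate and save one by one" workflow users would otherwise do by hand.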
Ok understood, let's go with what you have right now then
Thanks for your implementation!
In the case of multi-adapters, is a single adapter usually considered the "main" adapter, or do all adapters have the same impact? If the latter, I would advocate switching completely to an API supporting multi-adapter rather than having a mix between both dictated by kwargs.
I would likely set all methods to plural rather than singular, as all methods correctly handle multi adapters.
        return_multi_adapters (`bool`, *optional*, defaults to `True`):
            Whether to return a list of active adapters or not. If `False`, only the first adapter is returned. If
            `True`, returns the list of active adapters.
Are we sure about this API? It would seem better to me to have `active_adapter` changed to `active_adapters` and to always default to returning the list of active adapters.
BC would be handled by having another method `active_adapter` that would log a warning and call `active_adapters` under the hood.
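The BC shim described here could be sketched as follows. The class is an illustrative toy, not the actual transformers code, and the exact warning category and message are assumptions:

```python
import warnings


class ModelSketch:
    """Toy model illustrating the proposed BC shim; method names mirror the discussion."""

    def __init__(self, active):
        self._active = list(active)

    def active_adapters(self):
        # New plural method: always returns the list of active adapters.
        return list(self._active)

    def active_adapter(self):
        # Old singular method kept for backward compatibility:
        # log a deprecation warning, then delegate to the plural method.
        warnings.warn(
            "`active_adapter` is deprecated; use `active_adapters` instead.",
            FutureWarning,
        )
        return self.active_adapters()[0]


model = ModelSketch(["adapter-2", "default"])
print(model.active_adapters())
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = model.active_adapter()
print(first, len(caught))
```

Old callers keep working and get nudged toward the new method, while new code only ever deals with a list.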
        self.assertTrue(model.active_adapter() == ["default"])
        self.assertTrue(model.active_adapter(return_multi_adapters=False) == "default")
This is a breaking change; are we ok with that breaking change?
We can set the default `return_multi_adapters` to `True`, but there will still be the case of users who used the main branch of PEFT for multiple active adapters: they will face different behaviour depending on which of the 2 transformers versions they use - cc @BenjaminBossan, what do you think?
So I think the most common case would be users with only a single active adapter, as multiple active adapters just landed on PEFT main and have not been advertised yet. Therefore, I would suggest keeping BC for those users and accepting that the tiny minority who use multiple active adapters with the current transformers version will get a breaking change when they upgrade transformers.
LGTM, thanks for working on these fixes so quickly. As I'm not a transformers expert, I won't approve/request changes. Still, I have a few comments, please take a look.
""" | ||
check_peft_version(min_version=MIN_PEFT_VERSION) | ||
if not self._hf_peft_config_loaded: | ||
raise ValueError("No adapter loaded. Please load an adapter first.") | ||
elif isinstance(adapter_name, list): | ||
for adapter in adapter_name: |
In situations like this, I always prefer to identify all incorrect entries, not just the first. Otherwise, the user fixes the first, then gets an error on 2nd, fixes the error on 2nd, gets third, etc. Much nicer to check all and runtime shouldn't really matter here.
    missing = set(adapter_name) - set(self.peft_config)
    if missing:
        raise ValueError(
            f"Following adapter(s) could not be found: {', '.join(missing)}. Please ...")
With a bit of extra work, this could even be merged with the code below which checks for a single adapter not being found.
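The "report all missing entries at once" pattern could be sketched like this as a standalone helper. The function name and error wording are illustrative, not the actual transformers API:

```python
def check_adapters_exist(requested, loaded):
    """Report every unknown adapter name at once instead of failing on the first."""
    missing = set(requested) - set(loaded)
    if missing:
        raise ValueError(
            f"Following adapter(s) could not be found: {', '.join(sorted(missing))}. "
            f"Currently loaded adapters are: {', '.join(sorted(loaded))}."
        )


loaded = {"default": None, "adapter-2": None}
check_adapters_exist(["default", "adapter-2"], loaded)  # all names known: no error
try:
    check_adapters_exist(["default", "typo-1", "typo-2"], loaded)
    msg = ""
except ValueError as err:
    msg = str(err)
print(msg)
```

A single call surfaces both bad names, so the user fixes them in one pass instead of one error at a time.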
@@ -265,17 +265,31 @@ def test_peft_add_multi_adapter(self):
        _ = model.generate(input_ids=dummy_input)

        model.set_adapter("default")
-        self.assertTrue(model.active_adapter() == "default")
+        self.assertTrue(model.active_adapter() == ["default"])
Hmm, so the change is backwards incompatible for this specific case? Can we live with that?
        # Logits comparison
        self.assertFalse(
            torch.allclose(logits_adapter_1.logits, logits_adapter_2.logits, atol=1e-6, rtol=1e-6)
        )
        self.assertFalse(torch.allclose(logits_original_model, logits_adapter_2.logits, atol=1e-6, rtol=1e-6))

        model.set_adapter(["adapter-2", "default"])
        self.assertTrue(model.active_adapter() == ["adapter-2", "default"])
Nice addition to the test. Can we also check `model.active_adapter()` and `model.active_adapter(return_multi_adapters=False)` here?
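The two assertions being requested could be sketched against a stub like this. The `return_multi_adapters` kwarg reflects the API as proposed at this point of the review, and the stub is a toy, not a real transformers model:

```python
import unittest


class StubModel:
    """Stub standing in for a transformers model with multiple active adapters."""

    def __init__(self, active):
        self._active = list(active)

    def active_adapter(self, return_multi_adapters=True):
        # Mirrors the proposed API: the full list by default,
        # only the first active adapter otherwise.
        if return_multi_adapters:
            return list(self._active)
        return self._active[0]


class TestActiveAdapter(unittest.TestCase):
    def test_both_forms(self):
        model = StubModel(["adapter-2", "default"])
        self.assertEqual(model.active_adapter(), ["adapter-2", "default"])
        self.assertEqual(model.active_adapter(return_multi_adapters=False), "adapter-2")


result = unittest.main(exit=False, argv=["stub"]).result
print(result.wasSuccessful())
```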
Co-authored-by: Benjamin Bossan <[email protected]>
        active_adapters = self.active_adapters()

        if isinstance(active_adapters, list):
Note this check is necessary as for previous PEFT versions `active_adapter` is a str.

Thanks for all the reviews, I think I have addressed most of them now. I propose to handle multi-adapter saving in a follow-up PR #26411 to be consistent with the PEFT API.
Thanks, this LGTM now.
I noticed a small potential issue: if the next transformers version is released in 10 days or so, users will see that they can now have multiple active PEFT adapters -- however, this only works with PEFT main, or v0.6 if it is released by then. Could this be an issue?
Thanks! For that we could maybe sync the next PEFT release before / at the same time as the transformers release.
Thank you @younesbelkada for quick fixes, LGTM!
        # For previous PEFT versions
        if isinstance(active_adapters, str):
            active_adapters = [active_adapters]

        return active_adapters
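The version-compatibility normalization being discussed can be sketched as a tiny standalone helper. The helper name is illustrative, not transformers API:

```python
def as_adapter_list(active_adapter):
    """Normalize the value across PEFT versions.

    Older PEFT versions return a plain str from `active_adapter`;
    newer ones return a list of str. Downstream code can then
    always iterate over a list regardless of the installed version.
    """
    if isinstance(active_adapter, str):
        return [active_adapter]
    return list(active_adapter)


print(as_adapter_list("default"))                 # old-style return value
print(as_adapter_list(["adapter-2", "default"]))  # new-style return value
```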
Suggested change:
-        # For previous PEFT versions
-        if isinstance(active_adapters, str):
-            active_adapters = [active_adapters]
-        return active_adapters
+        return active_adapters
-    def active_adapter(self) -> str:
+    def active_adapters(self) -> List[str]:
I think that way it won't be BC --> users with a previous PEFT version will get a str, whereas with newer PEFT versions they will get a list of str.
        return active_adapters

    def active_adapter(self) -> str:
I do not think it's a good idea to have both `active_adapter` and `active_adapters`:
- It's easier for the user to just have to remember one method and deal with the output type (we can also deprecate returning a single string in favor of returning a list going forward)
- Reduces the API surface - only a single source of truth
src/transformers/modeling_utils.py (Outdated)

        active_adapter = self.active_adapters()

        if len(active_adapter) > 1:
            logger.warning(
Instead of warning, I would raise here. It's important for the user to understand the difference between multiple active adapters and a single one. I would heavily limit the functionality for multiple active adapters for now.
        return active_adapters

    def active_adapter(self) -> str:
As discussed, let's just deprecate `active_adapter(...)` in favor of `active_adapters`:

Suggested change:
-    def active_adapter(self) -> str:
+    def active_adapter(self) -> str:
+        # deprecate ...
+        return self.active_adapters[0]
As discussed offline I would propose the following:

I am ok to deprecate the

I meant that

Agreed also on your second point! Let's keep it simple.

Agree with Patrick! If we choose to go with

Let's maybe put this PR #26411 to the side for now and focus on merging this PR with single-adapter saving first? WDYT @younesbelkada?

Sounds great @LysandreJik !
Co-authored-by: Patrick von Platen <[email protected]>
I should have addressed the final comments @patrickvonplaten, @LysandreJik, thanks for your time reviewing the PR!
        active_adapter = self.active_adapters()

        if len(active_adapter) > 1:
            raise ValueError(
Nice!
LGTM if all tests pass!
Failing test seems unrelated - merging!
@@ -245,20 +245,27 @@ def add_adapter(self, adapter_config, adapter_name: Optional[str] = None) -> None:

        self.set_adapter(adapter_name)

-    def set_adapter(self, adapter_name: str) -> None:
+    def set_adapter(self, adapter_name: Union[List[str], str]) -> None:
Let's do the same in diffusers :-) https://github.com/huggingface/diffusers/pull/5151/files#r1338774571
* fix PEFT multi adapters support
* refactor a bit
* save pretrained + BC + added tests
* Update src/transformers/integrations/peft.py (Co-authored-by: Benjamin Bossan <[email protected]>)
* add more tests
* add suggestion
* final changes
* adapt a bit
* fixup
* Update src/transformers/integrations/peft.py (Co-authored-by: Patrick von Platen <[email protected]>)
* adapt from suggestions

Co-authored-by: Benjamin Bossan <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
What does this PR do?
With huggingface/peft#905 being merged, support was added for multi-adapter inference (combining multiple adapters at inference time).
That PR introduced some changes that are no longer compatible with the PEFT integration of transformers; for example, `active_adapter` is now a property method and returns a list. This PR adds the corresponding fixes to make everything compatible with the newest features of PEFT, while preserving backward compatibility.

cc @BenjaminBossan @pacman100