FEAT: Add safe_merge option in merge #1001

Merged: 19 commits merged into huggingface:main on Oct 9, 2023

Conversation

@younesbelkada (Contributor) commented on Oct 6, 2023

What does this PR do?

Analogous PR to huggingface/diffusers#5316

Some users of diffusion models can face strange issues when merging the adapter weights into the base model. This PR is the PEFT equivalent of @patrickvonplaten's PR on diffusers.

I would advocate defaulting safe_merge to False, as the check adds overhead: it first copies the merged tensor, checks whether it contains any NaN values, and raises a proper ValueError if it does (sketched below). Without the copy, the NaNs would already have propagated into the merged weights before the error is raised.

This PR is also fully backward compatible, as it preserves all previous behaviour.

I also added tests.

cc @pacman100 @BenjaminBossan @sayakpaul @patrickvonplaten FYI: on huggingface/diffusers#5151 we would just pass safe_merge=safe_merge in module.merge.
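
To make the behaviour concrete, here is a minimal, self-contained sketch of the safe_merge logic described above. The function name and signature below are illustrative assumptions, not the actual PEFT layer code, which performs this check inside each adapter layer's merge method.

```python
import torch


def safe_merge_weights(
    base_weight: torch.Tensor,
    delta: torch.Tensor,
    adapter_name: str,
    safe_merge: bool = False,
) -> torch.Tensor:
    """Merge an adapter delta into the base weight, optionally checking the result first."""
    if safe_merge:
        # Merge into a copy first, so a bad merge never touches the real weights.
        merged = base_weight.clone() + delta
        if torch.isnan(merged).any():
            raise ValueError(
                f"NaNs detected in the merged weights. The adapter {adapter_name} "
                "seems to be broken."
            )
        return merged
    # Default fast path: merge in place, no extra copy.
    base_weight += delta
    return base_weight


# Hypothetical usage with made-up tensors:
base = torch.randn(4, 4, dtype=torch.float16)
delta = torch.randn(4, 4, dtype=torch.float16)
merged = safe_merge_weights(base, delta, adapter_name="default", safe_merge=True)
```

In the PR itself the same idea is wired into the layers' merge methods and exposed through merge_and_unload(safe_merge=...), as shown in the diff further down.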

@HuggingFaceDocBuilderDev commented on Oct 6, 2023

The documentation is not available anymore as the PR was closed or merged.

@BenjaminBossan (Member) commented:

This looks like a useful feature to have, thanks for the addition.

For my understanding, the NaN could also be caused by the addition of the delta weights, even if those don't contain any NaN themselves, and that's why you perform the check on the merged weights, not on the delta weights, right? The main reason I'm asking is because if we only do the check on the delta weights, we wouldn't need to create a copy of the original weights.

Btw, IIRC the test that was failing was not the flaky one, so some fixing might be needed. It could be caused by the changed line `if active_adapter not in self._active_adapter:`.

@younesbelkada (Contributor, Author) commented on Oct 6, 2023

Indeed @BenjaminBossan, I believe the overflow (NaN) is purely caused by the sum of the adapter weights and the base model weights, and I think this usually happens in the float16 regime (see the small example below).
It could potentially be caused by NaNs already being present in the delta weights, but it is also likely that the sum itself causes the overflow afterwards, so I think it is safer to perform the check this way. Let me know what you think.
Regarding your second point, that is correct as well!
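
A small self-contained example of why the check has to run on the merged result rather than on the delta weights alone (the values are made up for illustration):

```python
import torch

# Both tensors are finite on their own...
base = torch.tensor([60000.0], dtype=torch.float16)   # float16 max is ~65504
delta = torch.tensor([20000.0], dtype=torch.float16)

# ...but their sum overflows in float16, so the problem only shows up after merging.
merged = base + delta
print(torch.isfinite(delta).all())   # tensor(True)
print(torch.isfinite(merged).all())  # tensor(False) -> the merged value is inf
```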

@BenjaminBossan (Member) left a review comment:

Only some small comments, the rest looks good, thanks for adding this useful check.

Resolved review threads: src/peft/tuners/lora/layer.py (x4), tests/testing_common.py (x1)
@BenjaminBossan (Member) left a review comment:

Thanks for addressing the comments. I found two more type annotations that need fixing which I missed the first time around, otherwise LGTM.

Resolved review threads: src/peft/tuners/lora/layer.py (x2)
@younesbelkada (Contributor, Author) commented:

Thanks very much for all the reviews @BenjaminBossan !

@BenjaminBossan (Member) left a review comment:

Thanks a lot, this now looks ready to me (once CI is green).

We may want to add the option for safe merging to the other adapters too. Maybe we can create an issue so that it's not forgotten?

@younesbelkada (Contributor, Author) commented:

I think we should add it in this PR to keep things consistent; let me work on that!

@patrickvonplaten (Contributor) commented:

Cool!

@BenjaminBossan (Member) left a review comment:

Thanks a lot for adding the safe merging feature to the other methods. It is still missing for LoRA bnb layers, which support merging. It would be fine with me if they are added in a separate PR though.

I noticed some more things only now, sorry for not noticing earlier:

  1. The error message

NaNs detected in the merged weights. The Lora adapter {active_adapter} seems to be broken and should be removed

is a bit confusing IMO. The issue I see is with suggesting to remove it, because (if I'm not mistaken) it is totally possible for the adapter layer to be working in forward when applied separately from the original weights, and only encountering NaNs after merging, since the mathematical operation is not identical. E.g. two weight parameters could be overflowing when added, but when they are both first multiplied by the activation and only then added, they might not overflow anymore.

Therefore, I wouldn't ask to remove the adapter, as it may work. Instead, I would change the message to just say that this adapter cannot be merged safely, without any further suggestion. WDYT?

  2. The torch.isnan check

The second issue I only noticed just now is that I think we should not check with torch.isnan, because it does not catch torch.inf. Instead, torch.isfinite(x).all() should cover both torch.inf and torch.nan. WDYT? If you agree, maybe the test could also be extended to include module.data[0] = torch.inf (see the sketch below).
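
A small sketch of the difference between the two checks; the commented-out merge check and test line are illustrative, not the exact PEFT code:

```python
import torch

merged = torch.tensor([1.0, float("inf"), 2.0])

print(torch.isnan(merged).any())     # tensor(False) -- inf slips past an isnan-only check
print(torch.isfinite(merged).all())  # tensor(False) -- isfinite catches inf as well as NaN

# Inside the (assumed) merge code, the check would then look roughly like:
#   if safe_merge and not torch.isfinite(merged).all():
#       raise ValueError(
#           f"Adapter {active_adapter} cannot be merged safely: NaN/inf values were "
#           "detected in the merged weights."  # wording follows the suggestion in point 1
#       )

# And the test could also corrupt a weight with inf, mirroring the suggestion above:
#   module.data[0] = torch.inf
```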

Commits pushed: "lora bnb layers", "use torch.isfinite(x).all() instead"

```diff
@@ -287,7 +287,7 @@ def _prepare_adapter_config(self, peft_config, model_config):
         ]
         return peft_config

-    def merge_and_unload(self):
+    def merge_and_unload(self, safe_merge: bool = False):
```
A Member review comment on this diff:

Please extend the docstring.
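
A possible docstring extension for the new parameter; the wording below is illustrative, not the final text in the PR:

```python
def merge_and_unload(self, safe_merge: bool = False):
    r"""
    Merge the adapter layers into the base model and return the base model as a
    standalone model.

    Args:
        safe_merge (`bool`, defaults to `False`):
            If `True`, the merge is first performed on a copy of the weights and
            checked for NaN/inf values before the base weights are overwritten.
            This adds some memory and compute overhead, but prevents a broken
            adapter from silently corrupting the merged model.
    """
    ...
```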

@younesbelkada (Contributor, Author) commented:

@BenjaminBossan all the proposed suggestions sound great to me! I will work on that.

@younesbelkada (Contributor, Author) commented:

I have adapted the changes accordingly and added a test case with inf; let me know what you think!

@BenjaminBossan (Member) left a review comment:

Looks great. Thanks for addressing the remaining issues. From my point of view, it can be merged once CI is green.

@younesbelkada merged commit c2c544d into huggingface:main on Oct 9, 2023 (11 checks passed).
@younesbelkada deleted the safe-merge branch on Oct 9, 2023.