Synchronize lora's merge, unmerge, etc. modifications to lora's tp_layer. #1919
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for this update. Could you please run …?
Ok. Sorry, I forgot it.
The PR is failing on Python 3.8 because of the type annotation syntax. Adding `from __future__ import annotations` should fix that. I assume you verified that the changes you made to the megatron layer work?
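For context, a minimal sketch (not taken from this PR) of why the future import matters: on Python 3.8, PEP 604 unions such as `float | None` in annotations are evaluated when the `def` statement runs and raise a TypeError, whereas `from __future__ import annotations` keeps all annotations as unevaluated strings. The function below is a hypothetical helper used only for illustration.

```python
# Minimal sketch: with the future import, PEP 604 syntax like "float | None"
# imports cleanly on Python 3.8 because annotations stay unevaluated strings.
from __future__ import annotations


def scale(value: float, factor: float | None = None) -> float:
    # Without the __future__ import, evaluating "float | None" at definition
    # time raises TypeError on Python 3.8; with it, the annotation is never
    # evaluated at runtime.
    return value * (1.0 if factor is None else factor)
```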
src/peft/tuners/lora/tp_layer.py (outdated)

@@ -108,7 +116,7 @@ def update_layer(
         else:
             lora_dropout_layer = nn.Identity()

-        self.lora_dropout[adapter_name] = lora_dropout_layer
+        self.lora_dropout.update(nn.ModuleDict({adapter_name: lora_dropout_layer}))
This should not be necessary, right?
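For reference, a small standalone sketch (not part of the PR) of why the extra wrapper is redundant: `nn.ModuleDict` supports plain item assignment, which registers the submodule just like `update()` does. The `lora_dropout` variable here is a throwaway ModuleDict, not the attribute from tp_layer.py.

```python
import torch.nn as nn

# Throwaway ModuleDict used only for this illustration.
lora_dropout = nn.ModuleDict()

# Plain item assignment registers the module under the given key.
lora_dropout["default"] = nn.Dropout(p=0.1)

# Equivalent result, but with a redundant intermediate ModuleDict.
lora_dropout.update(nn.ModuleDict({"other": nn.Dropout(p=0.1)}))

assert set(lora_dropout.keys()) == {"default", "other"}
```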
The … But I forgot to add the …
I don't really have experience with megatron, so I'm not sure if these methods would just work when copied 1:1. Just to be sure, did you test with your setup that nothing breaks with these changes? If yes, I think we can merge and in the future try to be more careful to keep …
Ha, yes, in our code we added the patch like this:

import peft

# Copy lora.Linear's merge-related methods onto the LoraLayer base class as a
# temporary patch, so LoRA layers that lack their own implementations pick them up.
setattr(peft.tuners.lora.LoraLayer, 'merge', peft.tuners.lora.Linear.merge)
setattr(peft.tuners.lora.LoraLayer, 'unmerge', peft.tuners.lora.Linear.unmerge)
setattr(peft.tuners.lora.LoraLayer, 'get_delta_weight', peft.tuners.lora.Linear.get_delta_weight)

It works well.
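Presumably this setattr patch works because, after the earlier refactor, LoraParallelLinear no longer had these methods of its own, and re-adding them to LoraLayer made them reachable again; once this PR defines merge, unmerge, and get_delta_weight on LoraParallelLinear itself, such a patch should no longer be needed.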
> It works well.
Okay, great that you tested it. LGTM.
A previous commit moved merge and the other related functions from LoraLayer to Linear and the other layer classes, but LoraParallelLinear in tp_layer was missed.
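As a rough illustration of what those functions do (a simplified sketch, not the actual PEFT implementation, and ignoring details such as fan_in_fan_out, dtype casting, and safe_merge): the LoRA delta is `(lora_B @ lora_A) * scaling`, and merging folds that delta into the frozen base weight so inference no longer needs the extra adapter matmuls. The helpers below are hypothetical names, not PEFT APIs.

```python
import torch
import torch.nn as nn

# Simplified sketch of the merge pattern that lora.Linear follows and that
# this PR mirrors in LoraParallelLinear.


def get_delta_weight_sketch(lora_A: nn.Linear, lora_B: nn.Linear, scaling: float) -> torch.Tensor:
    # lora_A.weight: (r, in_features), lora_B.weight: (out_features, r)
    return (lora_B.weight @ lora_A.weight) * scaling


def merge_into(base: nn.Linear, lora_A: nn.Linear, lora_B: nn.Linear, scaling: float) -> None:
    # Fold the LoRA update into the base weight in place; an "unmerge" would
    # subtract the same delta to restore the original weight.
    base.weight.data += get_delta_weight_sketch(lora_A, lora_B, scaling)
```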