
Do we have to delete the PiSSA adapter after save_pissa_as_lora #1860

Closed
2 of 4 tasks
hiyouga opened this issue Jun 15, 2024 · 7 comments · Fixed by #1933

Comments

@hiyouga
Contributor

hiyouga commented Jun 15, 2024

System Info

peft v0.11.1
torch 2.3.0
linux

Who can help?

@BenjaminBossan

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

I want to keep using the PiSSA adapter for other tasks after saving the model, but I cannot continue using the fine-tuned model, because save_mutated_as_lora deletes the adapter at the end:

peft/src/peft/peft_model.py

Lines 253 to 276 in 076561b

def save_mutated_as_lora(peft_config, path_initial_model_for_weight_conversion, output_state_dict, kwargs):
    if not any(
        str(peft_config.init_lora_weights).lower().startswith(prefix) for prefix in ["pissa", "olora", "true"]
    ):
        warnings.warn(
            "`path_initial_model_for_weight_conversion` only works for converting a PiSSA or OLoRA adapter to a LoRA adapter"
        )
    initial_adapter = os.path.basename(path_initial_model_for_weight_conversion)
    self.load_adapter(
        os.path.dirname(path_initial_model_for_weight_conversion),
        subfolder=initial_adapter,
        adapter_name=initial_adapter,
    )
    if any(
        str(self.peft_config[initial_adapter].init_lora_weights).lower().startswith(prefix)
        for prefix in ["pissa", "olora"]
    ):
        raise ValueError(
            "The `init_lora_weights` parameter of the initial adapter should be set to `True`. "
            "Otherwise, `self.load_adapter` will subtract the decomposed values again based on the residual model."
        )
    output_state_dict = self.base_model.subtract_mutated_init(output_state_dict, initial_adapter, kwargs)
    self.delete_adapter(adapter_name)
    return output_state_dict
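
For context, the calling pattern that runs into this looks roughly as follows (a minimal sketch; the model name, paths, and hyperparameters are placeholders, not taken from my setup):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder model
config = LoraConfig(init_lora_weights="pissa", r=16, lora_alpha=16)
model = get_peft_model(base, config)

# ... save the initial PiSSA adapter to "output/pissa-init" and fine-tune ...

# Converting the PiSSA adapter to a plain LoRA adapter on save goes through
# save_mutated_as_lora, which ends with self.delete_adapter(adapter_name).
model.save_pretrained(
    "output/pissa-lora",
    path_initial_model_for_weight_conversion="output/pissa-init",
)

# At this point the trained adapter is gone from the in-memory model, so it
# cannot be used for evaluation or other tasks without reloading.
```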

Expected behavior

The PiSSA adapter can remain for further use.

@BenjaminBossan
Member

Okay, so if I understand correctly, you would like to skip the line self.delete_adapter(adapter_name) so that you can continue working with that adapter. I haven't thought about this use case. I can't quite remember if this was discussed with the PiSSA authors, but probably that line was included to avoid keeping two adapters in memory. But I see how this can be annoying.

Right now, the only way to really continue with this adapter is to reload the model, but I think we could make the change and if users want, they can delete the adapter themselves. Pinging @tokenizer-decode since they had some thoughts around this topic as well.
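
For illustration, the reload workaround would look roughly like this (a sketch, reusing the placeholder model name and paths from the snippet in the issue description):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload from disk, since the in-memory PiSSA adapter was deleted by
# save_pretrained with weight conversion.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder model
model = PeftModel.from_pretrained(base, "output/pissa-lora")  # converted LoRA adapter
```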

@hiyouga
Contributor Author

hiyouga commented Jun 17, 2024

Sure, a typical case is to evaluate the model after we save and convert the PiSSA adapter. Regarding the issue of maintaining two adapters in memory, should we delete the initial_adapter instead of the PiSSA adapter?

@tokenizer-decode
Contributor

tokenizer-decode commented Jun 17, 2024

I think the simplest approach that would work for you is just to comment out the self.delete_adapter(adapter_name) line.

Regarding the issue of maintaining two adapters in memory, should we delete the initial_adapter instead of the PiSSA adapter?

If you want to use all of your memory, sure, you can do that. I did not try that tbh, but since save_mutated_as_lora reloads the initial adapter, that should work just fine.

Regarding @BenjaminBossan's comment:

Right now, the only way to really continue with this adapter is to reload the model, but I think we could make the change and if users want, they can delete the adapter themselves.

Sure. But the only way for a user to talk to save_mutated_as_lora is through save_pretrained. If this is necessary, you can just pass another option to save_pretrained, such as delete_adapter. But you know my stance on adding other options to save_pretrained :)

Also, if you are just experimenting, we'd be happy if you tried out OLoRA. It's similar to PiSSA, which is why you see a unified approach there. I don't know how it would work out for your case, btw; I don't have a comprehensive comparison between these methods. @hiyouga
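
For reference, switching between the two initializations is just a config change, assuming a peft version that includes OLoRA support (it is referenced in the snippet above); r and alpha here are arbitrary:

```python
from peft import LoraConfig

# PiSSA initialization
pissa_config = LoraConfig(init_lora_weights="pissa", r=16, lora_alpha=16)

# OLoRA initialization
olora_config = LoraConfig(init_lora_weights="olora", r=16, lora_alpha=16)
```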

@BenjaminBossan
Member

BenjaminBossan commented Jun 17, 2024

Sure, a typical case is to evaluate the model after we save and convert the PiSSA adapter.

You could first evaluate and then save the model, but I would also prefer first saving and then evaluating, just in case the process crashes during evaluation.

Regarding the issue of maintaining two adapters in memory, should we delete the initial_adapter instead of the PiSSA adapter?

True. Honestly, I think I'd prefer not deleting any adapter and leaving this up to the user to decide. It does require more memory if the adapter is not deleted, but it doesn't change peak memory.

Sure. But the only way for a user to talk to save_mutated_as_lora is through save_pretrained. If this is necessary, you can just pass another option to save_pretrained, such as delete_adapter. But you know my stance on adding other options to save_pretrained :)

Yes, I agree that we should not add more options there.
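
If neither adapter were deleted automatically, the user-side cleanup could look like this sketch (the initial adapter name is assumed to be the basename of path_initial_model_for_weight_conversion, as in the snippet above; paths are placeholders):

```python
import os

path_init = "output/pissa-init"  # hypothetical directory holding the initial PiSSA adapter
model.save_pretrained("output/pissa-lora", path_initial_model_for_weight_conversion=path_init)

# Free the memory of the temporarily loaded initial adapter, keeping the
# trained adapter available for evaluation or further training.
model.delete_adapter(os.path.basename(path_init))
```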


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan
Member

Not stale, I hope that I can work on this soon.

BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue Jul 17, 2024
Resolves huggingface#1860

As discussed in that issue, it's not user-friendly to delete the default
adapter of a PiSSA/OLoRA model after calling save_pretrained with weight
conversion. It is much more intuitive to delete the initial adapter
instead, since it is loaded inside the method and not by the user, so
it's really an implementation detail.

Apart from this, I made the following related changes:

- Put everything in a try ... finally to ensure that the initial adapter
  does not hang around if there is an error (thus not hogging memory).
- Renamed initial_adapter to initial_adapter_name, to make it clear that
  this is the name and not the adapter itself.
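
A rough sketch of the try/finally cleanup described above, written as a fragment of the function quoted in the issue (simplified, not the exact diff; see #1933 for the actual change):

```python
initial_adapter_name = os.path.basename(path_initial_model_for_weight_conversion)
try:
    self.load_adapter(
        os.path.dirname(path_initial_model_for_weight_conversion),
        subfolder=initial_adapter_name,
        adapter_name=initial_adapter_name,
    )
    output_state_dict = self.base_model.subtract_mutated_init(
        output_state_dict, initial_adapter_name, kwargs
    )
finally:
    # Remove the temporarily loaded initial adapter even if an error occurs,
    # so it does not keep occupying memory; the trained adapter is left intact.
    self.delete_adapter(initial_adapter_name)
return output_state_dict
```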
@BenjaminBossan
Member

I created a PR to fix this: #1933. Hopefully I can do a release soon, and I thought it would be good to get this in before that, since technically it's a breaking change.
