FIX: More VeRA tests, fix tests, more checks #1900
Conversation
- Fixes an incorrect config for VeRA in a test
- Adds VeRA to the multi-adapter tests
- Adds more checks on the VeRA A/B shapes

The latter becomes necessary when we add more than one VeRA adapter. The shapes for VeRA A and B are only determined once, when the first VeRA adapter is created. After that, they are fixed. However, users may add a second VeRA adapter. As long as that adapter targets the same layers and has the same rank, we're good. But if it targets other, bigger layers, or if it has an increased rank, the shapes of VeRA A and/or VeRA B will be too small, resulting in an error during the forward pass. To prevent this, we check the shapes already during initialization of the new adapter and raise an error right away.
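For readers following along without the diff open, here is a minimal sketch of the kind of check described above. The function name, the assumed `(r, in_features)` / `(out_features, r)` layouts, and the exception type are illustrative assumptions, not the exact code added in this PR:

```python
# Hedged sketch of the early shape check for the shared VeRA A/B matrices.
# `existing_vera_A` / `existing_vera_B` are the buffers created for the first adapter;
# `r`, `in_features`, `out_features` describe the adapter being added now.
def check_vera_shapes(existing_vera_A, existing_vera_B, r, in_features, out_features):
    error_tmpl = "{} has a size of {} but {} or greater is required"
    # vera_A is assumed here to have shape (r, in_features)
    if existing_vera_A.shape[0] < r:
        raise ValueError(error_tmpl.format("vera_A", existing_vera_A.shape[0], r))
    if existing_vera_A.shape[1] < in_features:
        raise ValueError(error_tmpl.format("vera_A", existing_vera_A.shape[1], in_features))
    # vera_B is assumed here to have shape (out_features, r)
    if existing_vera_B.shape[0] < out_features:
        raise ValueError(error_tmpl.format("vera_B", existing_vera_B.shape[0], out_features))
    if existing_vera_B.shape[1] < r:
        raise ValueError(error_tmpl.format("vera_B", existing_vera_B.shape[1], r))
```

Raising at adapter-creation time surfaces the mismatch immediately, instead of as an opaque matmul error during the forward pass.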
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Could you please review @dkopi?
Looks good, I've only added a few comments regarding the raised message.
To easily fix this problem in VeRA, I think the simplest solution would be to add optional config arguments that allow specifying the shared A/B shapes manually; the other way would be to create separate "shared A/B" matrices per adapter.
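A purely hypothetical sketch of the first suggestion, just to make it concrete; these fields do not exist in `VeraConfig` and the names are made up:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class HypotheticalVeraConfig:
    r: int = 256
    # If set, the shared vera_A / vera_B buffers are allocated with these shapes up front,
    # so adapters added later (bigger target layers, higher rank) still fit.
    shared_A_shape: Optional[Tuple[int, int]] = None
    shared_B_shape: Optional[Tuple[int, int]] = None
```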
src/peft/tuners/vera/layer.py
Outdated
@@ -100,6 +100,23 @@ def update_layer(
            # we can take any of the existing adapter's parameters, as they should all be identical
            vera_A_param = list(self.vera_A.values())[0]
            vera_B_param = list(self.vera_B.values())[0]

            error_tmpl = (
                "{} has a size of {} but {} is or greater required; this probably happened because an additional VeRA "
a typo - should be "{} has a size of {} but {} or greater is required [...]", right?
we could also have separate messages for an incompatible r dimension vs. incompatible input/output dimensions;
we could also give a hint that adding the adapters in a different order may resolve the problem
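Purely for illustration (not text from this PR), split messages with such a hint could look roughly like this:

```python
# Illustrative message templates; wording and placeholders are assumptions.
rank_msg = (
    "vera_A was created with rank {} but the new adapter requires rank {}; "
    "consider adding the adapter with the larger rank first."
)
dim_msg = (
    "vera_A was created for layers with {} input features but the new adapter targets a layer "
    "with {} input features; adding the adapters in a different order may resolve the problem."
)
```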
tests/test_initialization.py
Outdated
model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 10 but 20 is or greater required"
same here
tests/test_initialization.py
Outdated
model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 123 but 456 is or greater required"
same here
@dkopi Thanks for the helpful comments, please check if I have addressed them sufficiently. CI is failing for unrelated reasons.
Also a thought that I had: With your recent fix to the shape issue, we should be able to allow some models that formerly didn't work, right? (See peft/src/peft/utils/constants.py, lines 200 to 207 at 09358aa.)
I cannot test at the moment, but I think we can uncomment them.
yeah, some/all of them should work now; I've already tested adapting all linear layers of gemma and phi and it worked
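To make the idea concrete, here is a hedged illustration of re-enabling such entries; the mapping name and the module lists below are assumptions for illustration, not a quote of `constants.py`:

```python
# NOT the real contents of peft/src/peft/utils/constants.py; names and entries are assumptions.
TRANSFORMERS_MODELS_TO_VERA_TARGET_MODULES_MAPPING = {
    "llama": ["q_proj", "v_proj"],
    # entries like these were commented out because the targeted projections can have
    # different shapes; with the shape fix they could be uncommented:
    "gemma": ["q_proj", "v_proj"],
    "phi": ["q_proj", "v_proj"],
}
```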
@matthewdouglas LMK if you feel ready to review this PR.
Thanks. Very minor comments.
"VeRA Same", | ||
"vera", | ||
VeraConfig, | ||
{"target_modules": ["lin0"], "init_weights": False}, | ||
{"target_modules": ["lin0"], "init_weights": False}, |
So essentially, these two adapters ("VeRA Same" and "vera") are the same. Right?
"VeRA Same" is just the name of the test, "vera" is the name of the method being used. I added a comment at the top that describes the items of this tuple for clarification.
The two adapters here have the same config, but they are two different adapters.
tests/test_initialization.py
Outdated
model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 123 but 456 or greater is required"
Can we programmatically determine 123 and 456?
Changed the code to make it obvious where the numbers come from.
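For illustration, a hedged sketch of how the expected sizes could be derived from the model rather than hard-coded; the toy `MLP`, the assumption that vera_A is sized by the targeted layer's `in_features`, and the `ValueError` type are all assumptions, not the actual test from this PR:

```python
import pytest
from torch import nn

from peft import VeraConfig, get_peft_model


class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(10, 20)
        self.lin1 = nn.Linear(20, 40)  # bigger than lin0

    def forward(self, x):
        return self.lin1(self.lin0(x))


def test_second_vera_adapter_too_small_raises():
    model = MLP()
    existing_size = model.lin0.in_features  # what vera_A is sized for (first adapter)
    required_size = model.lin1.in_features  # what the second adapter would need

    config0 = VeraConfig(target_modules=["lin0"], r=8, init_weights=False)
    config1 = VeraConfig(target_modules=["lin1"], r=8, init_weights=False)
    peft_model = get_peft_model(model, config0)

    # derive the numbers from the model instead of hard-coding them
    msg = f"vera_A has a size of {existing_size} but {required_size} or greater is required"
    with pytest.raises(ValueError, match=msg):
        peft_model.add_adapter("other", config1)
```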
tests/test_initialization.py
Outdated
model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 10 but 20 or greater is required"
Same. Can we programmatically determine the magic numbers here?
Changed the code to make it obvious where the numbers come from.
@sayakpaul Your points should be addressed, LMK if you want to do another review or if the PR can be merged.
Thanks for clarifying the comments.