[5/N] pass the whole config to model #9983
Conversation
```diff
-        quant_config: Optional[QuantizationConfig] = None,
-        lora_config: Optional[LoRAConfig] = None,
+        vllm_config: VllmConfig,
+        prefix: str = "",
```
I think you should not provide a default value for `position_embedding`, so people won't forget about it.

Is it intended that you add the `prefix` argument here? (Same question for the other models.)
We need to unify the function signature, so that `vllm_config` (required) and `prefix` (optional) are enough to construct any model (as long as the config is correct). This means we cannot have any other required parameters.

I think the `position_embedding` here should be fine, because it is only used for internal classes. Code outside of this file should not be aware of it.
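For illustration, here is a minimal sketch of that convention. The class `MyModel` is hypothetical, and the sketch assumes `VllmConfig` exposes `model_config`, `quant_config`, and `lora_config` attributes, as this PR series sets up:

```python
from torch import nn

from vllm.config import VllmConfig


class MyModel(nn.Module):
    """Hypothetical model following the unified constructor signature."""

    def __init__(self, *, vllm_config: VllmConfig, prefix: str = "") -> None:
        super().__init__()
        # Everything the model needs is derived from the single required
        # argument, so no other required constructor parameters remain.
        hf_config = vllm_config.model_config.hf_config
        quant_config = vllm_config.quant_config  # may be None
        lora_config = vllm_config.lora_config    # may be None
        # `prefix` names this module within the full parameter namespace
        # (useful e.g. for per-layer quantization); submodules would extend
        # it, e.g. prefix=f"{prefix}.layers.{i}".
        self.embed_tokens = nn.Embedding(hf_config.vocab_size,
                                         hf_config.hidden_size)
```

With this shape, a model builder can instantiate any architecture the same way, e.g. `model_cls(vllm_config=vllm_config, prefix="")`, without knowing which features the model supports.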
I think `prefix` should only be an argument if we have actually implemented quantization support for that model. Is this necessary for the model construction code?
Also, this particular model is a base class and will not be called directly by the model builder. As long as the subclass doesn't have `position_embedding` in its argument list, it should not affect model builder code.
> I think `prefix` should only be an argument if we have actually implemented quantization support for that model. Is this necessary for the model construction code?

We should keep this uniform signature, so that if people want to add quantization support later, it is easier. Functionality support like LoRA, quantization, etc. should be determined by checking the config, rather than by checking the function signature.
Can you add some helper functions to `VllmConfig`, like `raise_for_unsupported_quant`, so we have a standardized way of explicitly indicating that the model doesn't support quantization, and call it in the top-level models that don't originally support `prefix`?
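A rough sketch of what such a helper might look like, using the name suggested above; it is written as a free function for brevity, though the comment suggests a method on `VllmConfig`, and the exact signature and error message are assumptions, not something this PR implements:

```python
from vllm.config import VllmConfig


def raise_for_unsupported_quant(vllm_config: VllmConfig,
                                model_name: str) -> None:
    # Hypothetical helper: fail fast when a quantization config is set
    # for a model that would otherwise silently ignore it.
    if vllm_config.quant_config is not None:
        raise NotImplementedError(
            f"{model_name} does not support quantization yet.")
```

A top-level model that does not thread `prefix`/`quant_config` through its layers could call this at the top of its constructor to make the limitation explicit.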
Given this, I'm fine with the change.
I will ask the Neural Magic folks to add this. I'm not familiar with how to tell whether a model supports a quantization scheme.
cc @mgoin
As per offline discussion, let's address this in another PR.
This pull request has merge conflicts that must be resolved before it can be merged.
As per offline discussion, let's merge this first to unblock the next steps.
Contains code from #9978, which needs to be merged first.