[v1] EngineArgs for better config handling for v1 #10382
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
It's pretty clean to me!
@WoosukKwon please review to see if this format is what you want. Also, what's the current best practice for testing this in v1?
vllm/engine/arg_utils.py
Outdated
assert (
    usage_context is not None
), "usage_context must be provided for V1EngineArgs"
@WoosukKwon We need to pass usage_context because the default value depends on it, but this argument looks a bit weird to me. Do you have a better way to decide the default max_num_batched_tokens?
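To make the dependency concrete, here is a minimal sketch of how a default for max_num_batched_tokens could be chosen from usage_context. The enum below is a local stand-in rather than vLLM's own class, and the helper name default_max_num_batched_tokens and the numeric values are illustrative assumptions, not the code in this PR.

```python
from enum import Enum
from typing import Optional


class UsageContext(Enum):
    # Hypothetical stand-in for the usage context; members are illustrative.
    LLM_CLASS = "llm_class"
    OPENAI_API_SERVER = "openai_api_server"


# Illustrative numbers only; the real defaults may differ.
_DEFAULT_MAX_NUM_BATCHED_TOKENS = {
    UsageContext.LLM_CLASS: 8192,
    UsageContext.OPENAI_API_SERVER: 2048,
}


def default_max_num_batched_tokens(
    usage_context: Optional[UsageContext],
    user_value: Optional[int] = None,
) -> int:
    """Return the user's value if given, else a context-dependent default."""
    if user_value is not None:
        # An explicit setting always wins over the default.
        return user_value
    # Without a user value the default depends on the context,
    # which is why the context has to be passed in.
    assert (
        usage_context is not None
    ), "usage_context must be provided for V1EngineArgs"
    return _DEFAULT_MAX_NUM_BATCHED_TOKENS[usage_context]
```

Keeping the lookup in one small helper like this would confine the usage_context requirement to the single place where the default is actually computed.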
LGTM. cc @WoosukKwon @robertgshaw2-neuralmagic
@rickyyx could you rebase and see if the errors go away?
d3ee119 to db20919
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: rickyx <[email protected]>
db20919 to c3efa25
Test failures look related - taking a look
Signed-off-by: rickyx <[email protected]>
Signed-off-by: rickyx <[email protected]>
Test failures look unrelated
Signed-off-by: rickyx <[email protected]>
Remove the dynamic override of […]. Thanks for the suggestion.
Signed-off-by: rickyx <[email protected]>
Signed-off-by: rickyx <[email protected]>
Test failures should be unrelated.
Hand over to @youkaichao for final review and force merge.
Can you merge main to see if these errors disappear?
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: rickyx <[email protected]>
Failures look unrelated - but I can rebase again. cc @youkaichao
@rickyyx thanks for the great work!
Signed-off-by: rickyx <[email protected]> Signed-off-by: Andrew Feldman <[email protected]>
Signed-off-by: rickyx <[email protected]>
Signed-off-by: rickyx <[email protected]>
Signed-off-by: rickyx <[email protected]>
This allows:
VLLM_USE_V1
This PR:
- create_engine_config: includes the usage context, which is currently needed for the v1 args update.
- _override_v1_args: overrides some of the EngineArgs values before the engine config is created.
- _override_v1_configs: overrides the generated engine config.
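The ordering of the two override hooks around config creation can be sketched roughly as follows. This is a self-contained illustration, not the code in this PR: SketchEngineArgs, SketchEngineConfig, the two fields, the use_v1 parameter, the numeric defaults, and the way VLLM_USE_V1 is parsed are all assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchEngineConfig:
    # Hypothetical stand-in for the generated engine config.
    max_num_batched_tokens: int
    enable_chunked_prefill: bool


@dataclass
class SketchEngineArgs:
    # Hypothetical stand-in for EngineArgs with two illustrative fields.
    max_num_batched_tokens: Optional[int] = None
    enable_chunked_prefill: bool = False

    def _override_v1_args(self, usage_context: str) -> None:
        # Step 1: adjust argument values before the config is built.
        if self.max_num_batched_tokens is None:
            # Illustrative context-dependent defaults.
            if usage_context == "llm_class":
                self.max_num_batched_tokens = 8192
            else:
                self.max_num_batched_tokens = 2048

    def _override_v1_configs(self, config: SketchEngineConfig) -> None:
        # Step 2: adjust the already generated config (illustrative choice).
        config.enable_chunked_prefill = True

    def create_engine_config(
        self,
        usage_context: Optional[str] = None,
        use_v1: Optional[bool] = None,
    ) -> SketchEngineConfig:
        if use_v1 is None:
            # Assumed parsing of the VLLM_USE_V1 environment variable.
            use_v1 = os.environ.get("VLLM_USE_V1", "0") == "1"
        if use_v1:
            assert (
                usage_context is not None
            ), "usage_context must be provided for V1EngineArgs"
            self._override_v1_args(usage_context)
        config = SketchEngineConfig(
            # 512 is an illustrative fallback for the non-v1 path.
            max_num_batched_tokens=self.max_num_batched_tokens or 512,
            enable_chunked_prefill=self.enable_chunked_prefill,
        )
        if use_v1:
            self._override_v1_configs(config)
        return config
```

For example, `SketchEngineArgs().create_engine_config("llm_class", use_v1=True)` applies both hooks, while `use_v1=False` skips them and builds the config from the unmodified arguments.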