[V1] Bugfix: Validate Model Input Length #12600
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these: …
Thanks!
@robertgshaw2-redhat Thanks for the PR. However, is this the desirable behavior? This PR basically lets the …
@WoosukKwon Within the server, this does simply fail the request and keep the server alive. Server log after sending 1 good, 1 bad, and 1 good request:
Error on the client for the failing length request:
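The behavior described above (the bad request is rejected but the server keeps serving) can be sketched as follows. This is a minimal illustration of per-request length validation, not vLLM's actual code; the function names `validate_prompt_len` and `handle_request` are hypothetical.

```python
from typing import List


def validate_prompt_len(prompt_token_ids: List[int], max_model_len: int) -> None:
    """Raise a regular ValueError instead of letting an oversized
    prompt crash the engine (hypothetical check, mirroring the PR's intent)."""
    if len(prompt_token_ids) > max_model_len:
        raise ValueError(
            f"Prompt length {len(prompt_token_ids)} is longer than the "
            f"maximum model length of {max_model_len}.")


def handle_request(prompt_token_ids: List[int], max_model_len: int = 8) -> str:
    try:
        validate_prompt_len(prompt_token_ids, max_model_len)
        return "accepted"
    except ValueError as exc:
        # Only this request fails; the engine loop keeps serving others,
        # which matches the "1 good, 1 bad, 1 good" test above.
        return f"400 Bad Request: {exc}"
```

With this shape, the middle request in a good/bad/good sequence returns an error response while the other two succeed.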
@mgoin Thanks for testing. I've changed my comment from "…
@WoosukKwon you mean using …? Another comment though: I think for the input-validation class of errors we should avoid logging the whole stack trace; it should be a single line.
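The single-line-logging suggestion above could look like this sketch, using the standard `logging` module. The logger name and helper are illustrative, not vLLM's actual logging setup.

```python
import io
import logging


def make_logger(stream: io.StringIO) -> logging.Logger:
    """Build a logger writing to an in-memory stream (for illustration)."""
    logger = logging.getLogger("sketch.validation")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def log_validation_error(logger: logging.Logger, exc: ValueError) -> None:
    # One line, without exc_info=True: expected user-input errors
    # don't need a full traceback in the server log.
    logger.warning("Rejected request: %s", exc)
```

Omitting `exc_info=True` is what keeps the log entry to a single line.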
@njhill @robertgshaw2-redhat I merged the PR since it's better than what we have right now.
Agreed. Can we have a follow-up PR on this?
The case I'm worried about is when the user has a giant list of prompts, and the … Also, I'm a bit worried about backward compatibility, since in V0 we didn't raise an error for this case.
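One way to address the giant-list concern above is to validate each prompt independently and report which indices fail, rather than failing the whole batch on the first oversized prompt. This is a hypothetical sketch of that idea, not a change proposed in this PR.

```python
from typing import List, Tuple


def split_valid_prompts(
    prompts: List[List[int]], max_model_len: int
) -> Tuple[List[int], List[int]]:
    """Return (valid_indices, rejected_indices) so that one oversized
    prompt does not sink an entire batch of requests."""
    valid: List[int] = []
    rejected: List[int] = []
    for i, token_ids in enumerate(prompts):
        if len(token_ids) <= max_model_len:
            valid.append(i)
        else:
            rejected.append(i)
    return valid, rejected
```

The caller could then run generation only on the valid indices and return per-prompt errors for the rejected ones.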
SUMMARY:
* Avoid crashing the engine when we get an input longer than max_model_len.

FIX #12567

Signed-off-by: Isotr0py <[email protected]>