feat(vllm): Additional vLLM config options (Disable logging, dtype, and Per-Prompt media limits) #4855
Conversation
nice, thanks!
CI seems to fail.
Looking good, thanks!
Description
This allows for additional configuration of the vLLM backend, specifically the following options (see https://docs.vllm.ai/en/latest/serving/engine_args.html):
- disabling logging
- dtype
- limit_mm_per_prompt, a nested object with per-modality limits for image, video, and audio inputs; see below (and the commit) for an example of bumping the default limit of 1.

The resulting vLLM-based config file would look something like the following (skipping existing config):
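(A minimal sketch, assuming the LocalAI field names mirror the vLLM engine arguments disable_log_stats, dtype, and limit_mm_per_prompt; the model name, path, and limit values are illustrative, and the commit in this PR is the authoritative reference for the exact names.)

```yaml
# Hypothetical LocalAI model config exercising the new vLLM options.
# Field names are assumed to mirror the vLLM engine args; check the
# commit in this PR for the exact names LocalAI expects.
name: internvl2_5-awq                   # illustrative model name
backend: vllm
parameters:
  model: OpenGVLab/InternVL2_5-8B-AWQ   # illustrative model path

# Disable vLLM engine stats logging
disable_log_stats: true

# Data type for weights and activations
dtype: float16

# Per-prompt multi-modal input limits (vLLM defaults to 1 per modality)
limit_mm_per_prompt:
  image: 3
  video: 0
  audio: 0
```

With limits like these, a single prompt could include up to three images but no video or audio attachments.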
Notes for Reviewers
I am currently working on getting this set up locally to fully test these changes. I followed the local dev setup notes but ran out of disk space on my machine (so I will need to clean up some space first). The changes were made on my Docker setup (localai:v2.25.0-cublas-cuda12) to enable running InternVL 2.5 AWQ via vLLM, which required the config options included here.
I am glad to rename or reformat the config variables as needed; just let me know.