
feat(vllm): Additional vLLM config options (Disable logging, dtype, and Per-Prompt media limits) #4855

Merged — 4 commits into mudler:master on Feb 18, 2025

Conversation

TheDropZone (Contributor) commented Feb 17, 2025

Description

This allows for additional configuration of the vLLM backend, specifically the following options:

  • --disable-log-requests
    • Disables request logging for your deployment
  • --dtype
    • Configures the datatype that vLLM will use. bfloat16 is not supported with awq quantization, so you need to adjust the dtype
  • --limit-mm-per-prompt
    • Raises the limit on attached media per request, per modality. The default is 1 image/video/audio entry per request, even though many models support more

see: https://docs.vllm.ai/en/latest/serving/engine_args.html

limit_mm_per_prompt is a nested object with config for the image, video, and audio modality types. See below (and in the commit) for an example of raising the default limit of 1.
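For context, here is a rough sketch of how the new config keys could map onto the raw vLLM CLI flags named above. The mapping function and the `key=value,key=value` syntax for `--limit-mm-per-prompt` are assumptions for illustration (recent vLLM versions also accept a JSON string for that flag), so check the engine-args docs for your version:

```python
# Hypothetical mapping from the new LocalAI config keys to the raw vLLM
# CLI flags discussed in this PR. The exact value syntax for
# --limit-mm-per-prompt is an assumption; verify against your vLLM version.
def vllm_flags(config: dict) -> list[str]:
    flags = []
    if config.get("disable_log_stats"):  # maps to --disable-log-requests
        flags.append("--disable-log-requests")
    if "dtype" in config:
        flags += ["--dtype", config["dtype"]]
    limits = config.get("limit_mm_per_prompt")
    if limits:
        # e.g. "audio=2,image=2,video=2" (key=value pairs, comma separated)
        value = ",".join(f"{k}={v}" for k, v in sorted(limits.items()))
        flags += ["--limit-mm-per-prompt", value]
    return flags

example = {
    "dtype": "float16",
    "disable_log_stats": True,
    "limit_mm_per_prompt": {"image": 2, "video": 2, "audio": 2},
}
print(" ".join(vllm_flags(example)))
# → --disable-log-requests --dtype float16 --limit-mm-per-prompt audio=2,image=2,video=2
```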

The resulting vllm-based config file would look like this (existing config omitted):

---
backend: vllm
.....
quantization: "awq"
dtype: "float16" # new dtype config
...
disable_log_stats: true # new config to disable request logging 
limit_mm_per_prompt:  # allow for requests to have more than just 1 media object per modality (which is default)
  image: 2
  video: 2
  audio: 2
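With `limit_mm_per_prompt.image` raised to 2, a single chat request can carry two images. Here is a sketch of what such a request payload could look like, assuming an OpenAI-style chat API; the model name and image URLs are placeholders:

```python
# Hypothetical multi-image chat payload. With the default
# limit_mm_per_prompt (1 image per request), a request like this would
# exceed the per-modality media limit; with image: 2 it fits.
payload = {
    "model": "internvl-2.5-awq",  # placeholder model name
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Compare these two images."},
            {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
            {"type": "image_url", "image_url": {"url": "https://example.com/b.png"}},
        ],
    }],
}

# Count the attached images in the request.
images = [p for p in payload["messages"][0]["content"] if p["type"] == "image_url"]
print(len(images))  # → 2
```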

Notes for Reviewers
Currently working on getting this set up locally to fully test these changes. I followed the local dev setup notes but ran out of disk space on my machine (so I'll need to clean up some space). These changes were made locally on my Docker setup (localai:v2.25.0-cublas-cuda12) to enable running InternVL 2.5 AWQ via vLLM, and required the config changes included here.

I am glad to rename or reformat the config variables as needed. Just let me know.

netlify bot commented Feb 17, 2025

Deploy Preview for localai ready!

🔨 Latest commit: f498d40
🔍 Latest deploy log: https://app.netlify.com/sites/localai/deploys/67b492bb05358200083dcc41
😎 Deploy Preview: https://deploy-preview-4855--localai.netlify.app

mudler (Owner) previously approved these changes Feb 18, 2025:

nice, thanks!

mudler (Owner) commented Feb 18, 2025

CI seems to fail:

Error: ../backend/options.go:162:25: syntax error: unexpected name c in composite literal; possibly missing comma or }
Error: ../backend/options.go:165:69: syntax error: unexpected ) at end of statement
Error: ../backend/options.go:200:1: syntax error: non-declaration statement outside function body

TheDropZone (Contributor, Author) commented Feb 18, 2025

@mudler Found and fixed a missing colon in the options.go file (also added sign-offs to the commits).

Signed-off-by: TheDropZone <[email protected]>
mudler (Owner) left a comment

Looking good, thanks!

@mudler mudler merged commit 6a6e1a0 into mudler:master Feb 18, 2025
22 of 23 checks passed
@mudler added the enhancement (New feature or request) label and removed the area/ai-model and dependencies labels on Feb 18, 2025