feat(vllm): Additional vLLM config options (Disable logging, dtype, and Per-Prompt media limits) #4855
Conversation
nice, thanks!
CI seems to fail.
Looking good, thanks!
Description
This allows for additional configuration of the vLLM backend, specifically the following options (see https://docs.vllm.ai/en/latest/serving/engine_args.html):
- disabling logging
- dtype
- limit_mm_per_prompt, a nested object with per-modality limits for image, video, and audio inputs; see below (and the commit) for an example of bumping the default limit of 1.

The resulting vLLM-based config file would look something like the following (skipping existing config):
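(A minimal sketch, assuming the LocalAI field names mirror the vLLM engine arguments disable_log_stats, dtype, and limit_mm_per_prompt; the model name, path, and limit values are illustrative, and the commit in this PR is the authoritative reference for the exact names.)

```yaml
# Hypothetical LocalAI model config exercising the new vLLM options.
# Field names are assumed to mirror the vLLM engine args; check the
# commit in this PR for the exact names LocalAI expects.
name: internvl2_5-awq                   # illustrative model name
backend: vllm
parameters:
  model: OpenGVLab/InternVL2_5-8B-AWQ   # illustrative model path

# Disable vLLM engine stats logging
disable_log_stats: true

# Data type for weights and activations
dtype: float16

# Per-prompt multi-modal input limits (vLLM defaults to 1 per modality)
limit_mm_per_prompt:
  image: 3
  video: 0
  audio: 0
```

With limits like these, a single prompt could include up to three images but no video or audio attachments.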
Notes for Reviewers
I am currently working on getting this set up locally to fully test these changes. I followed the local dev setup notes but ran out of disk space on my machine (so I will need to clean up some space first). The changes were made on my Docker setup (localai:v2.25.0-cublas-cuda12) to enable running InternVL 2.5 AWQ via vLLM, which required the config options included here.
I am glad to rename or reformat the config variables as needed; just let me know.