We are planning a release that will include vLLM 0.6.2 within the next 2 weeks. In the meantime, you can provide a requirements.txt with vllm==0.6.x to pull in a later version of vLLM that way. If you go this route, you should also set the OPTION_ROLLING_BATCH=vllm environment variable to force usage of vLLM.
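As a concrete sketch of that workaround (the 0.6.2 pin mirrors the upcoming release mentioned above, and it assumes the LMI container pip-installs a requirements.txt found in the model artifact directory):

```
# requirements.txt -- shipped alongside your model artifacts;
# the LMI container installs it at startup
vllm==0.6.2
```

Combined with OPTION_ROLLING_BATCH=vllm on the endpoint, this should route requests through the newer vLLM engine; the deployment sketch after the image list below shows one way to set that variable.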
Concise Description:
vLLM v0.6.0 provides up to 2.7x higher throughput and a 5x latency reduction compared to the previous release (v0.5.3).
DLC image/dockerfile:
763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.29.0-lmi11.0.0-cu124
763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.29.0-neuronx-sdk2.19.1
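For completeness, here is a minimal SageMaker deployment sketch against the CUDA image above, assuming the sagemaker Python SDK. The role ARN, model ID, and instance type are placeholders, and OPTION_MODEL_ID follows the LMI convention of mapping OPTION_* environment variables onto serving.properties options:

```python
import sagemaker
from sagemaker.model import Model

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

model = Model(
    image_uri=(
        "763104351884.dkr.ecr.us-west-2.amazonaws.com/"
        "djl-inference:0.29.0-lmi11.0.0-cu124"
    ),
    role=role,
    env={
        "OPTION_MODEL_ID": "your-org/your-model",   # placeholder model ID
        "OPTION_ROLLING_BATCH": "vllm",             # force the vLLM backend, per the comment above
    },
    sagemaker_session=sagemaker.Session(),
)

# Instance type is illustrative; size it for your model.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)
```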
Is your feature request related to a problem? Please describe.
Improve the performance of LMI containers.
Describe the solution you'd like
Update the vLLM library in LMI containers to v0.6.0.