Issues: vllm-project/vllm

[Roadmap] vLLM Roadmap Q4 2024
#9006 opened Oct 1, 2024 by simon-mo
Open 22
vLLM's V1 Engine Architecture
#8779 opened Sep 24, 2024 by simon-mo
Open 9

Issues list

[Bug]: vllm infer for Qwen2-VL-72B-Instruct-GPTQ-Int8 (label: bug)
#10650 opened Nov 26, 2024 by DoctorTar
1 task done
[Feature]: Mixtral manual head_dim (label: feature request)
#10649 opened Nov 26, 2024 by wavy-jung
1 task done
[Bug]: Llama 3.2 90b crash (label: bug)
#10648 opened Nov 26, 2024 by yessenzhar
1 task done
[RFC]: Support KV Cache Compaction (label: RFC)
#10646 opened Nov 25, 2024 by YaoJiayi
1 task done
[Bug]: GPU Memory Accounting Issue with Multiple vLLM Instances (label: bug)
#10643 opened Nov 25, 2024 by brokenlander
1 task done
[Bug]:The parameter gpu_memory_utilization does not take effect (label: bug)
#10637 opened Nov 25, 2024 by liutao053877
1 task done
[Bug]: GPU memory leak when using bad_words feature (label: bug)
#10630 opened Nov 25, 2024 by wsp317
1 task done
[Bug]: Crash with Qwen2-Audio Model in vLLM During Audio Processing (label: bug)
#10627 opened Nov 25, 2024 by jiahansu
1 task done
tracking torch.compile compatibility with lora serving (label: bug)
#10617 opened Nov 25, 2024 by youkaichao
1 task done
[Usage]: Does speculative decoding support pipeline parallelism ? (label: usage)
#10615 opened Nov 25, 2024 by wanghongyu2001
1 task done
tracking torch.compile compatibility with cpu offloading (label: bug)
#10612 opened Nov 25, 2024 by youkaichao
1 task done
[Bug]: GGUF Model Output Repeats Nonsensically (label: bug)
#10600 opened Nov 24, 2024 by Mayflyyh
1 task done
[Bug]: Memory allocation with echo=True (label: bug)
#10596 opened Nov 23, 2024 by ArtemBiliksin
1 task done
[Performance]: Cannot use FlashAttention-2 backend for Volta and Turing GPUs. (label: performance)
#10592 opened Nov 23, 2024 by Weishaoya
1 task done
[Bug]: Duplicate request_id breaks the engine (label: bug)
#10583 opened Nov 22, 2024 by tjohnson31415
1 task done