Pull requests: vllm-project/vllm

Pull requests list

[v1][WIP] Metrics & Stats prototype
#10651 opened Nov 26, 2024 by rickyyx Draft
[Core] Integrate Fastsafetensor loader for loading model weights (labels: ci/build, documentation)
#10647 opened Nov 26, 2024 by manish-sethi Draft
Check bnb_4bit_quant_storage for bitsandbytes
#10642 opened Nov 25, 2024 by mgoin
[fix] Correct num_accepted_tokens counting (label: ready)
#10604 opened Nov 24, 2024 by KexinFeng
[Misc]Further reduce BNB static variable
#10597 opened Nov 24, 2024 by jeejeelee Draft
2 tasks
[Interleaved ATTN] Support for Mistral-8B
#10591 opened Nov 23, 2024 by patrickvonplaten
[Kernel] Remove hard-dependencies of Speculative decode to CUDA workers (label: ready)
#10587 opened Nov 23, 2024 by xuechendi
[V1] Refactor model executable interface for multimodal models
#10570 opened Nov 22, 2024 by ywang96
14 tasks done
[Docs] Add dedicated tool calling page to docs (label: documentation)
#10554 opened Nov 21, 2024 by mgoin
Add Sageattention backend
#10532 opened Nov 21, 2024 by flozi00