vllm-project / vllm Public

Fork 4.7k
Star 30.8k

Pull requests: vllm-project/vllm

380 Open 4,471 Closed

Pull requests list

[v1][WIP] Metrics & Stats prototype

#10651 opened Nov 26, 2024 by rickyyx • Draft

[Core] Integrate Fastsafetensor loader for loading model weights (labels: ci/build, documentation)

#10647 opened Nov 26, 2024 by manish-sethi • Draft

Check bnb_4bit_quant_storage for bitsandbytes

#10642 opened Nov 25, 2024 by mgoin

[V1] VLM - Support running the mm_mapper preprocessor in the frontend process (labels: frontend, needs-rebase)

#10640 opened Nov 25, 2024 by alexm-neuralmagic

[Frontend] don't block event loop in tokenization (preprocess) in OpenAI compatible server (labels: frontend)

#10635 opened Nov 25, 2024 by tomeras91

[Misc] Allow LoRA to adaptively increase rank and remove possible_max_ranks

#10623 opened Nov 25, 2024 by JinhyunBang

[core] improve cpu offloading implementation

#10609 opened Nov 24, 2024 by youkaichao • Draft

[Core][Bugfix] Use correct device to initialize GPU data during CUDA-graph-capture

#10608 opened Nov 24, 2024 by IdoAsraff

[fix] Correct num_accepted_tokens counting (labels: ready)

#10604 opened Nov 24, 2024 by KexinFeng

[Misc] Further reduce BNB static variable

#10597 opened Nov 24, 2024 by jeejeelee • Draft

2 tasks

[Interleaved ATTN] Support for Mistral-8B

#10591 opened Nov 23, 2024 by patrickvonplaten

[Kernel] Remove hard-dependencies of Speculative decode to CUDA workers (labels: ready)

#10587 opened Nov 23, 2024 by xuechendi

[WIP] V1 LoRA support (labels: needs-rebase)

#10579 opened Nov 22, 2024 by varun-sundar-rabindranath • Draft

[Core] Update to outlines > 0.1.4 (labels: ci/build)

#10576 opened Nov 22, 2024 by russellb • Draft

[ Kernels ] [ AMD ] Add Fused MoE Configs

#10574 opened Nov 22, 2024 by robertgshaw2-neuralmagic • Draft

[V1] Refactor model executable interface for multimodal models

#10570 opened Nov 22, 2024 by ywang96

14 tasks done

[Hardware][Intel-Gaudi] Enable LoRA support for Intel Gaudi (HPU)

#10565 opened Nov 22, 2024 by SanjuCSudhakaran

[Model] Added GLM-4 series hf format model support vllm==0.6.4

#10561 opened Nov 22, 2024 by sixsixcoder

[Benchmark] Benchmark structured output with datasets

#10557 opened Nov 22, 2024 by xuechendi

[Docs] Add dedicated tool calling page to docs (labels: documentation)

#10554 opened Nov 21, 2024 by mgoin

[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server (labels: frontend)

#10546 opened Nov 21, 2024 by angkywilliam

[Distributed] Tensor Parallel RMSNorm

#10542 opened Nov 21, 2024 by tlrmchlsmth • Draft

Add Sageattention backend

#10532 opened Nov 21, 2024 by flozi00

[core] overhaul memory profiling and fix backward compatibility (labels: needs-rebase)

#10511 opened Nov 21, 2024 by youkaichao

Turn on V1 for H200 build (labels: ci/build, perf-benchmarks)

#10505 opened Nov 21, 2024 by simon-mo
