[VLM] Fully dynamic prompt replacement in merged input processor #11199
Conversation
Signed-off-by: DarkLight1337 <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
if strict:
    return key in self.data
This fixes false-positive warnings emitted when using the Mantis model, which arose because both Mantis and its base class (Llava) have corresponding items in the registry.
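A minimal sketch of the idea behind the `strict` check, assuming a registry keyed by model class (names here are illustrative, not vLLM's actual API): in strict mode, membership is an exact-class lookup, so a subclass such as Mantis does not also match the entry registered for its base class (Llava) and trigger a spurious duplicate warning.

```python
class ProcessorRegistry:
    """Hypothetical registry mapping a model class to a processor factory."""

    def __init__(self):
        self.data = {}  # model class -> processor factory

    def register(self, model_cls, factory):
        self.data[model_cls] = factory

    def contains(self, model_cls, *, strict=False):
        if strict:
            # Exact match only: no walk over the class's MRO, so an entry
            # registered for a base class is not reported for a subclass.
            return model_cls in self.data
        # Lenient mode: also match any registered base class.
        return any(base in self.data for base in model_cls.__mro__)
```

With only `Llava` registered, `contains(Mantis)` is true (lenient fallback to the base class) while `contains(Mantis, strict=True)` is false, so a duplicate-entry warning fires only when the exact class really has its own entry.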
LGTM! Thanks for fixing this!
I found that the processor in this PR doesn't work for Phi-3-Vision (though it passes for Phi-3.5-Vision). Looks like there is still a need to compute the number of image tokens according to the image...
Thanks so much for the fix
This PR enables the prompt replacement sequence (which can be text or a list of token IDs) to be fully computed based on the input. It also improves the placeholder search logic to be able to match the exact placeholder tokens for each multi-modal input. Furthermore, the input processor is now applied automatically when generating the dummy data, so developers only need to specify the raw input multi-modal data instead of the processed data.
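The "fully computed based on the input" part can be sketched as follows. This is an illustrative simplification, not vLLM's actual `PromptReplacement` API: the key point is that the replacement may be a callable invoked per item, so the number of placeholder tokens can depend on each image rather than being a fixed constant.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class PromptReplacement:
    """Replace `target` with a fixed string, or with the result of a
    callable that receives the item index at processing time."""
    target: str
    replacement: Union[str, Callable[[int], str]]


def apply_replacement(prompt: str, repl: PromptReplacement) -> str:
    parts = prompt.split(repl.target)
    out = parts[0]
    for i, part in enumerate(parts[1:]):
        # Compute the replacement dynamically for the i-th occurrence.
        r = repl.replacement(i) if callable(repl.replacement) else repl.replacement
        out += r + part
    return out


# Each <image> expands to a different number of tokens, e.g. based on size:
sizes = [2, 5]
repl = PromptReplacement("<image>", lambda i: "<tok>" * sizes[i])
apply_replacement("a <image> b <image> c", repl)
# -> "a <tok><tok> b <tok><tok><tok><tok><tok> c"
```

This is what lets models whose per-image token count varies with the input (e.g. Pixtral-HF, Phi-3-Vision) be handled by the same merged processor.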
This fixes an issue in Pixtral-HF preprocessing where the `PromptReplacement` is incorrect.
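The improved placeholder search mentioned in the description can be sketched like this (a hedged illustration, not vLLM's actual search code): each multi-modal item has its own expected placeholder token sequence, and the search locates each sequence exactly, in order, within the processed prompt's token IDs.

```python
def find_placeholder_spans(token_ids, placeholders):
    """Return the (start, end) span occupied by each item's exact
    placeholder token sequence, scanning left to right."""
    spans = []
    start = 0
    for ph in placeholders:  # one expected token sequence per input item
        n = len(ph)
        for i in range(start, len(token_ids) - n + 1):
            if token_ids[i:i + n] == ph:
                spans.append((i, i + n))
                start = i + n  # subsequent items must appear after this span
                break
        else:
            raise ValueError(f"placeholder {ph} not found")
    return spans


find_placeholder_spans([1, 9, 9, 2, 9, 9, 9, 3], [[9, 9], [9, 9, 9]])
# -> [(1, 3), (4, 7)]
```

Matching the exact token run per item is what allows two inputs with different placeholder lengths (here 2 and 3 tokens) to be located unambiguously in the same prompt.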