[Performance]: qwen2vl very slow when preprocess large image #9238
Comments
The relevant code in transformers:
rescaled_image = image * scale  # takes a long time for large images
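To get a rough feel for how the cost of this step grows with image size, the multiplication can be timed in isolation. The sketch below is not the transformers implementation; it only mimics the `image * scale` line on a random array. The function name `time_rescale`, the `1 / 255` scale factor, and the `float32` output type are assumptions made for illustration.

```python
import time

import numpy as np


def time_rescale(height, width, scale=1 / 255, dtype=np.float32):
    """Time an elementwise rescale on a random RGB image (hypothetical helper)."""
    # Random uint8 image standing in for a decoded picture of the given size.
    image = np.random.randint(0, 256, size=(height, width, 3), dtype=np.uint8)

    start = time.perf_counter()
    # Same shape of operation as `rescaled_image = image * scale`,
    # followed by a cast to the assumed output dtype.
    rescaled = (image * scale).astype(dtype)
    elapsed = time.perf_counter() - start
    return rescaled, elapsed


if __name__ == "__main__":
    for h, w in [(1080, 1920), (4000, 6000)]:
        _, seconds = time_rescale(h, w)
        print(f"{w} x {h}: {seconds:.3f} s")
```

Timings depend heavily on the machine and on what else the process is doing, so the numbers are only useful for comparing image sizes against each other.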
It is recommended to use
I'm not sure how it works. After preprocessing with qwen_vl_utils, the image becomes a tensor, which is then sent to vLLM?
It should resize the images to a size suitable for the model. So if you input an image that is too large, it should be resized.
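The resizing idea above can be sketched as a small size calculation done on the client before the image is sent. This is a minimal illustration, not the logic used by qwen_vl_utils or vLLM: the helper name `fit_within_pixel_budget` and the `max_pixels` parameter are hypothetical, and the real preprocessing may apply additional constraints on the output dimensions.

```python
import math


def fit_within_pixel_budget(width, height, max_pixels):
    """Return (new_width, new_height) with at most max_pixels total pixels.

    The aspect ratio is preserved up to rounding. Images already within
    the budget are returned unchanged. Hypothetical helper for illustration.
    """
    if width <= 0 or height <= 0 or max_pixels <= 0:
        raise ValueError("width, height and max_pixels must be positive")

    if width * height <= max_pixels:
        return width, height

    # Scale both sides by the same factor so the area shrinks to the budget.
    scale = math.sqrt(max_pixels / (width * height))
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height


if __name__ == "__main__":
    print(fit_within_pixel_budget(4000, 3000, 1_000_000))
```

The resulting size could then be passed to an image library's resize call (for example Pillow's `Image.resize`) before the request is made, so the expensive per-pixel steps run on a smaller array.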
Even with 1080p (1920 x 1080) as the maximum image size, preprocessing still takes more than 10 s.
Since the processing time is contained inside HuggingFace, I guess it's not really our fault... even if our engine didn't time out, you would still get very poor performance. Perhaps open an issue on the HuggingFace side? cc @fyabc
Proposal to improve performance
No response
Report of performance regression
Built with the latest vLLM code and started Qwen2-VL-7B-Instruct.
Preprocessing takes so long that it leads to a heartbeat timeout.
ERROR 10-10 01:14:54 client.py:250] RuntimeError('Engine loop has died')
ERROR 10-10 01:14:54 client.py:250] Traceback (most recent call last):
ERROR 10-10 01:14:54 client.py:250] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/client.py", line 150, in run_heartbeat_loop
ERROR 10-10 01:14:54 client.py:250] await self._check_success(
ERROR 10-10 01:14:54 client.py:250] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/client.py", line 314, in _check_success
ERROR 10-10 01:14:54 client.py:250] raise response
ERROR 10-10 01:14:54 client.py:250] RuntimeError: Engine loop has died
ERROR 10-10 01:25:08 client.py:250] TimeoutError('No heartbeat received from MQLLMEngine')
ERROR 10-10 01:25:08 client.py:250] NoneType: None
DEBUG 10-10 01:25:08 client.py:144] Shutting down MQLLMEngineClient check health loop due to timeout
DEBUG 10-10 01:25:14 client.py:170] Waiting for output from MQLLMEngine.
CRITICAL 10-10 01:25:14 launcher.py:99] MQLLMEngine is already dead, terminating server process
Any suggestions to help improve preprocessing performance?
Misc discussion on performance
No response
Your current environment (if you think it is necessary)