
[Performance]: qwen2vl very slow when preprocess large image #9238

Closed
zhjunqin opened this issue Oct 10, 2024 · 6 comments · Fixed by #9818
Labels
performance Performance-related issues

Comments

@zhjunqin

Proposal to improve performance

No response

Report of performance regression

Built with the latest vLLM code and started Qwen2-VL-7B-Instruct.


Preprocessing takes too long, which leads to a heartbeat timeout.

ERROR 10-10 01:14:54 client.py:250] RuntimeError('Engine loop has died')
ERROR 10-10 01:14:54 client.py:250] Traceback (most recent call last):
ERROR 10-10 01:14:54 client.py:250] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/client.py", line 150, in run_heartbeat_loop
ERROR 10-10 01:14:54 client.py:250] await self._check_success(
ERROR 10-10 01:14:54 client.py:250] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/client.py", line 314, in _check_success
ERROR 10-10 01:14:54 client.py:250] raise response
ERROR 10-10 01:14:54 client.py:250] RuntimeError: Engine loop has died

ERROR 10-10 01:25:08 client.py:250] TimeoutError('No heartbeat received from MQLLMEngine')
ERROR 10-10 01:25:08 client.py:250] NoneType: None
DEBUG 10-10 01:25:08 client.py:144] Shutting down MQLLMEngineClient check health loop due to timeout
DEBUG 10-10 01:25:14 client.py:170] Waiting for output from MQLLMEngine.
CRITICAL 10-10 01:25:14 launcher.py:99] MQLLMEngine is already dead, terminating server process

Any suggestions to help improve preprocessing performance?

Misc discussion on performance

No response

Your current environment (if you think it is necessary)

The output of `python collect_env.py`

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom right corner of the documentation page, which can answer many frequently asked questions.
@zhjunqin zhjunqin added the performance Performance-related issues label Oct 10, 2024
@zhjunqin
Author

The relevant code in transformers:
https://github.com/huggingface/transformers/blob/main/src/transformers/image_transforms.py#L97

def rescale(
    image: np.ndarray,
    scale: float,
    data_format: Optional[ChannelDimension] = None,
    dtype: np.dtype = np.float32,
    input_data_format: Optional[Union[str, ChannelDimension]] = None,
) -> np.ndarray:
    """
    Rescales `image` by `scale`.

    Args:
        image (`np.ndarray`):
            The image to rescale.
        scale (`float`):
            The scale to use for rescaling the image.
        data_format (`ChannelDimension`, *optional*):
            The channel dimension format of the image. If not provided, it will be the same as the input image.
        dtype (`np.dtype`, *optional*, defaults to `np.float32`):
            The dtype of the output image. Defaults to `np.float32`. Used for backwards compatibility with feature
            extractors.
        input_data_format (`ChannelDimension`, *optional*):
            The channel dimension format of the input image. If not provided, it will be inferred from the input image.

    Returns:
        `np.ndarray`: The rescaled image.
    """
    if not isinstance(image, np.ndarray):
        raise TypeError(f"Input image must be of type np.ndarray, got {type(image)}")

    rescaled_image = image * scale  # takes a long time for large images
    if data_format is not None:
        rescaled_image = to_channel_dimension_format(rescaled_image, data_format, input_data_format)

    rescaled_image = rescaled_image.astype(dtype)

    return rescaled_image

rescaled_image = image * scale  # takes a long time for large images
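The slow multiply is likely explained by NumPy's type promotion: multiplying a `uint8` image by a Python `float` upcasts the whole array to `float64`, allocating a large 8-byte-per-element intermediate that is only cast down to `float32` afterwards. A minimal sketch of the effect (assuming NumPy's current NEP 50 promotion rules; the 1080p shape is just for illustration, and this is not the actual transformers fix):

```python
import numpy as np

# A 1080p RGB image as decoded from a JPEG (uint8), shape chosen for illustration.
image = np.zeros((1080, 1920, 3), dtype=np.uint8)
scale = 1 / 255

# uint8 * Python float promotes to float64: a full-size 8-byte-per-element
# intermediate is allocated, then cast down to float32 afterwards.
slow = (image * scale).astype(np.float32)

# Casting to float32 first keeps the multiply (and the intermediate) in float32.
fast = image.astype(np.float32) * np.float32(scale)

assert (image * scale).dtype == np.float64
assert fast.dtype == np.float32
assert np.allclose(slow, fast)
```

Doing the cast before the multiply roughly halves the memory traffic of the intermediate array, which is where most of the time goes for large images.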

@DarkLight1337
Member

DarkLight1337 commented Oct 10, 2024

It is recommended to use qwen_vl_utils (as shown here) to preprocess the images before passing them into vLLM.

@zhjunqin
Author

It is recommended to use qwen_vl_utils (as shown here) to preprocess the images before passing them into vLLM.

I'm not sure how it works. After preprocessing with qwen_vl_utils, does the image become a tensor that is then sent to vLLM?

@DarkLight1337
Member

It should resize the images to be suitable to be used by the model. So, if you input an image that is too large, it should be resized.
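For reference, the kind of resizing involved can be sketched as follows. This is a hypothetical reimplementation, not the actual `qwen_vl_utils` code; the patch-size factor of 28 and the pixel budgets are assumed defaults:

```python
import math

def smart_resize_sketch(height: int, width: int, factor: int = 28,
                        min_pixels: int = 56 * 56,
                        max_pixels: int = 14 * 14 * 4 * 1280) -> tuple[int, int]:
    """Round dimensions to multiples of `factor`, keeping the pixel count
    within [min_pixels, max_pixels] while roughly preserving aspect ratio."""
    h = max(factor, round(height / factor) * factor)
    w = max(factor, round(width / factor) * factor)
    if h * w > max_pixels:
        # Shrink both sides by the same ratio, then round down to the factor.
        beta = math.sqrt((height * width) / max_pixels)
        h = math.floor(height / beta / factor) * factor
        w = math.floor(width / beta / factor) * factor
    elif h * w < min_pixels:
        # Grow both sides by the same ratio, then round up to the factor.
        beta = math.sqrt(min_pixels / (height * width))
        h = math.ceil(height * beta / factor) * factor
        w = math.ceil(width * beta / factor) * factor
    return h, w

# A 6000 x 4000 photo is shrunk well under the pixel budget before the
# expensive per-pixel preprocessing ever runs.
print(smart_resize_sketch(4000, 6000))  # → (812, 1204)
```

Shrinking the image on the client first means the slow per-pixel rescale only ever sees the bounded number of pixels the model can actually use.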

@zhjunqin
Author

It should resize the images to be suitable to be used by the model. So, if you input an image that is too large, it should be resized.

Taking 1080p (1920 x 1080) as the upper limit on picture size, it still takes more than 10 s to preprocess.

@DarkLight1337
Member

Since the processing time is spent inside HuggingFace code, I guess it's not really our fault... even if our engine didn't time out, you would still get very poor performance. Perhaps open an issue on the HuggingFace side? cc @fyabc
