
Improve image processing time #33810

Open
yonigozlan opened this issue Sep 30, 2024 · 3 comments

Comments

@yonigozlan
Member

Feature request

Optimize Transformers' image processors to decrease image processing time and reduce inference latency for vision models and VLMs.

Motivation

The Transformers library relies on PIL (Pillow) for image preprocessing, which can become a major bottleneck during inference, especially with compiled models where the preprocessing time can dominate the overall inference time.

[Figures: inference time breakdown (preprocessing vs. model forward) for RT-DETR and DETR, in eager and compiled modes]

In the examples above, RT-DETR's preprocessing requires only resizing the image, while DETR's involves resize + normalize.
In eager mode, image preprocessing accounts for a large share of the total inference time for RT-DETR, but it is not the main bottleneck. With a compiled RT-DETR, however, image preprocessing takes up the majority of the inference time, underlining the need to optimize it. This is even clearer for DETR, where image preprocessing is already the main bottleneck in eager mode.
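
A rough way to reproduce this kind of breakdown is to time preprocessing and the forward pass separately. A minimal sketch (the checkpoint and image path are placeholders; swap in RT-DETR to compare):

```python
import time

import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

# Placeholder checkpoint and image path, shown for illustration.
checkpoint = "facebook/detr-resnet-50"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForObjectDetection.from_pretrained(checkpoint).eval()

image = Image.open("example.jpg").convert("RGB")

# Time the PIL-based preprocessing alone.
start = time.perf_counter()
inputs = processor(images=image, return_tensors="pt")
preprocess_ms = (time.perf_counter() - start) * 1e3

# Time the model forward pass alone.
start = time.perf_counter()
with torch.no_grad():
    outputs = model(**inputs)
forward_ms = (time.perf_counter() - start) * 1e3

print(f"preprocess: {preprocess_ms:.1f} ms | forward: {forward_ms:.1f} ms")
```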

However, alternative libraries exist that leverage the available hardware more efficiently for faster image preprocessing; see the sketch below.
OptimVision uses such libraries to achieve much better results than Transformers.
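
As one example of what such a library enables, torchvision's `transforms.v2` can express the same resize + normalize pipeline as pure tensor ops, skipping PIL entirely and optionally running on the GPU. A minimal sketch (the size and normalization constants are illustrative defaults, not OptimVision's actual pipeline):

```python
import torch
from torchvision.io import read_image
from torchvision.transforms import v2

# DETR-style resize + normalize, expressed as tensor ops (no PIL).
transforms = v2.Compose([
    v2.Resize((800, 800), antialias=True),
    v2.ToDtype(torch.float32, scale=True),  # uint8 [0, 255] -> float [0, 1]
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

device = "cuda" if torch.cuda.is_available() else "cpu"
img = read_image("example.jpg").to(device)   # decode to a uint8 CHW tensor
pixel_values = transforms(img).unsqueeze(0)  # preprocessing runs on `device`
```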

Much more detail on OptimVision and a comparison of image processing methods are available on this Notion page.

Your contribution

OptimVision is an experimental playground for optimizing the different steps involved in inference and training with vision models.
The current fast image preprocessing in OptimVision is a proof of concept and is not yet ready to be merged into Transformers, but that is the ultimate goal :).

@LysandreJik
Member

Sounds like a good project indeed :)

@Gladiator07
Contributor

Hi, any updates on making Qwen2VLProcessorFast? We are experiencing a major bottleneck because of preprocessing time in offline mode.

@yonigozlan
Member Author

Answered here #34272 (comment)
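
For reference, fast (torchvision-backed) image processors are opted into with `use_fast=True` once a fast variant ships for a given model. A minimal sketch using DETR, whose fast variant already exists (Qwen2-VL's is the one requested above):

```python
from transformers import AutoImageProcessor

# use_fast=True selects the fast, torchvision-backed processor class.
# This assumes the installed transformers version ships a fast variant
# for the checkpoint.
processor = AutoImageProcessor.from_pretrained(
    "facebook/detr-resnet-50", use_fast=True
)
```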
