
GPU RAM Requirements for Formula Detection (do_formula_enrichment) #871

Open
JPC612 opened this issue Feb 3, 2025 · 3 comments
Labels
question Further information is requested

Comments

JPC612 commented Feb 3, 2025

Hi,

I am using docling with an RTX 3090 and encountering a CUDA out-of-memory error when enabling do_formula_enrichment=True. Could you provide information on the expected GPU RAM usage for formula detection? How much memory is typically required to process documents with this setting enabled?
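For context, this is roughly the configuration the question refers to, based on docling's documented `PdfPipelineOptions`. The file name is a placeholder; treat this as a sketch rather than the reporter's exact code:

```python
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.do_formula_enrichment = True  # enables the CodeFormula model

converter = DocumentConverter(
    format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)}
)
result = converter.convert("document.pdf")  # placeholder path
```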

Thanks in advance!

JPC612 added the question (Further information is requested) label Feb 3, 2025
@Matteo-Omenetti
Contributor

Hello,

The CodeFormula model can be quite memory-hungry, and the default batch size of 16 may be too high for some hardware setups. We're currently working on an update that will give each model its own batch size, so that models of different sizes can be tuned independently.

In the meantime, you can manually reduce the batch size by importing and modifying the settings object. Please note that this change applies to all models, not just CodeFormula:

```python
from docling.datamodel.settings import settings

settings.perf.elements_batch_size = 2
```
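If you want to pick the batch size programmatically rather than hard-coding 2, one option is a small helper that clamps the batch size to what plausibly fits in free VRAM. This helper is not part of docling, and the ~1.5 GB per-item figure is an illustrative assumption, not a measured number; profile on your own hardware:

```python
def pick_batch_size(free_vram_gb: float, gb_per_item: float = 1.5, default: int = 16) -> int:
    """Return a batch size no larger than `default` that should fit in free VRAM.

    `gb_per_item` is an assumed per-element memory cost; adjust it after
    profiling the CodeFormula model on your own GPU.
    """
    affordable = int(free_vram_gb // gb_per_item)
    return max(1, min(default, affordable))
```

On an RTX 3090 with other processes also using the GPU, you could query free memory with `torch.cuda.mem_get_info()` (which returns free and total bytes) and feed the free gigabytes into this helper before setting `settings.perf.elements_batch_size`.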

I hope this helps!

@Matteo-Omenetti
Contributor

I will update this issue as soon as the more fine-grained batch size selection feature is released.


JPC612 commented Feb 4, 2025

Thank you!!
