Cannot import name 'PipelineOptions' from 'docling.datamodel.base_models' #213

Closed
abedkhooli opened this issue Nov 3, 2024 · 2 comments · Fixed by #226
@abedkhooli

I tried the example from Installation (Alternative OCR Engines) and got:
cannot import name 'PipelineOptions' from 'docling.datamodel.base_models'
Then I tried https://ds4sd.github.io/docling/examples/custom_convert/ (PyPdfium with EasyOCR) and got:
ValueError: "PipelineOptions" object has no field "do_ocr"

Is there an updated end-to-end example of running PDF conversion with EasyOCR?

@dolfim-ibm dolfim-ibm self-assigned this Nov 4, 2024
@dolfim-ibm
Contributor

Thanks for the report. We will post a fix soon.

In short, what you need is something like this:

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.backend.pypdfium2_backend import PyPdfiumDocumentBackend
from docling.document_converter import DocumentConverter, PdfFormatOption

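# PDF-specific options live on PdfPipelineOptions (imported above from
# docling.datamodel.pipeline_options).
pipeline_options = PdfPipelineOptions()
pipeline_options.do_ocr = True
pipeline_options.do_table_structure = True
pipeline_options.table_structure_options.do_cell_matching = True

# Attach the options and the PyPdfium backend to the PDF input format.
doc_converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options,
            backend=PyPdfiumDocumentBackend,
        )
    }
)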
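
To round this out into the end-to-end example asked for above, here is a minimal sketch that wraps the same configuration in a function and converts one file to Markdown. The helper names `build_pdf_converter` and `convert_pdf_to_markdown` are made up for illustration and are not part of docling. It also assumes that `convert()` accepts a path and that the result exposes `document.export_to_markdown()`, and that EasyOCR is the OCR engine used when no other engine is configured; check both against the docs for your installed version.

```python
from pathlib import Path


def build_pdf_converter():
    """Build a DocumentConverter with the PDF options from the snippet above.

    Hypothetical helper for illustration; not part of docling.
    """
    # Imported lazily so that importing this module stays cheap;
    # docling pulls in heavy model dependencies.
    from docling.backend.pypdfium2_backend import PyPdfiumDocumentBackend
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    pipeline_options = PdfPipelineOptions()
    pipeline_options.do_ocr = True
    pipeline_options.do_table_structure = True
    pipeline_options.table_structure_options.do_cell_matching = True

    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options,
                backend=PyPdfiumDocumentBackend,
            )
        }
    )


def convert_pdf_to_markdown(pdf_path):
    """Convert a single PDF and return its content as a Markdown string.

    Hypothetical helper for illustration; not part of docling.
    """
    path = Path(pdf_path)
    # Fail early with a clear error before loading any models.
    if not path.is_file():
        raise FileNotFoundError(f"No such PDF: {path}")

    # Assumption: convert() takes a path and returns a result whose
    # .document can be exported with export_to_markdown().
    result = build_pdf_converter().convert(path)
    return result.document.export_to_markdown()
```

Usage would then be something like `print(convert_pdf_to_markdown("report.pdf"))`, where `report.pdf` is a placeholder for your own file.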

@abedkhooli
Author

Thanks. That worked - at least on the logic side.
