Using .DOCX format in cloud - suggestion on the below error? #410
Comments
@acsankar I want to help you here, but I think we need a bit more context. Can you give us the full stack trace?
Thanks for the reply. The whole stack trace is below.
<tempfile._TemporaryFileWrapper object at 0x7bfe9c3b00d0>
I think the most likely problem here is that the Word file includes an image whose blob can't be loaded by the PIL library. @acsankar, any chance you could provide an example file that reproduces this error?
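To check whether a given file is likely to be affected, one option is to list the metafile images embedded in it. The following is a minimal sketch, not part of Docling: the helper name `find_metafile_images` is hypothetical, and it assumes the usual Office Open XML layout, in which a DOCX or PPTX file is a ZIP archive with its media stored under `word/media/` or `ppt/media/`.

```python
import zipfile

# Metafile extensions that Pillow generally can identify but not load
# outside Windows (assumption based on the WMF error in this thread).
METAFILE_EXTENSIONS = (".wmf", ".emf")


def find_metafile_images(path_or_file):
    """Return the names of WMF/EMF media entries inside a DOCX or PPTX file.

    Hypothetical helper for illustration. Accepts a filesystem path or a
    binary file-like object, as zipfile.ZipFile does.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if "/media/" in name and name.lower().endswith(METAFILE_EXTENSIONS)
        )
```

Calling `find_metafile_images("report.docx")` (the file name is a placeholder) returns an empty list when no WMF or EMF images are embedded. A non-empty result points at the entries that are likely to trigger the loader error.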
Draft of the PR that should fix the error: #432
@acsankar The fresh release, Docling 2.7.1, includes fixes!
This error also occurs in PPTX, and I am currently using this version:
The error message is:
I believe you might need to make such considerations for files of all formats. 🤣
I am trying to use this in the cloud and just want to convert the document to Markdown without images. I assume the error below occurs when there are images in the document. Any suggestions on how to fix this?
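from docling.datamodel.base_models import InputFormat
from docling.document_converter import DocumentConverter, WordFormatOption
from docling.pipeline.simple_pipeline import SimplePipeline

# Restrict the converter to DOCX input and use the simple (non-PDF) pipeline.
doc_converter = DocumentConverter(
    allowed_formats=[InputFormat.DOCX],
    format_options={
        InputFormat.DOCX: WordFormatOption(pipeline_cls=SimplePipeline),
    },
)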
I am getting the error below:
---> 30 result = doc_converter.convert(temp_file.name)

18 frames
/usr/local/lib/python3.10/dist-packages/PIL/ImageFile.py in load(self)
    375         if loader is None:
    376             msg = f"cannot find loader for this {self.format} file"
--> 377             raise OSError(msg)
    378         image = loader.load(self)
    379         assert image is not None

OSError: cannot find loader for this WMF file