suppressing warnings and other messages that gets printed on stdout by PyMuPDF #209
Comments
These messages are issued directly by the underlying C library MuPDF, not by the wrapper code (i.e. me, PyMuPDF). Currently, I am working on the new v1.14.0. I will check again whether they are now providing a proper circumvention. So please check again once I publish PyMuPDF 1.14.0 in the next few weeks. |
@JorjMcKie Thanks for your quick response. I was aware these messages are issued by MuPDF but I thought you may know a workaround for this problem. I really appreciate your efforts. |
I am close to a clean compile of PyMuPDF v1.14.0 -- another day or so. What I already know w/r to this issue: |
Did some more research with the new v1.14.0. |
Is it possible to create an issue for MuPDF developers to suppress this kind of message? It's annoying to have a lot of unwanted lines on stdout. |
Well, we could do that. But I'm afraid they have lots of other things to do, so they very probably will treat this as nice-to-have or, worse, as mannerism. As I indicated above, there certainly is a brute-force alternative: replacing the MuPDF error module ( In addition, this approach would not get rid of every single direct writing to system stderr: there are about 100 other places, where MuPDF directly outputs to stderr and does not make use of their own However: |
I have been experimenting a bit: In this version I am redirecting MuPDF warnings and many errors to Please do try and tell me what you think! |
@cquark7 - forgot to mention you, sorry. |
You have talked me into making some changes to MuPDF error / warning message handling ... both of you. I am intercepting MuPDF's
Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:30:26) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import fitz
>>> doc = fitz.open("acronis.xps") # some XPS document
>>> fitz.TOOLS.fitz_stderr # message store is still empty
u''
>>> pdfbytes = doc.convertToPDF() # convert XPS to PDF
>>> fitz.TOOLS.fitz_stderr # and look at the message store:
u'warning: freetype getting character advance: invalid glyph index\n'
>>> fitz.TOOLS.fitz_stderr_reset() # empty the message store
>>> fitz.TOOLS.fitz_stderr # and prove it
u''
>>> doc.close() # try another document: SVG this time
>>> doc = fitz.open("acronis.svg")
>>> fitz.TOOLS.fitz_stderr # still no complaints?
u''
>>> pdfbytes = doc.convertToPDF() # convert that one too
>>> fitz.TOOLS.fitz_stderr # and see what would have gone to system STDERR
u'warning: ... repeated 3 times ...\nwarning: push viewport: 0 0 594.75 841.5\nwarning: push viewbox: 0 0 594.75 841.5\nwarning: push viewport: 0 0 594.75 841.5\nwarning: ... repeated 2 times ...\nwarning: push viewport: 0 0 980 71\nwarning: push viewport: 0 0 594.75 841.5\nwarning: ... repeated 2512 times ...\nwarning: push viewport: 0 0 112 33\nwarning: push viewport: 0 0 594.75 841.5\nwarning: ... repeated 2 times ...\nwarning: push viewport: 0 0 181 120\nwarning: push viewport: 0 0 94 54\nwarning: ... repeated 2 times ...\nwarning: push viewport: 0 0 130 88\nwarning: ... repeated 2 times ...\nwarning: push viewport: 0 0 181 115\nwarning: push viewport: 0 0 594.75 841.5\n'
>>>
I think this is the best achievable solution. As I said in previous posts:
I will be generating the wheels within the next hour or so to https://github.com/JorjMcKie/PyMuPDF-wheels. |
Just uploaded the new v1.14.0 which implements the issue resolution. |
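For anyone who wants to wrap the message store shown in the session above, here is a minimal sketch of a context manager that empties the store, runs some code, and hands back whatever MuPDF wrote in the meantime. It relies only on `fitz_stderr` and `fitz_stderr_reset()` as they appear in the transcript; the helper name `captured_mupdf_messages` and the `FakeTools` stand-in are hypothetical and exist only so the example runs without PyMuPDF installed.

```python
from contextlib import contextmanager


@contextmanager
def captured_mupdf_messages(tools):
    """Collect the messages stored while the with-block runs.

    `tools` is assumed to behave like fitz.TOOLS in v1.14.0: a
    `fitz_stderr` string attribute plus a `fitz_stderr_reset()` method.
    """
    tools.fitz_stderr_reset()          # start from an empty store
    messages = []                      # filled in when the block exits
    try:
        yield messages
    finally:
        messages.extend(tools.fitz_stderr.splitlines())
        tools.fitz_stderr_reset()      # leave the store empty again


class FakeTools:
    """Hypothetical stand-in for fitz.TOOLS, used only for this demo."""

    def __init__(self):
        self.fitz_stderr = ""

    def fitz_stderr_reset(self):
        self.fitz_stderr = ""


tools = FakeTools()
with captured_mupdf_messages(tools) as msgs:
    # With the real library this would be a call such as doc.convertToPDF().
    tools.fitz_stderr += "warning: freetype getting character advance: invalid glyph index\n"

print(msgs)
```

With the real library you would pass `fitz.TOOLS` instead of `FakeTools()`. Note that the list is only filled once the `with` block has ended.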
@JorjMcKie Thank you very much for this update! It's working as expected. :) I've tested with 3 different PDFs (each one outputs different warnings). The test code is:
# testwarning.py
import sys
import fitz

def pdf_to_text(filename):
    doc = fitz.open(filename, filetype="pdf")
    text = []
    for page_number in range(doc.pageCount):
        page = doc.loadPage(page_number)
        page_text = '\n'.join(block[4] for block in page.getTextBlocks())
        text.append(page_text)
    return '\n'.join(text)

pdf_to_text(sys.argv[1])
Output with PyMuPDF==1.13.20:
Output with PyMuPDF-1.14.0-cp37-cp37m-manylinux1_x86_64.whl (it's not available on PyPI) - no output expected:
The PDFs are available for download, if you'd like to test: Could you please upload this new version to PyPI? Thanks again! |
pleased to hear that! |
Hi! I'm getting "mupdf: invalid page object" printed to the console when opening PDFs. Are these among the "errors" rather than warnings that have been kept as is? Is it possible to reroute them to fitz_stderr instead? |
No, this is an error, not a warning. In broken PDFs this may happen, when a dictionary object does not conform to a page dictionary. You might however still be able to work with the PDF, but consequential other errors may occur. It is usually still possible to extract e.g. images or fonts if looping over the xrefs (and not the pages).
Yes, there is an option to switch off or on MuPDF error message output via |
gives me currently on PyMuPDF v 1.16.11 |
Weird:
or
|
The method was new in v1.16.8 |
solved, pardon my versioning ignorance :) |
Nothing to forgive - everything fine. |
PyMuPDF prints a lot of warnings and error messages on STDOUT while parsing PDF documents (especially while extracting images). I am looking for a way to suppress or redirect the messages that get printed on STDOUT.
Example warnings/messages:
These messages are quite annoying and serve no purpose (at least for my use case). I get more than 100 warnings just for a single PDF file.
I tried the methods described in "How do I prevent a C shared library to print on stdout in python?", but they do not work with PyMuPDF, so please suggest something.
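For context, methods of that kind usually work at the operating-system file descriptor level rather than by replacing `sys.stdout` or `sys.stderr`, because a C library does not write through Python's stream objects. The following is a minimal sketch of that general idea, not necessarily the exact code tried here; the helper name `redirect_fd` and the log file name are hypothetical, and `os.write` merely stands in for output coming from C code.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def redirect_fd(fd, target_path):
    """Temporarily point OS-level file descriptor `fd` at `target_path`."""
    saved = os.dup(fd)                          # remember the original destination
    target = os.open(target_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    try:
        os.dup2(target, fd)                     # fd now writes to target_path
        yield
    finally:
        os.dup2(saved, fd)                      # restore the original destination
        os.close(saved)
        os.close(target)


# Demo: os.write bypasses sys.stderr, much like C code writing to fd 2.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "mupdf.log")   # hypothetical file name
    sys.stderr.flush()                          # flush Python-side output first
    with redirect_fd(2, log_path):
        os.write(2, b"warning: example message\n")
    with open(log_path) as fh:
        captured = fh.read()

print(repr(captured))
```

Passing `os.devnull` as `target_path` discards the output instead of saving it. Whether this catches everything depends on how the C library buffers and where it writes, which may explain why it did not help in this case.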