PyMuPDF logging prints to stdout #700
There's something that does not make sense to me: how come you're hitting this issue, given that the offending function is in the Line 64 in 93bf0af
Can you give a bit more info about the environment you are seeing this in?
Also, I've opened an issue in the PyMuPDF repo: pymupdf/PyMuPDF#3135
I was manually looking for the I am running this on Fedora 38 on Qubes. I'll try another environment to see if there are any differences.
And thanks for filing the upstream issue!
Curious that this happened, especially in a point release. Here they have explained that:
But if there isn't even a major version change, how are users supposed to know that a migration is in progress? Users notice things when they break, which is what happened here... Per discussion with @apyrgio we'll be trying to stick with
The following may be a working solution:
diff --git a/Dockerfile b/Dockerfile
index 4b794b2d..b7e53091 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -61,7 +61,8 @@ RUN apk --no-cache -U upgrade && \
py3-magic \
font-noto-cjk
-COPY --from=pymupdf-build /usr/lib/python3.11/site-packages/fitz/ /usr/lib/python3.11/site-packages/fitz
+COPY --from=pymupdf-build /usr/lib/python3.11/site-packages/fitz /usr/lib/python3.11/site-packages/fitz
+COPY --from=pymupdf-build /usr/lib/python3.11/site-packages/fitz_old /usr/lib/python3.11/site-packages/fitz_old
COPY --from=tessdata-dl /usr/share/tessdata/ /usr/share/tessdata
COPY --from=h2orestart-dl /libreoffice_ext/ /libreoffice_ext
diff --git a/dangerzone/conversion/doc_to_pixels.py b/dangerzone/conversion/doc_to_pixels.py
index 5f1abf3b..2bce0cf2 100644
--- a/dangerzone/conversion/doc_to_pixels.py
+++ b/dangerzone/conversion/doc_to_pixels.py
@@ -7,7 +7,10 @@ import shutil
import sys
from typing import Dict, List, Optional, TextIO
-import fitz
+try:
+ import fitz_old as fitz
+except:
+ import fitz
import magic
from . import errors
diff --git a/dangerzone/conversion/pixels_to_pdf.py b/dangerzone/conversion/pixels_to_pdf.py
index 0243858d..e134680f 100644
--- a/dangerzone/conversion/pixels_to_pdf.py
+++ b/dangerzone/conversion/pixels_to_pdf.py
@@ -25,7 +25,10 @@ class PixelsToPDF(DangerzoneConverter):
tempdir = "/safezone"
# XXX lazy loading of fitz module to avoid import issues on non-Qubes systems
- import fitz
+ try:
+ import fitz_old as fitz
+ except:
+ import fitz
num_pages = len(glob.glob(f"{tempdir}/pixels/page-*.rgb"))
total_size = 0.0
Note: do need to keep both @apyrgio what do you think? This solution works both in containers and in Qubes, and it adds only 22MB to the container image (when compressed).
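The fallback import in the diff above uses a bare `except:`, which would also hide unrelated failures raised while importing `fitz_old`. A narrower variant could look like the sketch below. It assumes only that a missing module raises `ImportError`; the helper name `import_first_available` is hypothetical and not part of dangerzone or PyMuPDF.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def import_first_available(names: Sequence[str]) -> ModuleType:
    """Return the first module in `names` that can be imported.

    Only ImportError (including ModuleNotFoundError) triggers the fallback;
    any other exception raised during import propagates to the caller.
    """
    last_error: Optional[ImportError] = None
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError as e:
            last_error = e
    raise ImportError(f"none of {list(names)} could be imported") from last_error


# Hypothetical usage mirroring the diff: prefer the legacy module if present.
# fitz = import_first_available(["fitz_old", "fitz"])
```

With this shape, an environment that ships only `fitz` still works, while a broken `fitz_old` install that fails with something other than `ImportError` is reported instead of being silently skipped.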
There was another suggestion: pinning PyMuPDF to 1.23.8 (the latest is currently 1.23.21), at least for this release. I looked into the PyMuPDF source, and I don't see any significant code changes in the
PyMuPDF 1.23.9 swapped the new fitz implementation (fitz_new) with the fitz module. In the new module there are prints in the code that interfere with our stdout for sending JSON from the container. Pinning the version seems to have no adverse consequences [1], since fitz_old hasn't had significant changes and it gives breathing room for the print-related issue to be tackled in PR [2].
Fixes temporarily #700
[1]: #700 (comment)
[2]: pymupdf/PyMuPDF#3137
PyMuPDF has some hardcoded log messages that print to stdout [1]. We don't have a way to silence them, because they don't use the Python logging infrastructure. What we can do here is silence a particular call that's been creating debug messages. For a long-term solution, we have sent a PR to the PyMuPDF team, and we will follow up there [2].
Fixes #700
[1]: #700
[2]: pymupdf/PyMuPDF#3137
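One generic way to keep a single noisy call from writing to stdout is to redirect `sys.stdout` for the duration of that call. The sketch below is not the change that landed in dangerzone; `call_quietly` is a hypothetical helper, and it assumes the messages come from Python-level `print()` calls (as described above) rather than from C code writing directly to file descriptor 1.

```python
import contextlib
import sys


def call_quietly(func, *args, **kwargs):
    """Run `func`, sending anything it print()s to stderr instead of stdout.

    contextlib.redirect_stdout only swaps the Python-level sys.stdout object,
    so output written by native code straight to fd 1 is NOT affected.
    """
    with contextlib.redirect_stdout(sys.stderr):
        return func(*args, **kwargs)
```

A caller would wrap only the call known to emit debug output, so that the rest of the program keeps writing its JSON or pixel data to the real stdout.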
Unpin the PyMuPDF dependency, now that we have a way to silence its debug logs that have been added in its new `fitz` implementation. Refs #700
While converting a document on the page streaming PR I have just noticed the following:
Here's the line in question. It turns out that PyMuPDF's logging simply prints the error to stdout. Fortunately I haven't seen this in pixels_to_pdf, but if it does happen it will interfere with our page streaming, which assumes that everything on stdout is pixel data.
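To illustrate why a stray message is a problem for a reader that treats the whole stream as pixels, here is a minimal sketch. The framing is an assumption made for illustration (a page is read as exactly `width * height * 3` raw RGB bytes); it is not dangerzone's actual protocol, and `read_page` is a hypothetical helper.

```python
import io


def read_page(stream, width: int, height: int) -> bytes:
    """Read one raw RGB page of exactly width * height * 3 bytes."""
    size = width * height * 3
    data = stream.read(size)
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


pixels = bytes([255, 0, 0] * 4)   # a 2x2 page of pure red
noise = b"MuPDF warning\n"        # stand-in for a printed log line

clean = io.BytesIO(pixels)
polluted = io.BytesIO(noise + pixels)

clean_page = read_page(clean, 2, 2)        # the original pixels
polluted_page = read_page(polluted, 2, 2)  # log text misread as pixels
```

The polluted read succeeds without any error, which is the worrying part: the reader has no way to tell that the first bytes were text, and every following page is shifted by the length of the message.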