Fitz freezes on some PDFs when calling the fitz.Page.get_text_blocks method. #2548
Comments
The problem occurs when extracting text in general. More precisely, it is an endless loop.
Thanks!
Sorry for the delay, here is the MuPDF bug report: https://bugs.ghostscript.com/show_bug.cgi?id=707074.
The problem has been fixed in MuPDF.
Fixed test failure in rebased. We were not converting fz exception into C++ exception in src/extra.i:page_get_textpage(). Also fixed other cases where we leaked fz exception.
Fixed in 1.23.5.
Describe the bug (mandatory)
Fitz freezes on some PDFs when calling the fitz.Page.get_text_blocks method.
To Reproduce (mandatory)
Download the pdf that causes a freeze.
original https://aacr.figshare.com/articles/journal_contribution/Supplementary_Data_from_Targeting_Therapeutic_Resistance_and_Multinucleate_Giant_Cells_in_CCNE1-Amplified_HR-Proficient_Ovarian_Cancer/22523824/1/files/39986620.pdf
mirror https://www.dropbox.com/s/s7zjp7a8ys5ibh0/mct-21-0873_supplementary_data_s1_supps1.pdf?dl=0
Run the Python code
The program will print
and then freeze.
Additional context (optional)
Reproduces on PyMuPDF==1.22.3 and PyMuPDF==1.22.5, on both macOS 12.6.5 and Ubuntu 20.04.2.
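For anyone who wants to check a PDF for this kind of hang without locking up their own process, here is a minimal sketch that runs the extraction in a child interpreter with a timeout. It is not the original reproduction script from this report. The names `EXTRACT_SCRIPT` and `run_with_timeout` are made up for illustration, and the child script assumes PyMuPDF is installed and importable as `fitz`.

```python
import subprocess
import sys

# Child script (illustrative, not the script from the report): opens the PDF
# given as the first argument and extracts text blocks page by page.
# Assumes PyMuPDF is installed and importable as `fitz`.
EXTRACT_SCRIPT = """
import sys
import fitz  # PyMuPDF

doc = fitz.open(sys.argv[1])
for page in doc:
    print("page", page.number, flush=True)
    page.get_text("blocks")
"""


def run_with_timeout(script, args=(), timeout=30.0):
    """Run `script` in a child Python interpreter.

    Returns (status, stdout), where status is "ok", "error" (non-zero exit
    code) or "timeout" (the child was killed after `timeout` seconds).
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", script, *args],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        # Depending on platform and Python version, the partial output
        # attached to the exception can be bytes even with text=True.
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return "timeout", out
    return ("ok" if proc.returncode == 0 else "error"), proc.stdout
```

Calling `run_with_timeout(EXTRACT_SCRIPT, ["mct-21-0873_supplementary_data_s1_supps1.pdf"], timeout=60)` on an affected version would be expected to return a `"timeout"` status, and the last `page N` line in the captured output indicates the page on which extraction stalled.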