Recursion error when using clone_from of PdfWriter on PDF 2.0 specification
#2839
Comments
I see the same behavior on Windows with Python 3.10.5. However, after upgrading to Python 3.13 (standard), the file loads successfully when the recursion limit is set to 5000. On Python 3.10, this instead crashes with a stack overflow.
@stefan6419846
I do not think we should convert this into a discussion, as this is surely a bug or limitation. Is there any reason why this fails for the writer, but not for the reader? In any case, I recommend documenting the reason for this in our docs and proposing possible workarounds, such as increasing the recursion limit (with an example) or splitting large documents beforehand.
Yes: the reader only reads, loads, and caches objects in memory when they are required. In the current design, the PdfWriter pulls in and clones the root object and all linked objects recursively.
I would therefore propose adding the following to the documentation:
Allow loading huge files (closes py-pdf#2839)
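The recursion-limit workaround discussed in the comments above could be sketched roughly as follows. This is only a minimal illustration, not the project's documented recommendation: the helper names recursion_limit and clone_pdf are hypothetical, the default of 5000 is simply the value reported to work in this thread, and a higher limit can still end in a hard stack overflow on some interpreters, as reported for Python 3.10.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the interpreter's recursion limit (never lowers it)."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        # Restore the original limit once the deep recursion is done.
        sys.setrecursionlimit(previous)


def clone_pdf(src_path, dst_path, limit=5000):
    """Hypothetical helper: clone a PDF with a raised recursion limit.

    5000 is the value reported to work in this thread; it is an
    assumption, not a guaranteed safe setting for every document.
    """
    # Imported lazily so recursion_limit() stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    with recursion_limit(limit):
        writer = PdfWriter(clone_from=PdfReader(src_path))
        with open(dst_path, "wb") as fh:
            writer.write(fh)
```

Restoring the previous limit afterwards keeps the rest of the program protected against runaway recursion; only the cloning step runs with the relaxed setting.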
Environment
The version is effectively the latest code on main.
Code + PDF
This is a minimal, complete example that shows the issue:
Using PdfReader and iterating over the pages to extract the text does not fail. I cannot share the document (1003 pages) here, as it is the non-public copy of the PDF 2.0 specification, which is available for free at https://pdfa.org/sponsored-standards/
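One way to picture why reading page by page can succeed while a recursive clone fails is the toy model below. It is not pypdf's actual implementation: the chain of dicts merely stands in for PDF objects that reference each other, and all function names are made up for illustration. It shows only that an eager recursive copy needs one stack frame per linked object, whereas visiting objects one at a time does not.

```python
import sys


def make_chain(length):
    """Build a linked chain of dicts: {"next": {"next": ... None}}."""
    head = None
    for _ in range(length):
        head = {"next": head}
    return head


def clone_recursive(node):
    """Eager deep copy: uses one Python stack frame per linked object."""
    if node is None:
        return None
    return {"next": clone_recursive(node["next"])}


def walk_one_by_one(node):
    """Visit objects on demand, one at a time, without nested calls."""
    count = 0
    while node is not None:
        count += 1
        node = node["next"]
    return count


# A chain deeper than the current recursion limit.
chain = make_chain(sys.getrecursionlimit() * 2)

visited = walk_one_by_one(chain)  # works regardless of chain depth

try:
    clone_recursive(chain)
    clone_failed = False
except RecursionError:
    clone_failed = True
```

In this model, raising the recursion limit or making the chain shorter (the analogue of splitting the document beforehand) would both let the recursive clone finish.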
Traceback
This is the complete traceback I see: