pdf.getDocumentInfo().title sometimes None #511
Comments
@MartinThoma is this now resolved? Or can it not be reproduced because the other project appears to have been removed? I might have the original bug script around if that would be helpful.
#13 is the issue you've linked to. We moved PyPDF2 from mstamy2 -> me (MartinThoma) -> py-pdf (a GitHub organization) this week. As PyPDF2 has been inactive for a long time, I need to clean up a lot of things (PRs, issues, code, documentation, ...). I'm sorry that I didn't add more details. Could you please check if the issue still persists? If so, please create an MCVE (including a PDF): https://stackoverflow.com/help/minimal-reproducible-example . Then I'll re-open :-)
That's great news @MartinThoma! Thanks for the quick status update. How about closing any future issues with a link to #657 as an explanation? Back to this specific issue: it has nothing to do with #13, it's specifically about the linked issue in a repo that no longer exists. I have the test code in claird#59, but the test PDF is only in the deleted repo (it was about 5 MB). @mstamy2 is the original owner; I noticed he responded to #657, so I'm pinging him in case the repo is simply private rather than deleted, as that's where I put all my notes. @mstamy2 do you still have access to https://github.com/mstamy2/PyPDF3/issues/13? I can try to check an old hard drive to see if I still have it, but I won't have access to it for a while :-(
New test case created based on the original :-)

Overview

I've seen a number of PDF files where pdf.getDocumentInfo().title is None. The attached PDF is about 5 MB and is a sample of a document that exhibits this behavior. I did not create it (nor do I know how it was created), so the only information we have is the metadata inside. Test case, along with workaround, below.

Test PDF file: attached.

Test case: inline and attached (rename to .py). EDIT: the inline version dumps the PyPDF version (the attached version does not).

Output: recorded for Python 2 and Python 3.
@MartinThoma hopefully this helps. I really know nothing about PDF internals, which is why I've not attempted a fix :-( I have a workaround that seems to be effective, but I'm not sure if it is reasonable. If you need anything else from me on this, please ping me (I recall I had other PDFs with similar behaviors; this was one of the smaller ones). Thanks for picking up the torch on this and trying to organize collaboration.
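For readers hitting the same symptom: the workaround mentioned above is not preserved in this thread, so the following is only a minimal sketch of one defensive pattern, not the poster's actual workaround. It assumes the metadata object exposes a `title` attribute (as `pdf.getDocumentInfo()` does in the issue title); the helper name `title_or_fallback` and the default fallback string are hypothetical.

```python
from types import SimpleNamespace


def title_or_fallback(info, fallback="(untitled)"):
    """Return a usable title string, or `fallback` when none is available.

    `info` is any object with a `title` attribute; it may also be None
    (for example, when a document carries no metadata at all).
    """
    title = getattr(info, "title", None)  # also covers info being None
    if title is None:
        return fallback
    title = str(title).strip()
    return title or fallback  # treat blank titles like missing ones


# Stand-in objects are used here so the sketch runs without a PDF library.
print(title_or_fallback(SimpleNamespace(title=None)))
print(title_or_fallback(SimpleNamespace(title="  Annual Report ")))
```

In real use, the fallback could be derived from the file name, which keeps downstream code from having to special-case None.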
Thank you very much! I now hope that somebody will pick it up and dig into it :-)
Updated test case with PyPDF2 version.
Located another PDF that appears to have been created with the same PDF generator. Adobe Reader reports: PDF Version 1.3 (Acrobat 4.x). InfoKey: Producer, InfoValue: macOS Version 10.10 Quartz PDFContext. This file is much smaller than the attached test PDF. I also found some Microsoft Word generated PDFs, but when I attempted to export/create PDFs from a recent version of Word, the title worked fine.
This is not correct. I've copied and pasted the one email I have where someone posted to the issue I created (note: an issue, not a PR, in a completely different repo):
Handle case when title really is None
There is something really weird about that PDF:
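The details of what was weird about that PDF are not preserved here. As a purely illustrative sketch, one way to investigate is to tell apart "the title entry is absent" from "the entry exists but is not plain text". This assumes the metadata object behaves like a mapping keyed by "/Title" (an assumption to verify against the installed library version); the function name `classify_title` and the status labels are hypothetical.

```python
def classify_title(info):
    """Return (status, value) describing the '/Title' entry of a mapping.

    status is one of "missing", "text", "bytes", or "other".
    """
    if info is None or "/Title" not in info:
        return ("missing", None)
    raw = info["/Title"]
    if isinstance(raw, str):
        return ("text", raw)
    if isinstance(raw, bytes):
        # latin-1 maps every byte to a character, so this never raises,
        # but it may not be the encoding the producer intended.
        return ("bytes", raw.decode("latin-1"))
    # Anything else (for example, an unresolved reference) is reported as-is.
    return ("other", repr(raw))


# Plain dicts stand in for a metadata object so the sketch is self-contained.
for sample in ({}, {"/Title": "Report"}, {"/Title": b"Report"}):
    print(classify_title(sample))
```

A "bytes" or "other" result would suggest that the title is present in the file but stored in a form a text-only accessor does not return.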
Test case in #511 (comment). The original test case is no longer available; details were in https://github.com/mstamy2/PyPDF3/issues/13.