Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 608 use null ID when encrypted but no ID given #610

Closed
wants to merge 2 commits into from
Closed

Fix 608 use null ID when encrypted but no ID given #610

wants to merge 2 commits into from

Conversation

richardmillson
Copy link
Contributor

Pull request

  • Fixes Issue Cannot decrypt PDF missing 'ID' in trailer #608 where a PDF with an '/Encrypt' key but no '/ID' key in
    self.trailer throws a KeyError and is not decrypted.
  • If no '/ID' key is present in self.trailer an array of two empty bytestrings
    is used in place of an '/ID'. This is how Apache PDFBox handles this case.
  • This makes PyPDF2 more robust to malformed PDFs that
    don't follow the spec.
  • As camelot
    uses this library, this provides a branch with working behavior to use.

How Has This Been Tested?

  • Added test against sample PDF.

@MartinThoma MartinThoma added is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF workflow-encryption From a users perspective, encryption is the affected feature/workflow labels Apr 6, 2022
@MartinThoma
Copy link
Member

Thank you for your PR! It already looks good, but it has merge conflicts. Would you mind resolving them?

If you don't want to do that, I would take care of it. I would give credit via Githubs co-authored-by feature.

@MartinThoma MartinThoma added soon PRs that are almost ready to be merged, issues that get solved pretty soon needs-change The PR/issue cannot be handled as issue and needs to be improved labels Apr 16, 2022
MartinThoma added a commit that referenced this pull request Apr 24, 2022
If no '/ID' key is present in self.trailer an array of two empty bytestrings is used in place of an '/ID'. This is how Apache PDFBox handles this case.

This makes PyPDF2 more robust to malformed PDFs.

Closes #608
Closes #610

Full credit for this one to Richard Millson - Martin Thoma only fixed a merge conflict

Co-authored-by: Richard Millson <[email protected]>
VictorCarlquist pushed a commit to VictorCarlquist/PyPDF2 that referenced this pull request Apr 29, 2022
If no '/ID' key is present in self.trailer an array of two empty bytestrings is used in place of an '/ID'. This is how Apache PDFBox handles this case.

This makes PyPDF2 more robust to malformed PDFs.

Closes py-pdf#608
Closes py-pdf#610

Full credit for this one to Richard Millson - Martin Thoma only fixed a merge conflict

Co-authored-by: Richard Millson <[email protected]>
@richardmillson
Copy link
Contributor Author

@MartinThoma Didn't see your message in time; thanks for resolving the conflict and merging though!

@richardmillson richardmillson deleted the fix-608-decryption-fails-when-no-ID-in-trailer branch May 3, 2022 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF needs-change The PR/issue cannot be handled as issue and needs to be improved soon PRs that are almost ready to be merged, issues that get solved pretty soon workflow-encryption From a users perspective, encryption is the affected feature/workflow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants