Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider migrating to Cryptography #1998

Closed
scdub opened this issue Jul 21, 2023 · 18 comments · Fixed by #2000
Closed

Consider migrating to Cryptography #1998

scdub opened this issue Jul 21, 2023 · 18 comments · Fixed by #2000
Assignees
Labels
nf-security Non-functional change: Security

Comments

@scdub
Copy link

scdub commented Jul 21, 2023

Explanation

I help maintain a large Python distribution, which would like to use pypdf for reading encrypted PDFs, but would prefer not to include the additional dependency of PyCryptodome as we already include Cryptography for numerous other dependencies. Cryptography includes the same cryptographic primitives used in pypdf (AES and RC4), but is typically a smaller and more secure installation as it calls well vetted implementations rather than implementing them directly. It also is a dependency of requests via urllib3, and is widely deployed. A quick check of conda-forge packages showed that PyCryptodome / PyCryptodomeEx was used in 25 packages versus 135 for Cryptography, and other packages such the pdfminer.six have made this migration earlier.

If this is something that seems worthwhile, I can work on creating a PR for this effort.

cc @exiledkingcc

@michelcrypt4d4mus
Copy link

+1 the pycryptodome dependency occasionally trips up pip

@exiledkingcc
Copy link
Contributor

good suggestion! i will try to add support for Cryptography. whats your opinion? @MartinThoma

@pubpub-zz
Copy link
Collaborator

for me,
@exiledkingcc, you are the one dealing the best with the crypto part of pypdf and who will do the job 😁
I just have a remark : currently pypdf is compatible with python down to 3.6 but appearantly cryptography is limited to 3.7+.
python 3.6 is still the default for RH8 (still active)
Personally I have no religion 🙂

@exiledkingcc
Copy link
Contributor

sure, personally i prefer not to replace PyCryptodome by cryptography, add just support using cryptography to do encryptioin/decryption when no PyCryptodome installed. i just write a PR #2000 for it.
and, it's flexible to support any other crypto libraries if needed.

@MartinThoma MartinThoma added the nf-security Non-functional change: Security label Jul 23, 2023
@MartinThoma
Copy link
Member

Thank you @scdub for pointing this out. It helps me a lot to know that requests uses cryptography. I need to check the cryptography project a little bit more, bit I tend to agree that this is a good idea.

@exiledkingcc Thank you for providing a PR that swift ❤️ I guess it will take at least this week. I don't want to rush with security-related changes

@MartinThoma
Copy link
Member

I help maintain a large Python distribution

@scdub Now I'm curious 😄 In case you don't want to elaborate publicly, but are free to speak about it in private: [email protected]

@MartinThoma
Copy link
Member

personally i prefer not to replace PyCryptodome by cryptography, add just support using cryptography to do encryptioin/decryption when no PyCryptodome installed

As I don't want to break backwards compatibility, I also prefer that approach 👍 Was that also the (only) reason why you don't want to change it? If that is the case, we could plan deprecating PyCryptodome for pypdf>=4.0.0

@MartinThoma
Copy link
Member

Regarding the support of Python versions: #2005 - currently, I tend to keep Python 3.6 support. I feel like it doesn't cost us that much + a lot of people (sadly) still need it.

@exiledkingcc
Copy link
Contributor

personally, i would prefer python standard libraries to have cryptographic functions, which not happens yet 😞 .
for now i think we can keep both as well as they are in good maintation.

MartinThoma pushed a commit that referenced this issue Jul 27, 2023
…yCryptodome (#2000)

Use the [`cryptography` package](https://pypi.org/project/cryptography/) (maintained by the Python Cryptographic Authority, pyca) as a fallback if [PyCryptodome](https://pypi.org/project/pycryptodome/) (maintained by Legrandin / Helder Eijs) is not installed.

Closes #1998
@MartinThoma
Copy link
Member

@scdub I've merged the excellent PR by @exiledkingcc . The cryptography fallback will be part of pypdf > 3.13.0 which I will release latest on Sunday (30th of July).

@MartinThoma
Copy link
Member

We will likely also switch the order of preferences; recommending rather cryptography instead of PyCryptodome (and using cryptography if both are installed - but I'm uncertain when that will happen).

Thank you for mentioning this issue / bringing cryptography as an option to my attention 🙏 I value such contributions; if you want I'll add you as a contributor: https://pypdf.readthedocs.io/en/latest/meta/CONTRIBUTORS.html

MartinThoma pushed a commit that referenced this issue Aug 2, 2023
Comparison of the cryptography and PyCryptoDome libraries:

|                | cryptography | pycryptodome    |
| -------------- | ------------ | --------------- |
| Maintainers    | pyca         | a single person |
| PyPI downloads | 156,673,182  | 19,755,432      |
| Github Stars   | 5.7k         | 2.4k            |
| Github Forks   | 1.6k         | 0.4k            |

Hence we now primarily use cryptography and use `PyCryptodome` as a fallback

See #1998
@scdub
Copy link
Author

scdub commented Aug 9, 2023

I help maintain a large Python distribution

@scdub Now I'm curious 😄 In case you don't want to elaborate publicly, but are free to speak about it in private: [email protected]

Thanks much for this change! I really appreciate it. The Python distribution I had mentioned is what we distribute as part of ArcGIS Pro, a Geographic Information System program. We use your great pypdf package (and earlier PyPDF2) to allow users to create PDFs of their maps from Python.

@pubpub-zz
Copy link
Collaborator

@scdub
Thanks for your comments appreciated.
In order to support pypdf dev, Can you give your opinion about supporting(#1660)? the idea is really to have some capability to buy documentation and other support means to extend pypdf.

@scdub
Copy link
Author

scdub commented Aug 9, 2023

@pubpub-zz For sure, I can contribute on a personal basis, and left some thoughts on approaches to support on that discussion.

@pubpub-zz
Copy link
Collaborator

@MartinThoma you will have to find the good organisation 😀

@scdub
Copy link
Author

scdub commented Aug 9, 2023

@pubpub-zz specifically on the PDF 2.0 specification, have you seen https://www.pdfa.org/sponsored-standards/ ? It still requires accepting a EULA, but does provide free access to ISO 32000-2 and some of its supplements.

@pubpub-zz
Copy link
Collaborator

@pubpub-zz specifically on the PDF 2.0 specification, have you seen https://www.pdfa.org/sponsored-standards/ ? It still requires accepting a EULA, but does provide free access to ISO 32000-2 and some of its supplements.

Thanks, already loaded.😉

@MartinThoma
Copy link
Member

The PDF 2.0 specifications became available after I've opened #1660

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
nf-security Non-functional change: Security
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants