Can you publish a wheel? #13

Closed
matthewdeanmartin opened this issue May 5, 2022 · 7 comments

@matthewdeanmartin

There are certain security benefits to publishing wheels

# mitigate supply chain risk by using --only-binary
pip install splitstream --only-binary=:all:

It looks like you're using setup.py, so if you add bdist_wheel
python setup.py build sdist bdist_wheel

and then twine upload the .whl file, that would do it.

Thanks for publishing splitstream, this is so useful I wonder why json libraries don't include something like this.

@rickardp
Owner

rickardp commented May 7, 2022

Happy to hear you like the library. I was surprised myself that I had to write this when I did.

From the beginning, it was a conscious decision to stay away from building binaries. The build matrix quickly becomes huge with all the supported Python versions and platforms: Linux alone has two popular libc implementations (musl and glibc) on both arm64 and amd64, then there is macOS (arm64 + x64), Windows, and so on.
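To make the size of that matrix concrete, here is a minimal Python sketch that enumerates one build per Python version and target. The version list and target names are illustrative assumptions, not the project's actual support list.

```python
from itertools import product

# Illustrative values only; the thread does not list exact versions or targets.
PYTHON_VERSIONS = ["3.7", "3.8", "3.9", "3.10"]
LINUX_LIBCS = ["glibc", "musl"]
LINUX_ARCHS = ["amd64", "arm64"]
OTHER_TARGETS = [("macos", "x64"), ("macos", "arm64"), ("windows", "amd64")]


def build_matrix():
    """Return one (python, os, arch) tuple per wheel that would need building."""
    linux = [
        (py, f"linux-{libc}", arch)
        for py, libc, arch in product(PYTHON_VERSIONS, LINUX_LIBCS, LINUX_ARCHS)
    ]
    other = [
        (py, os_name, arch)
        for py, (os_name, arch) in product(PYTHON_VERSIONS, OTHER_TARGETS)
    ]
    return linux + other


matrix = build_matrix()
print(len(matrix), "wheels per release")
```

Every extra Python version adds one build per target, which is why the count grows so quickly.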

That said, I believe that it wouldn’t be too complex to set up with GitHub Actions, and it would be nice to offer this so that consumers of this package wouldn’t have to install a compiler toolchain.

So yes I agree this should be done, though I’d like to solve the full problem (building for all platforms, not only one or a few of the most popular ones).

Until this issue is fixed in GitHub, the build times would be horrible though:
actions/runner-images#2552

It’s the first time I've heard security given as the motivation. To satisfy my curiosity, do you have a reference for this? (What I don't understand is why anyone would trust “my” compiler more than “your” compiler.)

@matthewdeanmartin
Author

When I asked Amazon Ion to add wheels (they also have a lib with native code parts), they added a GitHub Action for it (https://github.com/amzn/ion-python/blob/master/.github/workflows/release.yml). I don't have enough native-code know-how to tell whether it would work for splitstream.

Wheels were originally made so that package installers didn't have to compile the native code, and the pip flags all talk about binaries this and binaries that. But I'm less concerned about that; I was thinking about supply chain mitigations with respect to Python code run at the point of install. Yes, philosophically speaking, there is a position that no code can be trusted and we should just assume all code is (what, malicious? safe? I don't know). I'm not going to persuade anyone on that one way or the other.

Installing an sdist executes its setup.py on pip install, and that file can include arbitrary code. That code will execute on the build server, where an attacker can gain access to code signing keys and other goodies (e.g. SolarWinds). Installing a wheel runs no package code at install time; it is essentially an unzip. So malicious code in a wheel would have to be invoked through the API, which is bad, but for a malicious code distributor, execution on install is more valuable.
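A harmless way to see the point about setup.py: the file is an ordinary Python script, so anything at its top level runs as soon as it is executed. The sketch below is a simplification that runs a stand-in setup.py directly instead of going through pip, and the "payload" only writes a marker file.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A harmless stand-in for a malicious setup.py: it only writes a marker file.
SETUP_PY = '''
from pathlib import Path
Path("marker.txt").write_text("top-level code in setup.py ran")
'''


def run_fake_setup():
    """Execute the stand-in setup.py in a temp dir and return the marker text."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "setup.py").write_text(SETUP_PY)
        # Simplification: run the file directly. Building a setup.py-based
        # sdist also has to execute this file with the Python interpreter.
        subprocess.run([sys.executable, "setup.py"], cwd=project, check=True)
        return (project / "marker.txt").read_text()


print(run_fake_setup())
```

Swap the marker file for something that reads ~/.ssh and you have the attack described above, which is why it matters that a wheel install has no equivalent step.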

Where this attack vector is most useful is if I want to steal your SSH keys. I don't think I can hijack a package like requests, but I can hijack a misspelling of requests (a typosquat). The user (or more likely the build server) didn't set --only-binary=:all:, so it runs some random malicious package with key stealers in the setup.py, and I didn't even have to imitate requests and add malicious code as hooks.

Re your specific question: I trust your compiler, but if I don't enable --only-binary=:all: and I or a junior developer on the team type slipstrem, then a typosquat package could be installed and it would steal keys or whatever. That is someone else's compiler, whom I don't trust. (I'll skip over how PyPI really doesn't have any identity features at all; the official AWS packages all look like someone claiming to be AWS.)
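As a rough illustration of how close slipstrem is to the real name, here is a small sketch that flags requested names that nearly match a package you meant to install. The allowlist and the 0.75 cutoff are assumptions picked for the example, and this is not a substitute for --only-binary=:all:.

```python
import difflib

# Hypothetical allowlist of packages a team actually intends to use.
KNOWN_PACKAGES = ["requests", "splitstream"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.75):
    """Return the known package that `name` closely resembles, or None.

    An exact match returns None: the name is one we intended.
    """
    if name in known:
        return None
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for candidate in ["splitstream", "slipstrem", "reqeusts", "numpy"]:
    print(candidate, "->", possible_typosquat(candidate))
```

A check like this could run over a requirements file in CI before anything is installed.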

A long-term solution would be to convince everyone to get rid of setup.py (use pyproject.toml instead) and always publish at least a wheel, but PyPI is going to be stuck with legacy decisions forever.

Other people who've thought similar things:
https://www.bleepingcomputer.com/news/security/python-package-installation-can-trigger-malicious-code/
https://github.com/Ayrx/malicious-python-package/blob/master/setup.py

And on the viability of typosquats:
https://thehackernews.com/2021/07/several-malicious-typosquatted-python.html
https://nakedsecurity.sophos.com/2017/09/19/pypi-python-repository-hit-by-typosquatting-sneak-attack/

Anyhow, wheels also mean gcc doesn't need to be installed or available (a challenge I've seen with junior devs), and installation is faster.

@rickardp
Owner

rickardp commented May 8, 2022

Thank you for the excellent explanation and links. I did not know that binary installs did not execute package code.

I suppose this protects the build agent up until the point where unit tests are run ;) Anyway, I agree the package system is broken (for many reasons, but in this context because it doesn’t let you reserve a prefix like in npm or NuGet, which makes typosquatting feasible. I know at least MS reviews prefix registrations, so one would hope they reject malicious use). Putting your AWS deploy code in a separate job will protect you somewhat, I suppose.

(The ion library, by the way, forgot about non-x64, which means all devs on M1 Macs (half of our team) and those building for the RPi or AWS Graviton are out of luck.)
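One way to check for that kind of gap is to look at the platform tag in each published wheel's filename. A minimal sketch follows; the file list is made up for illustration and is not the actual set of files any project publishes.

```python
def wheel_platforms(filenames):
    """Collect the platform tags found in a list of wheel filenames.

    Wheel names follow
    {name}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl,
    so the platform tag is always the last dash-separated field.
    """
    platforms = set()
    for filename in filenames:
        if not filename.endswith(".whl"):
            continue  # sdists (.tar.gz) carry no platform tag
        stem = filename[: -len(".whl")]
        platform_field = stem.rsplit("-", 1)[1]
        # Several platform tags can be packed into one field, separated by dots.
        platforms.update(platform_field.split("."))
    return platforms


# Hypothetical file list for illustration; not the actual files on PyPI.
FILES = [
    "example_pkg-1.0.0.tar.gz",
    "example_pkg-1.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "example_pkg-1.0.0-cp310-cp310-macosx_10_9_x86_64.whl",
]

found = wheel_platforms(FILES)
has_arm64 = any(tag.endswith(("arm64", "aarch64")) for tag in found)
print(sorted(found), "arm64 covered:", has_arm64)
```

With only x86_64 tags present, anyone on an ARM machine falls back to the sdist and needs a compiler again.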

I will find some time within the next few weeks or so to evaluate a build matrix. I will probably have to use Docker Buildx to support non-x64 archs. If you’re in a hurry, a PR is always welcome.

@rickardp
Owner

rickardp commented Oct 5, 2022

AMD64 should be solved in version 1.2.5. I'd really like to solve ARM64 too, but binary wheel cross-compilation seems like a deep rabbit hole that I want to avoid. I'm betting on GitHub supporting Apple Silicon and Linux ARM64 soon, now that Azure has ARM64 support.

@rickardp
Owner

rickardp commented Oct 5, 2022

Closing this as fixed, feel free to open a ticket for Arm64 if needed.

@rickardp rickardp closed this as completed Oct 5, 2022
@matthewdeanmartin
Author

woot! Thanks

@rickardp
Owner

rickardp commented Oct 5, 2022

1.2.6 adds ARM64 support, since I had to switch to Docker anyway and can thus use Docker multiarch/QEMU. Apple Silicon still needs GitHub support AFAIK, so that is for a later release.
