-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: cpython toolchains #618
Conversation
Would it be possible to add python 3.7 since it still appears to be within it's lifespan https://www.python.org/dev/peps/pep-0537/ ? (also, thank you so much for working on this, this feature would be amazing!!) |
It also seems there's Windows binaries available? Could those be added as well? |
Also, it'd be awesome to have the binaries added to the google mirror |
|
thanks for doing this! one thing i wanted to note - I tried using these toolchains on my machine, and found they don't work with |
@sikko-mode-25 would be good to get details of "don't work". What's the error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool stuff.
I think we should be sure to support Windows if we're gonna land this. I'm in progress on downloading a Windows VM (when will Github Codespaces support the Windows platform, eugh).
python/repositories.bzl
Outdated
integrity = TOOL_VERSIONS[version][rctx.attr.platform], | ||
output = release_filename, | ||
) | ||
unzstd = rctx.which("unzstd") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the bit that doesn't feel production-ready. We either inspect the host system for the needed binary, or download the source and compile it on the host. (Do Windows machines have make
available by default?)
So if we don't want to adopt this non-hermeticity, what options do we have?
- Try and land non-zstd tarball distributions in the python-standalone repo. Bazel natively supports regular tarballs.
- Do we know if Gregory is against using it?
- Get zstd support landed in Bazel. Support zstd decompression for external dependencies bazel#11968 is an old PR, but there's been activity from Googlers on it in the last 18 days (mid-Jan 2022) and it seems like it might be mergeable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no different than using gzip
with tar
. I don't see anyone worried about which version of gzip
the system is using to decompress a .tar.gz
file (at least not at this level).
I.e. if unzstd
exits code 0, it successfully extracted the software we want - and that is deterministic. For Windows we can follow a similar approach to compiling it without Make if it's not supported on that system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just FYI internally we compile a zstd cc_binary that we use to extract the python toolchain http download.
So making zstd binary configurable is very much desirable.
Let me know if you are interested in the BUILD file for zstd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sluongng Interesting, it would change a bit when the files get extracted but I may be able to make it work? If you could share it, I'd appreciate it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zstd
decompression isn’t necessary since this issue was actioned, which added .tar.gz
release artefacts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for using tarballs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up @jheaff1, that's awesome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the latest release publishes the 'install only' tarballs that we want, for Python 3.9 and 3.10, and each of those for Windows, Apple, and Linux.
@f0rmiga the 'install only' tarballs will remove a decent amount of complexity and non-hermeticity. Let's give them a try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up that I'm getting hands on with the 'install only' tarballs over in thundergolfer/example-bazel-monorepo#54 and documenting my troubles.
Very cool. I use As an FYI, I've (tried to) set-up a similar system and wanted to list a few cases here that we ran in to.
I'm sure other options exist too (including just documenting this case), but wanted to highlight it.
This one I haven't put in a solution for yet. But my next step is to look at the auto-configured x-code rules and see if I can detect if the proper version of Xcode is installed in the repository rule. Again, other options likely exist too, and maybe this is only worth documenting.
Not asking for any changes in particular in this PR, but hope the above context is useful for gotchas with a similar system! |
@schultetwin thanks for the points!
|
python/repositories.bzl
Outdated
|
||
filegroup( | ||
name = "files", | ||
srcs = glob(["bin/**", "lib/**", "include/**", "share/**"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be sure to exclude __pycache__
and *.pyc
files as that may introduce cache busts in production systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow why the pre-compiled modules would introduce cache busts? Could you expand, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In older versions of Python the generated bytecode was non-deterministic, but given we are fetching prebuilt versions this may be a non-issue as they wouldn't change, unless Python will continually write new .pyc
files here?
We should add a test against the SHA of these files perhaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I was working on a python repo with bazel and debugging the action cache in order to understand why any test target was being rebuilt for the same git SHA I noticed, that during each github action run where I would do bazel test //...
the tests were rerun because of .pyc
and __pycache__
files included in the filegroup.
I used #551 (comment) as an inspiration.
This was with the most recent Python 3.9 build from indygreg
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know what changed within those files? My guess is that the interpreter does indeed update these files again and they contain timestamps or absolute paths.
We should exclude RECORD
files for the same reason then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot find the logs right now, but my guess was that:
- We download the python interpreter files and extract them.
- We run pip_parse, which uses
bin/python
from the repository rule via thepython_interpreter_target
attribute. - This causes extra
.pyc
files being created during thepip_parse
- Since the
.pyc
files are non-deterministic, we re-execute the Python test targets every time.
Having a minimal example should not be too difficult, but unfortunately I don't have time right now to help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aignas could you try the latest state of this PR? I made the lib
directory read-only, so it should be immutable. I wonder if you will still get the same error.
Coincidentally, we recently started using indygreg/python-build-standalone's releases to define our Python toolchain (in @grailbio's internal repository). We hadn't yet had the time to make a PR here, so I'm excited to see this. Thanks @f0rmiga! I've also experimented with using the artifacts from that standalone release to build self-contained, single-file Python executables (like .par, but also including the interpreter + stdlib). I've extracted a minimal example of this in grailbio/py_singlefile_binary; it defines a very slightly modified Python main, links with libpython, and uses itself as the import path because it is also a PEP 441 .zip. Would you be open to expanding file visibility in the standalone Python repo's BUILD, to I'm also curious if there's interest in rules_python supporting self-contained executables in the future. Our interpreter+.zip approach works so far for some internal tools, though there may be better ways to accomplish this. @sluongng mentioned github.com/indygreg/PyOxidizer to me, for example. (Of course that'd be separate PRs/issues, I just thought I'd mention it here where there's context.) |
4e1abe7
to
7427825
Compare
@josh-newman thanks for the pointers too! For PEP 441, we've considered a few options so far. Like you said, it's out of context for this PR but I can definitely export more things from the downloaded interpreter. |
@f0rmiga How are you dealing with the Python stub template depending on system Python? bazelbuild/bazel#8685 (comment) This seems like an elegant solution for inside containers, but ideal for developers locally: https://github.com/buildbarn/bb-remote-execution/blob/master/cmd/fake_python/main.go |
@josh-newman it's something that would be great to provide a good default for, but building a good solution can be done in a separate repo first I think. Thanks for open sourcing grailbio/py_singlefile_binary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For greater visibility, I've got a branch that adds windows support and uses tar.gz
files at #628 hopefully that helps 😄
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Co-authored-by: UebelAndre <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
72974af
to
2258d33
Compare
allow skipping py2 installation
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
* Fix windows acceptance tests * test * todo: remove Co-authored-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put some suggestions in #642
in particular, fixing the "target_compatible_with" that doesn't belong, matching bazel-contrib/rules-template#9
otherwise looks great to me!
* Minor code review suggestions * Apply suggestions from code review Co-authored-by: Thulio Ferraz Assis <[email protected]>
Signed-off-by: Thulio Ferraz Assis <[email protected]>
🎉 amazing work! |
Awesome 👍 |
Thank you for the work. I tried to use it on my Mac and everything worked except for chmoding the extracted Python folder. Since I am the only one in here facing this issue, it could be something to do with my system, but wanted to give a heads up in case anyone else encounters this: bed8c1b#r68477007 |
PR Checklist
Please check if your PR fulfills the following requirements:
PR Type
What kind of change does this PR introduce?
Does this PR introduce a breaking change?
Other information
Fixes #293.