-
-
Notifications
You must be signed in to change notification settings - Fork 646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add include_sources
to pex_binary
target
#16215
Conversation
FYI @jsirois the PEX-meister. |
include_sources
to pex_binary
target
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍🏽
default = True | ||
help = softwrap( | ||
""" | ||
Whether to include your first party sources the binary uses in the packaged PEX file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think expanding this doc with why this could be beneficial would make sense. It could be the difference of "what, why!?" to "oh, aha!!" for a novice user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was planning on a blog post later, then linking it in the docker docs, and probably here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that explaining how this is used and actually sketching an example might be good. It might help clarify that there are other UXs available to solve the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will say this value mirrors include_requirements
directly, which also doesn't include potential use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely non-blocking comment.
I have no idea what the UX would look like, but is this something that could/should be done natively instead, maybe by something like a docker_pex_image
, or a multi_stage_pex_binary
target?
It's a bit odd, because currently our expectation is that the package
goal produces roughly one file... although it can totally produce more. And so a multi-stage PEX build could produce multiple files with well known ... suffixes, maybe?
Also, the packed
layout is potentially useful for this kind of usecase... I wonder if it would be possible to pull in the relevant portions of that layout into the two layers, rather than building two independent files?
Would be interested in John's thoughts, but again: not blocking.
default = True | ||
help = softwrap( | ||
""" | ||
Whether to include your first party sources the binary uses in the packaged PEX file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that explaining how this is used and actually sketching an example might be good. It might help clarify that there are other UXs available to solve the problem.
Personally, I'd love to see support for "synthetic" targets, that way we expose the low-layer API like this (for this use-case and others), but also a plugin which can synthetically create these targets and build more interesting ones (while still having minimal rule piping). I've thought a lot about this very particular issue and landed on this change because we're exposing all the bits necessary to do the interesting thing, and allowing devs to take it over from there. It'd be really hard to get the secret sauce just right on a Using
I don't think this is the end of the road for this line of optimization, but is certainly a useful stepping stone we can provide today. Once we can leverage the facets of this design, we can see where the best next step is for Pants and/or PEX to make it simpler. |
At first glance this might seem like a nonsensical yin to `include_requirements`'s yang. However... that's exactly what it is (minus the nonsensical part). Consider the multi-stage build documented here: https://pex.readthedocs.io/en/latest/recipes.html#pex-app-in-a-container. Now consider if each stage consumed not a single all-in-one PEX, but the `deps` stage used a `PEX` build with `include_requirements=True` and `include_sources=False`. Likewise, but flipped, for the `srcs` stage. The `COPY` instruction in each stage wouldn't be invalidated unless truly something going into that stage changed. For PEXs with large reqs, this cache re-use can save a lot of time, as the compilation of `deps` might take a long time. [ci skip-rust] [ci skip-build-wheels]
…16252) At first glance this might seem like a nonsensical yin to `include_requirements`'s yang. However... that's exactly what it is (minus the nonsensical part). Consider the multi-stage build documented here: https://pex.readthedocs.io/en/latest/recipes.html#pex-app-in-a-container. Now consider if each stage consumed not a single all-in-one PEX, but the `deps` stage used a `PEX` build with `include_requirements=True` and `include_sources=False`. Likewise, but flipped, for the `srcs` stage. The `COPY` instruction in each stage wouldn't be invalidated unless truly something going into that stage changed. For PEXs with large reqs, this cache re-use can save a lot of time, as the compilation of `deps` might take a long time. [ci skip-rust] [ci skip-build-wheels]
Do you have numbers? Another approach if it truly is the compilation time is amending the advice over at Pex recipes to not compile the deps layer, then in a new layer, |
Thanks @thejcannon for providing the PEX-INFO offline. This helped me build an apples to apples repro for timing sake. I've attached the requirements.txt distilled from that PEX-INFO and I used the following lock to (re-)build PEXes from for the tests:
I then just used an initially empty
Ok, so, using existing technology you can eliminate the time taken to bytecode-compile 3rdparty deps. Amending the Pex recipe with a new FROM python:3.8-slim as deps
COPY /my-app.pex /
RUN PEX_TOOLS=1 /usr/local/bin/python3.8 /my-app.pex venv --scope deps --rm all /my-app
FROM python:3.8-slim as compiled-deps
COPY --from=deps /my-app /my-app
RUN /my-app/bin/python -m compileall /my-app
FROM python:3.8-slim as srcs
COPY /my-app.pex /
RUN PEX_TOOLS=1 /usr/local/bin/python3.8 /my-app.pex venv --scope srcs --rm all --compile /my-app
FROM python:3.8-slim
COPY --from=compiled-deps /my-app /my-app
COPY --from=srcs /my-app /my-app
ENTRYPOINT ["/my-app/pex"] You observe the expected cache hit / skip of compilation on a sources-only Pex change:
That's not super-enlightening re timings though. The BuildKit makes this better and also allows leveraging a persistent That Dockerfile: # syntax=docker/dockerfile:1.2
FROM python:3.8-slim as deps
RUN \
--mount=type=cache,target=/pex-root \
--mount=target=/context \
PEX_ROOT=/pex-root \
PEX_TOOLS=1 \
/usr/local/bin/python3.8 /context/my-app.pex venv --scope deps /my-app
FROM python:3.8-slim as compiled-deps
COPY --from=deps /my-app /my-app
RUN /my-app/bin/python -m compileall /my-app
FROM python:3.8-slim as srcs
RUN \
--mount=type=cache,target=/pex-root \
--mount=target=/context \
PEX_ROOT=/pex-root \
PEX_TOOLS=1 \
/usr/local/bin/python3.8 /context/my-app.pex venv --scope srcs --compile /my-app
FROM python:3.8-slim
COPY --from=compiled-deps /my-app /my-app
COPY --from=srcs /my-app /my-app
ENTRYPOINT ["/my-app/pex"] And the similar build result after a sources-only change:
Note that BuildKit is smart enough to run the This may not be a good enough result to undo the utility of the new There are even more ways to do this of course. On a Linux host the venv could be pre-split, for example into |
Aside from mounting (which I'll have to museum about) we have similar findings. I also mused extracting the venv on the host and COPYing it, but decided against it since it bakes in additional assumptions. In general, I think there's PEX the tool, with all it's bells and whistles and levers and knobs. Then there's PEX, Pants' Python courier. In this instance "the user" doesn't care about the PEX as anything but transportation from repo to container. Maybe it's unfortunate we put all use cases in a single target? |
Perhaps, yes. That's what I was getting at with:
If your intent as a Pants user is to get a runnable venv that accurately reflects tested code installed in a Docker container all using Pants, it seems having to reference a |
At first glance this might seem like a nonsensical yin to `include_requirements`'s yang. However... that's exactly what it is (minus the nonsensical part). Consider the multi-stage build documented here: https://pex.readthedocs.io/en/latest/recipes.html#pex-app-in-a-container. Now consider if each stage consumed not a single all-in-one PEX, but the `deps` stage used a `PEX` build with `include_requirements=True` and `include_sources=False`. Likewise, but flipped, for the `srcs` stage. The `COPY` instruction in each stage wouldn't be invalidated unless truly something going into that stage changed. For PEXs with large reqs, this cache re-use can save a lot of time, as the compilation of `deps` might take a long time. [ci skip-rust] [ci skip-build-wheels]
At first glance this might seem like a nonsensical yin to
include_requirements
's yang. However... that's exactly what it is (minus the nonsensical part).Consider the multi-stage build documented here: https://pex.readthedocs.io/en/latest/recipes.html#pex-app-in-a-container.
Now consider if each stage consumed not a single all-in-one PEX, but the
deps
stage used aPEX
build withinclude_requirements=True
andinclude_sources=False
. Likewise, but flipped, for thesrcs
stage. TheCOPY
instruction in each stage wouldn't be invalidated unless truly something going into that stage changed.For PEXs with large reqs, this cache re-use can save a lot of time, as the compilation of
deps
might take a long time.[ci skip-rust]
[ci skip-build-wheels]