Improve wheel support in pex. #388
Ping? It's been a week - any chance anyone can look at this soon? @jsirois?
this seems like a reasonable first step towards wheel installation improvements, mod one comment.
if possible, it would also be nice to ensure that pex(1)-built pex files also net this improvement (they currently do not appear to afaict) so that there's consistency between pex and pants for e.g. testing and repros.
pex/pex_builder.py
Outdated
# wheel dir.
if dist_name.endswith("whl"):
  wf = WheelFile(path)
  wf.install(overrides=self._get_installer_paths(dist_name), force=True)
seems to me like this conditional block should be mutually exclusive with the zip explosion below.. or else land in an isolated install path?
otherwise you could end up with dirs containing both installed and unzipped wheels, which could be problematic if e.g. inner paths collide?
or possibly just short-circuit with return CacheHelper.zip_hash(wf.zipfile)
I tested this a bit and noticed that the wheel-installed files will be ignored if they are not also written directly to the chroot via self._chroot.write. The Chroot class maintains an internal set of written files (by label) in Chroot.filesets and files that are included in this data structure get written out to the pex zip in Chroot.zip. So any files installed directly by wheel that are not also installed in the normal path below will be ignored.
There doesn't appear to be a clean way to update chroot.filesets without writing directly, so perhaps the best approach here is to install the wheel into a temporary directory, then read the installed RECORD file and write each entry to the chroot?
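The temporary-directory approach suggested above could be sketched roughly as follows. This is a minimal illustration for Python 3, not the code in this PR: `iter_record_paths`, `copy_installed_files`, and `fake_chroot_write` are hypothetical names, the install directory is populated by hand instead of by `WheelFile.install`, and the `write` callback stands in for whatever chroot write call would actually be used.

```python
import csv
import os
import tempfile


def iter_record_paths(install_dir, record_relpath):
    """Yield the relative path of every file listed in an installed RECORD file."""
    with open(os.path.join(install_dir, record_relpath), newline="") as fp:
        for row in csv.reader(fp):
            if row:  # RECORD rows have the form: path, hash, size
                yield row[0]


def copy_installed_files(install_dir, record_relpath, write):
    """Pass each installed file to write(src, dst); return the relative paths handled."""
    written = []
    for relpath in iter_record_paths(install_dir, record_relpath):
        src = os.path.join(install_dir, relpath)
        if os.path.isfile(src):
            write(src, relpath)
            written.append(relpath)
    return written


# Simulate an install directory by hand (assumption: a real run would install a wheel here).
install_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(install_dir, "demo"))
os.makedirs(os.path.join(install_dir, "demo-1.0.dist-info"))
with open(os.path.join(install_dir, "demo", "__init__.py"), "w") as fp:
    fp.write("VALUE = 1\n")
record = os.path.join("demo-1.0.dist-info", "RECORD")
with open(os.path.join(install_dir, record), "w") as fp:
    fp.write("demo/__init__.py,sha256=abc,10\n")
    fp.write("demo-1.0.dist-info/RECORD,,\n")

chroot_files = {}


def fake_chroot_write(src, dst):
    # Hypothetical stand-in for writing a file into the pex chroot.
    chroot_files[dst] = src


written = copy_installed_files(install_dir, record, fake_chroot_write)
```

Driving the copy from RECORD means only files the installer actually produced reach the chroot, which is the property the comment above is after.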
pex/pex_builder.py
Outdated
""" | ||
base = os.path.join(self.path(), self._pex_info.internal_cache, dist_name)
return {
  'purelib': os.path.join(base),
nit: could drop os.path.join here and just use base
I am unable to get console scripts to work with this wheel install (amended to skip the additional direct zip install). using
while this PR generates
and if I try to build a pex using this console script, I get an error:
(again, this is only if I modify the PR to not also unpack the wheel as a zipfile directly).
Some context on this update: this properly uses WheelFile.install to extract the contents of a wheel, and then copies that into the pex chroot. I think this is a much better solution than the first try. The alternative approach, which I think would work, but be more brittle, would be to add a wheel-based finder (including an override of the ImpImporter from pkgutils). That would allow us to include the wheel, in its original pypi form, in the pex. The catch, and the reason why I prefer this approach, is that we'd basically be reproducing some logic from the pkgutils WheelFile.install in the finder, and that would require us to make sure that we update pex logic after any changes to pkgutils. The approach in this change just uses wheelfile, and then pulls in the result.
Wheel handling in pex files is broken. The wheel standard says that wheel files are not designed to be importable archives, but pex treats them as if they are. This causes many standard-compliant wheels, including things like tensorflow and opencv, to fail to import in pexes (and thus in pants). This change modifies wheel handling, so that when a wheel is added to a pex, it's installed in an importable form.
(Code disagreed about whether scripts were stored in /bin or /scripts.)
…ty with python 3.
This passes on my laptop with Python 2.6. Not sure what Jenkins' problem is.
Nice updates! My local testing is raising an exception during build w/ python2 pex:
…ation and execution
Can you tell me about your py26 environment? I haven't been able to reproduce it, so I'd like to try to figure out what's different.
The test failure appears to be an import error -- wheel.install is imported during pex bootstrap and is not available (it isn't part of stdlib).
I'm not sure how pex_builder is used within the packaged pex bootstrap environment, but it seems like you want to only import during build and avoid import at runtime.
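One way to keep a build-only dependency out of the runtime import path is to defer the import until the code path that needs it. The sketch below is illustrative only: `load_build_dependency` and `add_distribution` are hypothetical names, and the stdlib `zipfile` module is used as a stand-in for the real installer module so the example runs anywhere.

```python
import importlib


def load_build_dependency(module_name):
    """Import a build-only dependency on demand, with a clearer failure message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise RuntimeError(
            "%s is needed to build a pex from wheels but is not installed: %s"
            % (module_name, e))


def add_distribution(path):
    """Only wheel paths trigger the import; other paths never touch the dependency."""
    if path.endswith(".whl"):
        # Assumption: "zipfile" stands in for the real installer module here.
        installer = load_build_dependency("zipfile")
        return installer.__name__
    return None
```

With this shape, merely importing the builder module does not require the dependency, so a bootstrap environment without it only fails if it actually tries to add a wheel.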
fwiw, I reproduced using tox
So the import error is a bug that I think needs fixing, but it is not the cause for test failures. The pex artifacts are run within the tox virtualenv which gets 'wheel' installed. So the tests should get past that issue. I think what is happening in the test failures is WheelFile.install is not copying the full script to bin/ . When I inspect the pex artifact, the test scripts (
I think that the wheel import is, ultimately, the root cause. The tests that are failing on 2.6 are the tests that rely on the pex generation script successfully running WheelFile.install; I think that what's going on under the covers is that we're winding up with an invalid pex because the generation process is failing due to wheel not being available on the 2.6 host.
setup.py
Outdated
@@ -61,6 +61,7 @@
'twitter.common.lang>=0.3.1,<0.4.0',
'twitter.common.testing>=0.3.1,<0.4.0',
'twitter.common.dirutil>=0.3.1,<0.4.0',
'wheel==0.29.0',
I think that WHEEL_REQUIREMENT should now be in install_requires
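I dug through a bit more and I'm confident this is actually hitting a bug deep within the interaction between wheel and zipfile on python2.6 . Specifically, when wheel installs a script it attempts to massage the #!python hashbang before processing the rest of the script. The code is here: https://bitbucket.org/pypa/wheel/src/0.29.0/wheel/install.py?at=default&fileviewer=file-view-default#install.py-346
It does this by attempting to read a single line from the zipfile using zipfile.readline() . It then reads the remainder of the script using zipfile.read(). The problem is that the underlying zipfile library on python2.6 does not handle this sequence properly. You can test very easily by creating a zipfile with any script, open via z = zipfile.open(filename), then attempt to read a single line followed by the rest of the script w/ readline() and read(). read() returns nothing! Very strange. This does not happen on python2.7, likely because the zipfile module was refactored to fix the underlying issue.
I'm not sure what the best solution is here. First, I wonder if you can reproduce this zipfile bug? If so, I think the options are to (1) break pex installs of wheel files using python2.6, (2) write custom code to install the wheel package that does not have this bug, or (3) see if we can get a patch applied to upstream wheel.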
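For anyone who wants to try the reproduction described in this comment, here is a small self-contained sketch of the readline-then-read sequence. The archive member name and script contents are made up for illustration. On current Python versions the two reads together return the whole script; the comment reports that on python2.6 the second read comes back empty.

```python
import io
import zipfile

script = b"#!python\nimport sys\nprint(sys.argv)\n"

# Build a small in-memory archive holding one script (the member name is hypothetical).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("demo-1.0.data/scripts/demo", script)

# Read the hashbang line first, then the remainder of the script.
with zipfile.ZipFile(buf) as zf:
    member = zf.open("demo-1.0.data/scripts/demo")
    first_line = member.readline()
    rest = member.read()
    member.close()

print(repr(first_line))
print(repr(rest))
```

Comparing `first_line + rest` against the original bytes on each interpreter is a quick way to tell whether a given Python version is affected.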
Thanks for taking the time to do that!
I can't reproduce the error on my macbook, but I can using a linux VM, in
which I remove the system python, and install Python2.6 from sources.
I'm hesitant to write custom code for wheel installs to bypass this issue.
It's not difficult, but it's a very brittle solution: we'd wind up with a
parallel version of wheel install, which would need to track changes to the
standard wheel package. That's just begging for trouble down the line, in
order to continue to support a version of Python that's already
unsupported. (One of the problems I've had building an environment to
reproduce this bug locally is that it's hard to install a working 2.6
environment; even homebrew withdrew the formula for building one!)
-Mark
…On Thu, Jun 1, 2017 at 8:18 PM Dana Powers ***@***.***> wrote:
I dug through a bit more and I'm confident this is actually hitting a bug
deep within the interaction between wheel and zipfile on python2.6 .
Specifically, when wheel installs a script it attempts to massage the
#!python hashbang before processing the rest of the script. The code is
here:
https://bitbucket.org/pypa/wheel/src/0.29.0/wheel/install.py?at=default&fileviewer=file-view-default#install.py-346
It does this by attempting to read a single line from the zipfile using
zipfile.readline() . It then reads the remainder of the script using
zipfile.read(). The problem is that the underlying zipfile library on
python2.6 does not handle this sequence properly. You can test very easily
by creating a zipfile with any script, open via z = zipfile.open(filename),
then attempt to read a single line followed by the rest of the script w/
readline() and read(). read() returns nothing! Very strange. This does not
happen on python2.7, likely because the zipfile module was refactored to
fix the underlying issue.
I'm not sure what the best solution is here. First, I wonder if you can
reproduce this zipfile bug? If so, I think the options are to (1) break pex
installs of wheel files using python2.6, (2) write custom code to install
the wheel package that does not have this bug, or (3) see if we can get a
patch applied to upstream wheel.
So how can we move this forward? We'd really like to have a version of pex that works with the tensorflow distribution wheel. My inclination is not to put resources into supporting a version of Python that no one should be using. Would it be an acceptable compromise to use a check in the wheel code, which signals an error for using wheels with .data directories on Python 2.6?
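A rough sketch of what such a check might look like, assuming a wheel is inspected as a plain zip archive before installation. `wheel_has_data_dir` and `check_wheel_supported` are hypothetical helpers written for this illustration, not functions from pex or wheel, and the `version_info` parameter exists only so the behaviour can be exercised on any interpreter.

```python
import io
import sys
import zipfile


def wheel_has_data_dir(wheel_file):
    """Return True if any archive member lives under a top-level `*.data/` directory."""
    with zipfile.ZipFile(wheel_file) as zf:
        for name in zf.namelist():
            top = name.split("/", 1)[0]
            if top.endswith(".data") and "/" in name:
                return True
    return False


def check_wheel_supported(wheel_file, version_info=sys.version_info):
    """Raise on Python 2.6 or older when the wheel carries a .data directory."""
    if tuple(version_info[:2]) <= (2, 6) and wheel_has_data_dir(wheel_file):
        raise RuntimeError(
            "wheels with .data directories are not supported on Python 2.6")


def make_archive(names):
    """Build an in-memory archive with the given member names (test fixture only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"x")
    return buf
```

Whether the check should raise or only warn is exactly the open question in this thread; swapping the `raise` for a warning would leave the detection logic unchanged.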
I'm just an interested third-party, not a maintainer. But it seems reasonable to me to error, or perhaps just warn, if it seems like someone is trying to use python2.6 to build a pex from a wheel that has console scripts. @kwlzn -- what are your thoughts?
Also would suggest skipping the test on py2.6:
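The snippet originally attached to this comment is not preserved in this copy of the thread. As an illustration only, a version-conditional skip could look like the sketch below. It uses the stdlib `unittest.skipIf` decorator so that it is self-contained; note that `unittest.skipIf` itself only arrived in Python 2.7, so a suite that still has to run on 2.6 would need an equivalent from its test runner (for example pytest's `skipif` marker). The test class, test name, and skip message are hypothetical.

```python
import sys
import unittest


class WheelInstallTest(unittest.TestCase):
    @unittest.skipIf(sys.version_info[:2] == (2, 6),
                     "wheel script installation is unreliable on Python 2.6")
    def test_wheel_scripts_installed(self):
        # Placeholder body; a real test would build a pex from a wheel and run it.
        self.assertTrue(True)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WheelInstallTest)
result = unittest.TestResult()
suite.run(result)
```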
Can anyone help out with the isort-test failures? Isort running locally on my machine shows no errors; and the file that it's complaining about is unchanged in this diff!
isort 4.2.9+ seems to catch things earlier versions of isort missed. Looks like 4.2.15 is getting installed on Travis, not sure what you have locally.
Ping? This now includes checks for python versions, to generate a warning on older versions of Python. The build issues are fixed, and all tests pass.
@MarkChuCarroll apologies for the delay, just catching back up on the pex PR backlog now - will try to take a look at this tomorrow.
thanks for the PR @MarkChuCarroll! looks pretty good to me. handful of smaller things and maybe one issue - then we can expedite to release.
pex/pex_builder.py
Outdated
# into an importable shape. We can do that by installing it into its own
# wheel dir.
if not self.interpreter.supports_wheel_install():
  print("*** Wheel dependencies may not work correctly with Python 2.6.", file=sys.stderr)
it'd be great to avoid printing directly to std descriptors due to the unexpected output effect on downstream consumers like e.g. pants..
how about using the logging module - or at a minimum the warnings module instead? ideally with a one liner message.
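A minimal sketch of the `warnings`-based variant suggested here. `warn_if_wheel_install_unsupported` is a hypothetical helper, and its boolean argument stands in for the result of `self.interpreter.supports_wheel_install()` from the diff above.

```python
import warnings


def warn_if_wheel_install_unsupported(supports_wheel_install):
    """Emit a one-line warning via the warnings module instead of printing to stderr."""
    if not supports_wheel_install:
        warnings.warn("Wheel dependencies may not work correctly with Python 2.6.")
        return True
    return False
```

Routing the message through `warnings` lets downstream consumers filter, capture, or escalate it with the standard warning filters rather than having to scrape stderr.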
pex/pex_builder.py
Outdated
@@ -253,7 +256,48 @@ def _add_dist_dir(self, path, dist_name):
self._copy_or_link(filename, target)
return CacheHelper.dir_hash(path)

def _get_installer_paths(self, dist_name, base):
dist_name appears unused here.
pex/finders.py
Outdated
dist.get_resource_string('', script_path).replace(b'\r\n', b'\n').replace(b'\r', b'\n'))
# This can get called in different contexts; in some, it looks for files in the
# wheel archives being used to produce a pex; in others, it looks for files in the
# install wheel directory included in the pex. So we need to look at both locations.
looks like this was the only caller of the safer_name function above, so would kill that along with the removal of its usage.
pex/pex_builder.py
Outdated
if self.interpreter.supports_wheel_install() and dist_name.endswith("whl"):
  from wheel.install import WheelFile
  try:
    tmp = tempfile.mkdtemp()
use of pex.common.safe_mkdtemp is preferred.
pex/pex_builder.py
Outdated
self._chroot.copy(fullpath, target)
finally:
shutil.rmtree(tmp)
return CacheHelper.dir_hash(whltmp)
won't the parent dir (tmp) of whltmp be removed before dir_hash(whltmp) is called?
assuming this was a bug and since this wasn't caught by existing tests, adding some extra test coverage here might be appropriate if you're up for it.
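To illustrate the ordering concern, here is a sketch in which the hash is computed inside the `try` block, before the `finally` clause removes the temporary tree. `dir_hash` below is a simplified stand-in written for this example and is not `CacheHelper.dir_hash`; `install_and_hash` and its `populate` callback are likewise hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def dir_hash(path):
    """Stand-in directory hash: digest of relative file names and file contents."""
    digest = hashlib.sha1()
    for root, dirs, files in os.walk(path):
        dirs.sort()
        for name in sorted(files):
            full = os.path.join(root, name)
            digest.update(os.path.relpath(full, path).encode("utf-8"))
            with open(full, "rb") as fp:
                digest.update(fp.read())
    return digest.hexdigest()


def install_and_hash(populate):
    """Hash the temporary tree before the finally clause deletes it."""
    tmp = tempfile.mkdtemp()
    try:
        whltmp = os.path.join(tmp, "wheel")
        os.mkdir(whltmp)
        populate(whltmp)
        return dir_hash(whltmp)  # evaluated while whltmp still exists
    finally:
        shutil.rmtree(tmp)
```

With this stand-in, hashing a directory that has already been removed does not raise; it quietly yields the digest of an empty tree, because `os.walk` ignores listing errors by default. That kind of silent result is one plausible way an ordering bug like this could slip past existing tests, so a regression test would want to compare against the hash of a known non-empty tree.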
lgtm!
thanks a ton for the PR @MarkChuCarroll - this will go out in the 1.2.8 release tracked in #389
the pex version this went out in is now being consumed in pants master @ pantsbuild/pants@cdb4e5e
Great work!