Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unhandled AttributeError during pex bootstrapping with PEX_PATH #598

Closed
kwlzn opened this issue Oct 12, 2018 · 9 comments · Fixed by #613
Closed

unhandled AttributeError during pex bootstrapping with PEX_PATH #598

kwlzn opened this issue Oct 12, 2018 · 9 comments · Fixed by #613
Assignees
Labels

Comments

@kwlzn
Copy link
Contributor

kwlzn commented Oct 12, 2018

I don't have a minimal repro yet, but in some internal code that uses PEX_PATH with two other pex files that contain namespace packages we seem to be seeing the following unhandled exception on startup:

Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 364, in execute
  File ".bootstrap/_pex/pex.py", line 90, in _activate
  File ".bootstrap/_pex/environment.py", line 147, in activate
  File ".bootstrap/_pex/environment.py", line 198, in _activate
  File ".bootstrap/_pex/environment.py", line 64, in update_module_paths
AttributeError: '_NamespacePath' object has no attribute 'insert'

I think this points to a setuptools version problem, but I've ensured uniform alignment across setuptools==33.1.1 in all 3 involved pex files + build-time interpreter with no luck.

@jsirois
Copy link
Member

jsirois commented Oct 16, 2018

This is not a setuptools symbol, its a python 3 symbol via https://www.python.org/dev/peps/pep-0420/
On my machine, for example I find:

$ find /usr/lib/python{2.7,3.4,3.5,3.6,3.7}/ -type f -name "*.py" | xargs grep "class _NamespacePath"
/usr/lib/python3.4/importlib/_bootstrap.py:class _NamespacePath:
/usr/lib/python3.5/importlib/_bootstrap_external.py:class _NamespacePath:
/usr/lib/python3.6/importlib/_bootstrap_external.py:class _NamespacePath:
/usr/lib/python3.7/importlib/_bootstrap_external.py:class _NamespacePath:

These things become the __path__ value only when python3-style implicit namespace packages are detected (intervening dirs with no __init__.py): https://www.python.org/dev/peps/pep-0420/
For normal packages, __path__ is a plain old list - which has insert.

So... I think you must be running:

  1. Under python3
  2. In a project that purposefully (or inadvertantly?!) uses implicit namespace packages.

@jsirois jsirois self-assigned this Oct 16, 2018
@jsirois
Copy link
Member

jsirois commented Oct 18, 2018

Alright - this seems to be right and works with both classic pkg_resources namespace packages and python 3.3+ implicit _NamespacePath namespace packages (which support path iteration):

@@ -53,15 +53,19 @@ class PEXEnvironment(Environment):
     return explode_dir
 
   @classmethod
-  def update_module_paths(cls, new_code_path):
+  def update_module_paths(cls, pex_file, explode_dir):
     # Force subsequent imports to come from the .pex directory rather than the .pex file.
-    TRACER.log('Adding to the head of sys.path: %s' % new_code_path)
-    sys.path.insert(0, new_code_path)
-    for name, module in sys.modules.items():
-      if hasattr(module, "__path__"):
-        module_dir = os.path.join(new_code_path, *name.split("."))
-        TRACER.log('Adding to the head of %s.__path__: %s' % (module.__name__, module_dir))
-        module.__path__.insert(0, module_dir)
+    TRACER.log('Adding exploded pex dir to the head of sys.path: %s' % explode_dir)
+    sys.path.insert(0, explode_dir)
+    pex_path = os.path.realpath(pex_file)
+    for name, module in list(sys.modules.items()):
+      if hasattr(module, '__path__'):
+        for item in module.__path__:
+          if os.path.realpath(item).startswith(pex_path):
+            TRACER.log('Un-importing module %s initially imported from within zipped pex %s'
+                       % (name, pex_path), V=2)
+            sys.modules.pop(name)
+            break
 
   @classmethod
   def write_zipped_internal_cache(cls, pex, pex_info):
@@ -203,7 +207,8 @@ class PEXEnvironment(Environment):
     self.update_candidate_distributions(self.load_internal_cache(self._pex, self._pex_info))
 
     if not self._pex_info.zip_safe and os.path.isfile(self._pex):
-      self.update_module_paths(self.force_local(self._pex, self._pex_info))
+      explode_dir = self.force_local(pex_file=self._pex, pex_info=self._pex_info)
+      self.update_module_paths(pex_file=self._pex, explode_dir=explode_dir)
 
     all_reqs = [Requirement.parse(req) for req in self._pex_info.requirements]
 

I'm working on a unit test...

@jsirois
Copy link
Member

jsirois commented Oct 19, 2018

As a small piece of the test puzzle, I have confirmed that a wheel built from:

  1. An old style pkg_resources namespace tree
  2. Using a setup.py that declares namespace_packages

Generates a *nspkg.pth file that is imported on dist activation, from which new style _NamespacePath __path__'s are seeded despite the underlying namespace packages that will-be-loaded later at pex entrypoint handoff being old style.

@kwlzn
Copy link
Contributor Author

kwlzn commented Oct 20, 2018

sorry for the late response - been swamped this week and just noticed your reply.

So... I think you must be running:

Under python3
In a project that purposefully (or inadvertantly?!) uses implicit namespace packages.

indeed we're running under python3 and for the latter definitely inadvertently - I saw this choke on both google and mpl_toolkits related handling non-deterministically.

here's a snippit w/ debug logging (sorry, I meant to include this when circling back w/ a repro):

pex: PEX is not zip safe, exploding to /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0
pex: Adding to the head of sys.path: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0
pex: Adding to the head of encodings.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/encodings
pex: Adding to the head of collections.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/collections
pex: Adding to the head of importlib.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/importlib
pex: Adding to the head of _pex.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/_pex
pex: Adding to the head of ctypes.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/ctypes
pex: Adding to the head of ctypes.macholib.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/ctypes/macholib
pex: Adding to the head of urllib.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/urllib
pex: Adding to the head of email.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/email
pex: Adding to the head of http.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/http
pex: Adding to the head of pkg_resources.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/pkg_resources
pex: Adding to the head of xml.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/xml
pex: Adding to the head of xml.parsers.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/xml/parsers
pex: Adding to the head of pkg_resources.extern.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/pkg_resources/extern
pex: Adding to the head of pkg_resources._vendor.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/pkg_resources/_vendor
pex: Adding to the head of pkg_resources._vendor.packaging.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/pkg_resources/extern/packaging
pex: Adding to the head of distutils.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/distutils
pex: Adding to the head of logging.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/logging
pex: Adding to the head of json.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/json
pex: Adding to the head of mpl_toolkits.__path__: /Users/kwilson/.pex/code/3abed9b0e11a4a550f42f3de7c47d7987a659cd0/mpl_toolkits
Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 364, in execute
  File ".bootstrap/_pex/pex.py", line 90, in _activate
  File ".bootstrap/_pex/environment.py", line 147, in activate
  File ".bootstrap/_pex/environment.py", line 198, in _activate
  File ".bootstrap/_pex/environment.py", line 64, in update_module_paths
AttributeError: '_NamespacePath' object has no attribute 'insert'

the only reference to mpl_toolkits I found comes from matplotlib - which declares a conditional implicit namespace package: https://github.com/matplotlib/matplotlib/blob/2eb26ee356d908f5eb31282e077c264ff759be76/lib/mpl_toolkits/__init__.py

@kwlzn
Copy link
Contributor Author

kwlzn commented Oct 20, 2018

here's a minimal repro (using an older pex36 release, but should still be helpful):

[omerta show]$ pex36 matplotlib==2.0.2 -o matplotlib.pex --not-zip-safe
[omerta show]$ pex36 requests -o requests.pex --not-zip-safe
[omerta show]$ pex36 flask -o flask.pex --not-zip-safe
[omerta show]$ PEX_PATH=./matplotlib.pex:./requests.pex ./flask.pex
Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 359, in execute
  File ".bootstrap/_pex/pex.py", line 87, in _activate
  File ".bootstrap/_pex/environment.py", line 135, in activate
  File ".bootstrap/_pex/environment.py", line 182, in _activate
  File ".bootstrap/_pex/environment.py", line 63, in update_module_paths
AttributeError: '_NamespacePath' object has no attribute 'insert'

the odd bit to me here is the exact positional requirement of the repro - passing only a singular, seemingly problematic pex on PEX_PATH does not repro:

[omerta show]$ PEX_PATH=./matplotlib.pex ./flask.pex
Python 3.6.0 (default, Feb 13 2017, 11:02:15) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> 
now exiting InteractiveConsole...

nor does reordering:

[omerta show]$ PEX_PATH=./requests.pex:./matplotlib.pex ./flask.pex
Python 3.6.0 (default, Feb 13 2017, 11:02:15) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> 
now exiting InteractiveConsole...

nor does unspecifying --not-zip-safe:

[omerta show]$ pex36 matplotlib==2.0.2 -o matplotlib.pex
[omerta show]$ pex36 requests -o requests.pex
[omerta show]$ pex36 flask -o flask.pex
[omerta show]$ PEX_PATH=./matplotlib.pex:./requests.pex ./flask.pex
Python 3.6.0 (default, Feb 13 2017, 11:02:15) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> 
now exiting InteractiveConsole...

(or doing that to any 1-2 of the 3..all 3 must be marked zip_safe=False fwict to repro)

@jsirois
Copy link
Member

jsirois commented Oct 20, 2018

Thanks Kris! Yeah, so all that is needed:

  1. runtime is python3.5+
  2. an item on the pex path contains a distribution with namespace packages
  3. a later item on the pex path is not zip-safe

Item 1 puts implicit namespace packages in play
Item 2 gets _NamespacePath namespace package __path__'s installed in one or more sys.modules - and this doesn't actually require implicit namespaces, just a dist-info with namespace_packages.txt.
Item 3 triggers the pex explode code that assumes __path__ is a list with an insert method.

We have:

$ unzip -qc matplotlib-2.0.2-cp36-cp36m-manylinux1_x86_64.whl matplotlib-2.0.2.dist-info/namespace_packages.txt
mpl_toolkits
mpl_toolkits

As such, the example can be shrunk to:

$ ./dist/pex36 matplotlib==2.0.2 -o matplotlib.pex
$ ./dist/pex36 requests -o requests.pex --not-zip-safe
$ PEX_PATH=requests.pex ./matplotlib.pex 
Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 349, in execute
  File ".bootstrap/_pex/pex.py", line 90, in _activate
  File ".bootstrap/_pex/environment.py", line 174, in activate
  File ".bootstrap/_pex/environment.py", line 229, in _activate
  File ".bootstrap/_pex/environment.py", line 86, in update_module_paths
AttributeError: '_NamespacePath' object has no attribute 'insert'

@jsirois
Copy link
Member

jsirois commented Oct 20, 2018

Super compact version of all this that puts the key factors outlined above in play:

$ ./dist/pex36 twitter.common.lang==0.3.9 -o tcl.pex
$ PEX_FORCE_LOCAL=1 PEX_PATH=tcl.pex ./tcl.pex 
Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 349, in execute
  File ".bootstrap/_pex/pex.py", line 90, in _activate
  File ".bootstrap/_pex/environment.py", line 174, in activate
  File ".bootstrap/_pex/environment.py", line 229, in _activate
  File ".bootstrap/_pex/environment.py", line 86, in update_module_paths
AttributeError: '_NamespacePath' object has no attribute 'insert'

@jsirois
Copy link
Member

jsirois commented Oct 21, 2018

Noting some problems that make it hard to test the failing code, whose intent is to ensure any namespace package with components internal to a zipped pex are subsequently imported from the exploded zip contents:

  1. load_internal_cache, which is called 1st in the PEXEnvironment activation process, unpacks all pex .deps/ to disk, leaving only the loose source in a zipped pex o be __path__ adjusted by the original code.
  2. Loose code added to a pex (via PEXBuilder) w/o __init__.py has them automatically added by _prepare_inits - as such new-style implicit namespace packages in PEX loose sources are impossible today.
  3. There is a bug in PEXEnvironment activation where new style implicit _NamespacePath namespace packages are wiped out by pkg_resources.

I do have a fix for 3 though and this allows a test for the case the broken code was attempting to handle. That case is legit! Without a fix to the broken code in this issue's backtrace (and a fix to 3) otherwise legitimate pex loose source code with namespaces that is not zip safe is in fact broken by pex.

Fixes coming...

@jsirois
Copy link
Member

jsirois commented Oct 23, 2018

A further issue here is varying behavior in setuptools (pkg_resources) handling PEP-420 namespace packages. At the low end of our current compatibility range for setuptools, pkg_resources blows up for similar reasons, expecting __path__ to support sort. Later, it ignores existing __path__ items that are _NamespacePath, nuking them. In more recent versions, it interacts gracefully with them. This may force #607 as a pre-requisite for sanity sake.

jsirois added a commit to jsirois/pex that referenced this issue Oct 29, 2018
Previously, pex would blow up trying to adjust import paths if a PEP-420
implicit namespace package was encountered. Add a test of the path
adjustment functionality both to confirm it was needed (it was) and that
the fix technique works with all forms of namespace packages by only
assuming they support `append`.

Fixes pex-tool#598
This was referenced Oct 29, 2018
kwlzn added a commit that referenced this issue Nov 15, 2018
cosmicexplorer added a commit to pantsbuild/pants that referenced this issue Nov 16, 2018
### Problem

See pex-tool/pex#623 for the pex release ticket. This release incorporates two changes, one of which (pex-tool/pex#599) was necessary to fix the issue in pex-tool/pex#598.

### Solution

Bump the pex version to `1.5.3`.
cosmicexplorer added a commit to pantsbuild/pants that referenced this issue Nov 16, 2018
See pex-tool/pex#623 for the pex release ticket. This release incorporates two changes, one of which (pex-tool/pex#599) was necessary to fix the issue in pex-tool/pex#598.

Bump the pex version to `1.5.3`.
jsirois added a commit to jsirois/pex that referenced this issue Nov 20, 2018
Previously, pex would blow up trying to adjust import paths if a PEP-420
implicit namespace package was encountered. Add a test of the path
adjustment functionality both to confirm it was needed (it was) and that
the fix technique works with all forms of namespace packages by only
assuming they support `append`.

Fixes pex-tool#598
jsirois added a commit to jsirois/pex that referenced this issue Nov 26, 2018
Previously, pex would blow up trying to adjust import paths if a PEP-420
implicit namespace package was encountered. Add a test of the path
adjustment functionality both to confirm it was needed (it was) and that
the fix technique works with all forms of namespace packages by only
assuming they support `append`.

Fixes pex-tool#598
jsirois added a commit to jsirois/pex that referenced this issue Nov 26, 2018
Previously, pex would blow up trying to adjust import paths if a PEP-420
implicit namespace package was encountered. Add a test of the path
adjustment functionality both to confirm it was needed (it was) and that
the fix technique works with all forms of namespace packages by only
assuming they support `append`.

To support all this, work around a pypy zipimporter bug:
https://bitbucket.org/pypy/pypy/issues/1686 and, as a side-effect,
eliminate our CI workaround for the pypy unit test shard.

Fixes pex-tool#598
Fixes pex-tool#497
jsirois added a commit that referenced this issue Nov 28, 2018
Previously, pex would blow up trying to adjust import paths if a PEP-420
implicit namespace package was encountered. Add a test of the path
adjustment functionality both to confirm it was needed (it was) and that
the fix technique works with all forms of namespace packages by only
assuming they support `append`.

To support all this, work around a pypy zipimporter bug:
https://bitbucket.org/pypy/pypy/issues/1686 and, as a side-effect,
eliminate our CI workaround for the pypy unit test shard.

Fixes #598
Fixes #497
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants