Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v2.py] ./pants run implementation can be problematic in the face of application re-execs #4682

Closed
kwlzn opened this issue Jun 16, 2017 · 6 comments

Comments

@kwlzn
Copy link
Member

kwlzn commented Jun 16, 2017

in the v2 python backend, the run task implementation now uses PEX_PATH to compose a pex environment at runtime.

we had a particular case in our monorepo where a user was attempting to use ./pants run in conjunction with a development server that would re-exec itself when e.g. files changed on disk. in the v1 backend, this worked fine as everything was self-contained in a singular pex that could be neatly re-executed with it's full environmental context intact. in the v2 backend, this mode of re-execution now breaks with import errors because the PEX_PATH composition context is not self-contained (i.e. after a re-exec, it may be missing the entire sources pex environment).

seems like we'll need to figure out a strategy to make self-contained re-execs possible again - and potentially also a flag to enable run to run directly from sources vs a composed pex to support the "live development" workflow. cc @benjyw in case he has any ideas on either of these.

@kwlzn kwlzn added the python label Jun 16, 2017
@benjyw
Copy link
Contributor

benjyw commented Jun 17, 2017

Can you walk me through an example of the problem? What does this development server do?

@kwlzn
Copy link
Member Author

kwlzn commented Jun 22, 2017

it's a thirdparty development webserver that reloads itself when it detects filesystem changes. looks like it's doing the equivalent of:

subprocess.call([sys.executable] + sys.argv, env=os.environ.copy())
sys.exit()

but it seems to be losing the PEX_PATH context somehow on re-exec as it throws import errors when restarting. perhaps being scrubbed out of the env?

@kwlzn
Copy link
Member Author

kwlzn commented Jun 22, 2017

I've pushed a minimal repro to https://github.com/twitter/pants/tree/kwlzn/4682_repro

here's what the repro looks like in this branch:

[omerta pants (kwlzn/4682_repro)]$ ./pants -q run examples/src/python/example/flask_autorestart
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
Traceback (most recent call last):
  File "/Users/kwilson/dev/pants/.pants.d/run/py/CPython-2.7.13/8ab1f9c30ea4ef7599bf82058060eaf373a68a35/.bootstrap/_pex/pex.py", line 360, in execute
    self._wrap_coverage(self._wrap_profiling, self._execute)
  File "/Users/kwilson/dev/pants/.pants.d/run/py/CPython-2.7.13/8ab1f9c30ea4ef7599bf82058060eaf373a68a35/.bootstrap/_pex/pex.py", line 288, in _wrap_coverage
    runner(*args)
  File "/Users/kwilson/dev/pants/.pants.d/run/py/CPython-2.7.13/8ab1f9c30ea4ef7599bf82058060eaf373a68a35/.bootstrap/_pex/pex.py", line 320, in _wrap_profiling
    runner(*args)
  File "/Users/kwilson/dev/pants/.pants.d/run/py/CPython-2.7.13/8ab1f9c30ea4ef7599bf82058060eaf373a68a35/.bootstrap/_pex/pex.py", line 403, in _execute
    return self.execute_entry(self._pex_info.entry_point)
  File "/Users/kwilson/dev/pants/.pants.d/run/py/CPython-2.7.13/8ab1f9c30ea4ef7599bf82058060eaf373a68a35/.bootstrap/_pex/pex.py", line 461, in execute_entry
    return runner(entry_point)
  File "/Users/kwilson/dev/pants/.pants.d/run/py/CPython-2.7.13/8ab1f9c30ea4ef7599bf82058060eaf373a68a35/.bootstrap/_pex/pex.py", line 466, in execute_module
    runpy.run_module(module_name, run_name='__main__')
  File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/runpy.py", line 170, in run_module
    mod_name, loader, code, fname = _get_module_details(mod_name)
  File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/runpy.py", line 101, in _get_module_details
    loader = get_loader(mod_name)
  File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/pkgutil.py", line 464, in get_loader
    return find_loader(fullname)
  File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/pkgutil.py", line 474, in find_loader
    for importer in iter_importers(fullname):
  File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/pkgutil.py", line 430, in iter_importers
    __import__(pkg)
ImportError: No module named example.flask_autorestart

FAILURE: /Users/kwilson/dev/pants/build-support/pants_dev_deps.venv/bin/python2.7 example.flask_autorestart.main  ... exited non-zero (1)

FAILURE

but note that if you binary and then run the resulting pex, it works fine:

[omerta pants (kwlzn/4682_repro)]$ ./pants -q binary examples/src/python/example/flask_autorestart
[omerta pants (kwlzn/4682_repro)]$ ./dist/flask_autorestart.pex
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
^C[omerta pants (kwlzn/4682_repro)]$

this points to an issue with the pex composition happening in run vs a monolithic pex in the face of a ~typical application restart.

@kwlzn
Copy link
Member Author

kwlzn commented Jul 7, 2017

just hit this same issue while working on a project related to Jupyter notebook + pex when trying to use ./pants run.

in order to spawn kernels (the server-side piece of jupyter execution), we make jupyter aware of the executing pex's environment and point to it with an alternate entrypoint for subprocess launching.

this works fine with ./pants binary but now breaks in exactly the same way as above when used with ./pants run (which is also significantly faster than the full binary rebuild for iteration).

@kwlzn
Copy link
Member Author

kwlzn commented Jul 7, 2017

I think this is related to the environment scrubbing in pex here: https://github.com/pantsbuild/pex/blob/6b3fbdf873a9effe3d8371b7a0939f6027e91ac4/pex/pex.py#L52

which seems to be the correct thing to do, lest we leak PEX_PATH into subprocess execution which could break subprocess execution of other pex files. so if we must scrub the environment here, one other way to skin this cat might be to elevate the PEX_PATH configuration from environment variable into the main pex's metadata (e.g. it's PEX-INFO). then it'd just re-bootstrap the chain as launched via ./pants run as part of any casual re-exec. this would likely be useful only for this specific case, but seems like the cleanest way I can think of.

kwlzn pushed a commit to pex-tool/pex that referenced this issue Sep 28, 2017
…o metadata (#417)

Problem:
Pex environments rely on the PEX_PATH environment variable to resolve modules from other pexes when composing them into a single pex. Python scripts that re-execute (like a server listening for fs changes, a Jupyter notebook) throw an ImportError upon re-exec due to environment scrubbing that removes this PEX_PATH information.

Solution:
Add a --pex-path argument to the pex client and plumb the data through to the pex-info metadata. Resolve pexes to compose into the current pex being built by reading from the new pex_path property of the pex-info metadata. This will ensure a self-contained environment that does not lose context on a python script re-exec.

Relates to: pantsbuild/pants#4682
@jsirois jsirois self-assigned this Jul 15, 2018
@jsirois
Copy link
Contributor

jsirois commented Jul 15, 2018

Oops - fixed by pex-tool/pex@5c663c8 and #4969

Just never closed.

@jsirois jsirois closed this as completed Jul 15, 2018
@jsirois jsirois removed their assignment Jul 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants