-
-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optionally validate entry point at build time #521
Conversation
Hi @kwlzn please kindly take an initial look for sanity. I am still working on tests, so will ping when it's fully ready. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be a useful feature but I think there are a few fundamental issues here to address.
pex/pex_builder.py
Outdated
@@ -238,6 +303,7 @@ def set_entry_point(self, entry_point): | |||
The entry point may also be specified via ``PEXBuilder.set_executable``. | |||
""" | |||
self._ensure_unfrozen('Setting an entry point') | |||
self._verify_entry_point(entry_point) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a builder, order of building is not guaranteed and you can't verify until after a freeze since not all requirements or files may yet be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any pointer for a better place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In freeze
, where you know there can be no more code or requirement additions.
pex/pex_builder.py
Outdated
.format(ep_method, file_candidates)) | ||
|
||
def _gather_file_candidates(self, entry_point_elements): | ||
# 1. Find the files matching a.b.c:m, which is a/b/c.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could also be a/b/c/__init__.py
, ... or be a dynamically generated module! I think you really have to try to import the symbol after a freeze to avoid all the corner cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Importing the code feels quite sketchy, even out of process, because they could have bad side effect or even security concerns.
How do you feel about putting verifying entry point behind an option, stating that it would only check for static method in
a/b/c.py
a/b/c/__init__.py
and allow people to skip static check by disabling it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If validation is opt-in, it seems like import would be the way to go. The user is going to run the pex after all and suffer the side-effects soon anyhow when they do so! Basically, this is forced to be a half measure one way or the other, so the choice comes down to avoiding side-effects at build time (they'll happen at runtime) for the X% of cases that have this pathology, or not supporting Y% of cases that have bizarre dynamic module generation. I'm really not sure what these percentages are or what is the better way to go. Import just seems attractive since it's robust: ie it's what pex does at runtime. Parsing is new logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's fair. Importing could also get around ast parsing difference between python 2 and 3.
So I am going to do the importing, and then put it behind an option with default disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsirois ready for a second round sanity check. test still incoming.
pex/pex_builder.py
Outdated
if bytecode_compile: | ||
self._precompile_source() | ||
self._frozen = True | ||
|
||
def build(self, filename, bytecode_compile=True): | ||
def build(self, filename, bytecode_compile=True, verify_entry_point=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the interface pants will call, correct?
changing it in a way to keep backwards compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but see above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but a cleanup suggestion re subprocess / Executor and a bigger change idea to consider.
pex/pex_builder.py
Outdated
else: | ||
raise self.InvalidEntryPoint("Fail to parse: `{}`".format(entry_point)) | ||
|
||
args = [self._interpreter.binary, '-c', import_statement] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Execution via subprocess was consolidated to Executor
, so this code should use that instead of subprocess
directly.
A bit more true to the runtime environment the entry point will be executed in would be something like:
with named_temporary_file() as fp:
fp.write(import_statement)
fp.close()
result = PEX(self.chroot().path()).run([fp.name], env={'PEX_INTERPRETER': '1'})
if result != 0:
raise self.InvalidEntryPoint('Failed to validate entrypoint {}'.format(entry_point))
This though unfortunately creates a circular PEX <-> PEXBuilder dep. Better I think to lift the check to the PEX constructor which could be plumbed both from pex.bin.pex and from API users (Pants).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the verification step into class PEX(object)
:
def __init__(self, ..., verify_entry_point=False):
pex/pex_builder.py
Outdated
if bytecode_compile: | ||
self._precompile_source() | ||
self._frozen = True | ||
|
||
def build(self, filename, bytecode_compile=True): | ||
def build(self, filename, bytecode_compile=True, verify_entry_point=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but see above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the one non-isort/non-style CI shard failure is legit but not related to this PR.
LGTM mod one latent bug pointed out re not using the builder's interpreter to perform the import check and completion of test work.
pex/pex.py
Outdated
ep_method = ep_split[1] | ||
import_statement = 'from {} import {}'.format(ep_module, ep_method) | ||
else: | ||
raise InvalidEntryPoint("Fail to parse: `{}`".format(entry_point)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failure or Failed would make more sense here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected.
pex/bin/pex.py
Outdated
@@ -741,7 +753,7 @@ def main(args=None): | |||
pex_builder.freeze() | |||
|
|||
log('Running PEX file at %s with args %s' % (pex_builder.path(), cmdline), v=options.verbosity) | |||
pex = PEX(pex_builder.path(), interpreter=pex_builder.interpreter) | |||
pex = PEX(pex_builder.path(), interpreter=pex_builder.interpreter, verify_entry_point=options.validate_ep) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>100
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected.
pex/bin/pex.py
Outdated
default=False, | ||
action='store_true', | ||
help='Validate the entry point by importing it in separate process. Warning: this could have ' | ||
'side effect. For example, entry point `a.b.c:m` will translate to ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'side effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected.
pex/bin/pex.py
Outdated
@@ -732,6 +741,9 @@ def main(args=None): | |||
tmp_name = options.pex_name + '~' | |||
safe_delete(tmp_name) | |||
pex_builder.build(tmp_name) | |||
if options.validate_ep: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could be more uniform, more correct and fail faster with:
pex_builder = build_pex(reqs, options, resolver_options_builder)
pex_builder.freeze()
pex = PEX(pex_builder.path(), interpreter=pex_builder.interpreter, verify_entry_point=options.validate_ep)
if options.pex_name is not None:
...
As things stand, in the options.pex_name branch, the pex is zipped up before the entrypoint validation which could be wasted work. Worse though is that the builder's custom interpreter, if any, is not used. This could be a problem if the current running interpreter does not match what the built pex needs and syntax differs (a 2/3 split).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved things up to fail fast.
pex/pex.py
Outdated
@@ -520,3 +522,31 @@ def run(self, args=(), with_chroot=False, blocking=True, setsid=False, **kwargs) | |||
stderr=kwargs.pop('stderr', None), | |||
**kwargs) | |||
return process.wait() if blocking else process | |||
|
|||
def do_entry_point_verification(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given comments above re uniformity in performing the validation, I'd make this _private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected.
looks like
executed the entry point (print hello) rather than
as suggested earlier. |
Updated. Tests incoming. |
Tests added. Ready for final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Yi!
There was no proncipled reason to use it. Neither the originating commit or its originiating issue refer to Pillow and the tests themselves don't implicate anything being required except a 3rdparty lib. See: + pex-tool#521 + pex-tool#508
Resolves #508
Problem
Current behavior allows building a pex that cannot execute:
Solution
Verify entry point at build time. E.g.
a.b.c:m
means we will try to dofrom a.b.c import m
in a separate process.Result