Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pants python thrift building allows no control over the default thrift requirement. #2533

Closed
jsirois opened this issue Nov 10, 2015 · 7 comments

Comments

@jsirois
Copy link
Contributor

jsirois commented Nov 10, 2015

Due to the way the pex resolver works today (like pip, in breadth 1st rounds), the unconstrained thrift dependency pants' PythonChroot (and ThriftBuilder) inject in the 1st resolution round can easily conflict with thrift requirements in later (deeper transitive dep) rounds. The solution for the pex cli or in pip itself is to use the top-level requirements to adjust deps. With pants implicit thrift requirement injection - this option is off the table. As a result, we see errors like this one in apache aurora when trying to upgrade to modern pants/pex (0.0.57/1.1.0) - debugging prints added:

$ git log -1
commit ec61b8b6b054a413bfef22b61a08c77811f74746
Author: Maxim Khutornenko <[email protected]>
Date:   Fri Nov 6 16:47:02 2015 -0800

    Adding build target to build kerberos auth module.

    Reviewed at https://reviews.apache.org/r/40042/

 src/main/python/apache/aurora/kerberos/BUILD          | 11 +++++++++--
 src/main/python/apache/aurora/kerberos/auth_module.py |  2 +-
 2 files changed, 10 insertions(+), 3 deletions(-)
$ ./pants --print-exception-stacktrace clean-all test.pytest --no-fast src/test/python/::
>>> round 1, resolvables:
    ...
    twitter.common.recordio==0.3.3
    api.src.main.thrift.org.apache.thermos.thermos-3cf3cf68==0.0.0
    thrift
    pytest>=2.6,<2.7
    pytest-timeout
    pytest-cov>=1.8,<1.9
    unittest2
    ...
    resolvable set in round 1:
    ...
    _ResolvedPackages(resolvable=ResolvableRequirement(requirement=Requirement.parse('thrift')), packages=OrderedSet([WheelPackage(u'file:///home/jsirois/dev/3rdparty/aurora/.pants.d/python-setup/resolved_requirements/CPython-2.7.10/thrift-0.9.3-cp27-none-linux_x86_64.whl'), SourcePackage(u'https://pypi.python.org/packages/source/t/thrift/thrift-0.9.2.tar.gz#md5=91f1c224c46a257bb428431943387dfd'), SourcePackage(u'https://pypi.python.org/packages/source/t/thrift/thrift-0.9.1.tar.gz#md5=8989a8a96b0e3a3380cfb89c44e172a6'), SourcePackage(u'https://pypi.python.org/packages/source/t/thrift/thrift-0.9.0.tar.gz#md5=600ad4d76bc87d3e0c8570557d60ae65'), SourcePackage(u'https://pypi.python.org/packages/source/t/thrift/thrift-0.8.0.tar.gz#md5=edc7e5311e2ae5593cd6ac06fdab3257')]), parent=None)
>>> round 2, resolvables:
    ...
    thrift==0.9.1
    ...
    thrift
    ...

08:14:17 00:21   [complete]
               FAILURE

...
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.RVbKze/install/lib/python2.7/site-packages/pants/backend/python/python_chroot.py", line 226, in dump
    distributions = self._resolve_multi(reqs_to_build, find_links)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.RVbKze/install/lib/python2.7/site-packages/pants/backend/python/python_chroot.py", line 265, in _resolve_multi
    cache_ttl=self._python_setup.resolver_cache_ttl)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.RVbKze/install/lib/python2.7/site-packages/pex/resolver.py", line 352, in resolve
    return resolver.resolve(resolvables_from_iterable(requirements, builder))
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.RVbKze/install/lib/python2.7/site-packages/pex/resolver.py", line 200, in resolve
    raise self.Error('Ambiguous resolvable: %s' % resolvable)

Exception message: Ambiguous resolvable: thrift

And NB:

$ unzip -qc .pants.d/python-setup/resolved_requirements/CPython-2.7.10/twitter.common.recordio-0.3.3-py2-none-any.whl twitter.common.recordio-0.3.3.dist-info/metadata.json | python -mjson.tool | grep -A8 run_requires
    "run_requires": [
        {
            "requires": [
                "thrift (==0.9.1)",
                "twitter.common.log (==0.3.3)"
            ]
        }
    ],
    "version": "0.3.3",

The above is an example from apache aurora where the top-level req. is thrift injected by PythonChroot and that picks thrift 0.9.3. The second round of resolution includes the thrift==0.9.1 requirement from twitter.common.recordio==0.3.3 and this leads to failure since 0.9.3 does not satisfy ==0.9.1.

At a high level, the 2 thrift requirements considered post-facto should lead to a solution, thrift,thrift==0.9.1 -> thrift==0.9.1. The issue as explained above is the requirements are resolved in layers with no compatibility smarts at the pex layer and no backtracking to try alternate solutions from prior layers with the current layer. There is a TODO in the pex resolver code to consider backtracking, but this is both of questionable utility (it would be different behavior from pip - which is the behavior folks expect) and a not insignificant change. As a result pants should probably just provide better control of the injected thrift requirement, possibly using semver constraints as the basis for the default instead of a wide-open dep. The thrift utility it uses knows the thrift compiler version number after all.

@jsirois jsirois added the bug label Nov 10, 2015
@jsirois jsirois self-assigned this Nov 10, 2015
@jsirois
Copy link
Contributor Author

jsirois commented Nov 10, 2015

For reference - the Aurora issue blocked on this fix: https://issues.apache.org/jira/browse/AURORA-1499

@jsirois
Copy link
Contributor Author

jsirois commented Nov 10, 2015

The lowest-impact 1st step to free up Aurora and other users in a similar bind is to simply respect direct thrift deps on python_thrift_library targets when performing code-gen dep injection. If the target has an explicit thrift requirement dep, use it; otherwise, fall back to thrift.

A second phase can plumb an option to change the fallback from thrift globally to a repo-wide default similar to how we have injected dep defaults configurable via options for ApacheThriftGen and ScroogeGen.

@jsirois
Copy link
Contributor Author

jsirois commented Nov 10, 2015

Actually - the low-impact step already works, its just not obvious - depends on non-local properties of PythonChroot wrt PythonThriftBuilder.

@jsirois
Copy link
Contributor Author

jsirois commented Nov 10, 2015

Stopping progress on this for now since Aurora is unblocked.

@jsirois jsirois changed the title Pants python thrift building allows no control over the thrift requirement. Pants python thrift building allows no control over the default thrift requirement. Nov 10, 2015
@jsirois jsirois added enhancement and removed bug labels Nov 10, 2015
@jsirois jsirois removed their assignment Nov 10, 2015
mateor added a commit to mateor/commons that referenced this issue May 12, 2016
Twitter commons was using a wrapper function around its python
dependencies to create python_requirement_library targets.
The python_requirements library is able to understand a pip
requirements.txt file and convert each entry into the
python_requirement_library target needed by Pants.

The python_requirements target uses the project name as a target
name. The original  wrapper was maintained for a few dependencies
when their name clashed with Pants BUILD file conventions.

I also removed the easy_install thrift, it looks like it was
put in place to service aurora, who was later unblocked.

Let me know if you think it should be returned:
pantsbuild/pants#2533

The purpose of this change is to enable a simple interface to convert
some python dependencies from being hardcoded to a single rev into
being able to be satisfied by a release range. I will submit
any dependency version changes separately.
@kwlzn kwlzn added the python label Jul 13, 2016
@paulcavallaro
Copy link

paulcavallaro commented Jul 29, 2016

Not sure this is quite the same problem, but it looks like on master (and well before) there should be a way to specify the thrift version to use for generating thrift.

https://github.com/pantsbuild/pants/blob/master/src/python/pants/binaries/thrift_binary.py#L34-L36

I was hoping it would be possible to use that, but I don't seem to find anywhere that that can be specified from pants when doing. The following doesn't work in pants 1.0.1

./pants gen.thrift --version=0.9.3 python/auster:example

Any help would be much appreciated. Thanks!

Edit:

Answering my own question, following up on some Aurora tickets, looks like adding

[thrift-binary]
version: 0.9.3

to my pants.ini upgrade version of thrift binary used. Takes a pants clean-all to take hold though.

@ericzundel
Copy link
Member

ericzundel commented Jul 30, 2016

Just to follow up, when I am looking for options, I use

./pants --help-all --help-advanced >/tmp/help.txt

There are some options that aren't set directly on the task when they are shared across multiple tasks (we call these subsystems)

Looking through the help dump you'll find the docs on the settings in the thrift-binary subsystem:

thrift-binary recursive options:

--[no-]thrift-binary-colors (default: True)
    Set whether log messages are displayed in color.
--thrift-binary-l=<str>, --thrift-binary-level=<str> (one of: [debug, info, warn] default: 'info')
    Set the logging level.
--thrift-binary-q, --[no-]thrift-binary-quiet (default: False)
    Squelches most console output.

thrift-binary advanced options:

--thrift-binary-cache-key-gen-version=<str> (default: '200')
    The cache key generation. Bump this to invalidate every artifact for a
    scope.
--thrift-binary-exclude-target-regexp=<regexp> (--thrift-binary-exclude-target-regexp=<regexp>) ..., --thrift-binary-exclude-target-regexp="[<regexp>, <regexp>, ...]", --thrift-binary-exclude-target-regexp="+[<regexp>, <regexp>, ...]" (default: [])
    Exclude targets that match these regexes.
--[no-]thrift-binary-fail-fast (default: False)
    Exit as quickly as possible on error, rather than attempting to continue to
    process the non-erroneous subset of the input.
--thrift-binary-max-subprocess-args=<int> (default: 100)
    Used to limit the number of arguments passed to some subprocesses by
    breaking the command up into multiple invocations.
--thrift-binary-supportdir=<str> (default: 'bin/thrift')
    Find thrift binaries under this dir.   Used as part of the path to lookup
    thetool with --binary-util-baseurls and --pants-bootstrapdir
--thrift-binary-version=<str> (default: '0.9.2')
    Thrift compiler version.   Used as part of the path to lookup thetool with
    --binary-util-baseurls and --pants-bootstrapdir

@kwlzn
Copy link
Member

kwlzn commented Aug 9, 2016

@paulcavallaro that controls the thrift binary version for places where we execute that directly to generate code.

the resolve failure highlighted above is separate and relates to the implicit injection of the python thrift package at chroot creation time, which is a necessary runtime dependency for python + thrift code.

so afaict from a quick skim, this static requirement string the crux of the issue: https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/python/thrift_builder.py#L35-L37

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants