Support Python requirement target addrs in tool requirements. (#19014)
Currently users can enumerate `requirements` for a Python tool, which
causes Pants to use that requirement subset from the tool's lockfile.
This supports resolving a tool from a larger lockfile, when that makes sense.
If no requirements are provided, the entire lockfile is used.

Two examples of when resolving a tool from a subset of a larger lockfile
is useful:
1. Resolving `pytest` as a tool at the same version as its runtime
  library without specifying it in two places.
2. Having a single lockfile for all tools, and user code, to be consumed
  by VSCode.

However, the value of this `requirements` option is typically redundant
with the input requirements used to generate the lockfile. 

This change allows `requirements` to specify addresses (or address
specs). This can be used to avoid the redundancy: You can point to some
python_requirement targets that are a subset of all those used to
generate the lockfile.

The implementation turns out to be reasonably neat: it just pushes
the code that turns addresses into requirement strings to slightly
later in the pex creation logic, which actually allowed us to reduce some
redundancy and trim some code.

This fixes #18816.
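The change above hinges on telling the two kinds of `requirements` entries apart before resolving them. A minimal sketch of that partitioning step (a hypothetical helper for illustration, not the actual rule code this commit adds to `pex.py`):

```python
def partition_requirements(entries: list[str]) -> tuple[list[str], list[str]]:
    """Split `requirements` entries into address specs and pip requirement strings.

    A leading "//" marks a target address spec; this distinguishes addresses
    from local or VCS pip requirements, which can also contain slashes.
    """
    specs: list[str] = []
    req_strings: list[str] = []
    for entry in entries:
        if entry.startswith("//"):
            specs.append(entry)
        else:
            req_strings.append(entry)
    return specs, req_strings


specs, reqs = partition_requirements(["pytest==7.3.1", "//3rdparty/python:pytest"])
```

In the real rule, the address specs are then resolved to targets, and the transitive `python_requirement` dependencies are flattened back into requirement strings for Pex.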
benjyw authored May 17, 2023
1 parent 1026adc commit 9d845f0
Showing 11 changed files with 186 additions and 68 deletions.
26 changes: 17 additions & 9 deletions docs/markdown/Python/python/python-lockfiles.md
@@ -183,20 +183,22 @@ It is strongly recommended that these tools be installed from a hermetic lockfil

The only time you need to think about this is if you want to customize the tool requirements that Pants uses. This might be the case if you want to modify the version of a tool or add extra requirements (for example, tool plugins).

If you want a tool to be installed from some resolve, instead of from the built-in lockfile, you set `install_from_resolve` on the tool's config section:
If you want a tool to be installed from some resolve, instead of from the built-in lockfile, you set `install_from_resolve` and `requirements` on the tool's config section:

```toml pants.toml
[python.resolves]
pytest = "3rdparty/python/pytest.lock"

[pytest]
install_from_resolve = "pytest"
install_from_resolve = "pytest" # Use this resolve's lockfile.
requirements = ["//3rdparty/python:pytest"] # Use these requirements from the lockfile.
```

Then set up the resolve's inputs:

```python 3rdparty/python/BUILD
python_requirements(
name="pytest",
source="pytest-requirements.txt",
resolve="pytest",
)
@@ -221,6 +223,12 @@ $ pants generate-lockfiles --resolve=pytest

Note that some tools, such as Flake8 and Bandit, must run on a Python interpreter that is compatible with the code they operate on. In this case you must ensure that the [interpreter constraints](#interpreter-constraints) for the tool's resolve are the same as those for the code in question.

### Invalidating tool lockfiles

Pants will verify that any requirements set in the `requirements` option are provided by the lockfile specified by `install_from_resolve`, and will error if not. This lets you ensure that you don't inadvertently use an older version of a tool if you update its requirements but forget to regenerate the lockfile.

The `requirements` option can either list requirement strings, such as `pytest==7.3.1`, or target addresses, such as `//3rdparty/python:pytest` (the `//` prefix tells Pants that these are target addresses). The latter is particularly useful because it lets you avoid specifying the requirements redundantly in two places: the target can serve both as an input to the lockfile generator and as the requirements to verify.

Pants will only use the given `requirements` from the lockfile. If you don't set `requirements`, Pants will use the entire lockfile, and won't validate that it provides the desired tool at the desired version.
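At its core, this verification is a name-level membership check against the lockfile's contents. A rough sketch of the idea (a hypothetical helper; the real check lives in Pants's lockfile metadata validation and is more careful about versions, extras, and markers):

```python
import re


def validate_against_lockfile(consumed_reqs: list[str], locked_projects: set[str]) -> None:
    """Raise if any requirement's project name is absent from the lockfile.

    consumed_reqs: pip-style strings, e.g. ["pytest==7.3.1"].
    locked_projects: the set of project names the lockfile provides.
    """
    missing = []
    for req in consumed_reqs:
        # Take the project name before any extras or version specifier.
        name = re.split(r"[\[<>=!~; ]", req, maxsplit=1)[0]
        if name not in locked_projects:
            missing.append(req)
    if missing:
        raise ValueError(f"lockfile does not provide: {', '.join(missing)}")


validate_against_lockfile(["pytest==7.3.1"], {"pytest", "pytest-cov"})  # passes
```

If you bump `pytest` in the `requirements` option but the lockfile still only provides an older pin, a check along these lines is what surfaces the mismatch instead of silently running the stale version.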

### Sharing lockfiles between tools and code

In some cases a tool also provides a runtime library. For example, `pytest` is run as a tool in a subprocess, but your tests can also `import pytest` to access testing functionality.
@@ -232,20 +240,20 @@ Rather than repeat the same requirement in two different resolves, you can point
install_from_resolve = "python-default"
```

Of course, you have to make sure that this resolve does in fact provide appropriate versions of the tool.
Of course, you have to ensure that this resolve does in fact provide appropriate versions of the tool.

By default, Pants will use the entire lockfile when installing the tool. This may cause Pants to invalidate the tool, and therefore its outputs, due to unrelated changes in the lockfile. You can instead enumerate a subset of requirements to be installed from a larger lockfile using the `requirements` option:
As above, you will want to point `requirements` to the subset of targets representing the tool's requirements, so that Pants can verify that the resolve provides them, and can use just the needed subset without unnecessary invalidation:

```toml pants.toml
[pytest]
install_from_resolve = "python-default"

requirements = [
"pytest",
"pytest-cov",
"pytest-xdist",
"pytest-myplugin",
"ipdb",
"//3rdparty/python#pytest",
"//3rdparty/python#pytest-cov",
"//3rdparty/python#pytest-xdist",
"//3rdparty/python#pytest-myplugin",
"//3rdparty/python#ipdb",
]
```

4 changes: 3 additions & 1 deletion pants.toml
@@ -167,9 +167,9 @@ args = ["--wrap-summaries=100", "--wrap-descriptions=100"]

[flake8]
config = "build-support/flake8/.flake8"

source_plugins = ["build-support/flake8"]
install_from_resolve = "flake8"
requirements = ["//3rdparty/python:flake8"]

[shellcheck]
args = ["--external-sources"]
@@ -182,6 +182,7 @@ args = ["-i 2", "-ci", "-sr"]
args = ["--no-header"]
execution_slot_var = "TEST_EXECUTION_SLOT"
install_from_resolve = "pytest"
requirements = ["//3rdparty/python:pytest"]

[test]
extra_env_vars = [
@@ -199,6 +200,7 @@ timeout_default = 60
[mypy]
interpreter_constraints = [">=3.7,<3.10"]
install_from_resolve = "mypy"
requirements = ["//3rdparty/python:mypy"]


[coverage-py]
14 changes: 12 additions & 2 deletions src/python/pants/backend/python/subsystems/python_tool_base.py
@@ -91,6 +91,14 @@ class PythonToolRequirementsBase(Subsystem):
If install_from_resolve is specified, install these requirements,
at the versions provided by the specified resolve's lockfile.
Values can be pip-style requirements (e.g., `tool` or `tool==1.2.3` or `tool>=1.2.3`),
or addresses of python_requirement targets (or targets that generate or depend on
python_requirement targets).
The lockfile will be validated against the requirements - if a lockfile doesn't
provide the requirement (at a suitable version, if the requirement specifies version
constraints) Pants will error.
If unspecified, install the entire lockfile.
"""
),
@@ -214,17 +222,19 @@ def pex_requirements(
If the tool supports lockfiles, the returned type will install from the lockfile rather than
`all_requirements`.
"""
description_of_origin = f"the requirements of the `{self.options_scope}` tool"
if self.install_from_resolve:
use_entire_lockfile = not self.requirements
return PexRequirements(
(*self.requirements, *extra_requirements),
from_superset=Resolve(self.install_from_resolve, use_entire_lockfile),
description_of_origin=description_of_origin,
)

requirements = (*self.all_requirements, *extra_requirements)

if not self.uses_lockfile:
return PexRequirements(requirements)
return PexRequirements(requirements, description_of_origin=description_of_origin)

hex_digest = calculate_invalidation_digest(requirements)

@@ -344,7 +354,7 @@ def to_pex_request(
return PexRequest(
output_filename=f"{self.options_scope.replace('-', '_')}.pex",
internal_only=True,
requirements=self.pex_requirements(extra_requirements=extra_requirements),
requirements=requirements,
interpreter_constraints=interpreter_constraints,
main=main,
sources=sources,
@@ -26,4 +26,4 @@ def test_install_from_resolve_default() -> None:
pex_reqs = tool.pex_requirements()
assert isinstance(pex_reqs, PexRequirements)
assert pex_reqs.from_superset == Resolve("dummy_resolve", False)
assert pex_reqs.req_strings == FrozenOrderedSet(["bar", "baz", "foo"])
assert pex_reqs.req_strings_or_addrs == FrozenOrderedSet(["bar", "baz", "foo"])
2 changes: 1 addition & 1 deletion src/python/pants/backend/python/util_rules/BUILD
@@ -4,7 +4,7 @@
python_sources(
overrides={
"local_dists_pep660.py": dict(dependencies=["./scripts/pep660_backend_wrapper.py"]),
}
},
)

python_tests(
98 changes: 80 additions & 18 deletions src/python/pants/backend/python/util_rules/pex.py
@@ -11,7 +11,7 @@
from dataclasses import dataclass
from pathlib import PurePath
from textwrap import dedent # noqa: PNT20
from typing import Iterable, Iterator, Mapping, TypeVar
from typing import Iterable, Iterator, Mapping, Sequence, TypeVar

import packaging.specifiers
import packaging.version
@@ -24,6 +24,7 @@
PexLayout,
)
from pants.backend.python.target_types import PexPlatformsField as PythonPlatformsField
from pants.backend.python.target_types import PythonRequirementsField
from pants.backend.python.util_rules import pex_cli, pex_requirements
from pants.backend.python.util_rules.interpreter_constraints import InterpreterConstraints
from pants.backend.python.util_rules.pex_cli import PexCliProcess, PexPEX
@@ -48,10 +49,11 @@
ResolvePexConfigRequest,
validate_metadata,
)
from pants.build_graph.address import Address
from pants.core.target_types import FileSourceField
from pants.core.util_rules.environments import EnvironmentTarget
from pants.core.util_rules.system_binaries import BashBinary
from pants.engine.addresses import UnparsedAddressInputs
from pants.engine.addresses import Addresses, UnparsedAddressInputs
from pants.engine.collection import Collection, DeduplicatedCollection
from pants.engine.engine_aware import EngineAwareParameter
from pants.engine.environment import EnvironmentName
@@ -60,7 +62,14 @@
from pants.engine.internals.selectors import MultiGet
from pants.engine.process import Process, ProcessCacheScope, ProcessResult
from pants.engine.rules import Get, collect_rules, rule
from pants.engine.target import HydratedSources, HydrateSourcesRequest, SourcesField, Targets
from pants.engine.target import (
HydratedSources,
HydrateSourcesRequest,
SourcesField,
Targets,
TransitiveTargets,
TransitiveTargetsRequest,
)
from pants.engine.unions import UnionMembership, union
from pants.util.frozendict import FrozenDict
from pants.util.logging import LogLevel
@@ -453,6 +462,49 @@ class _BuildPexRequirementsSetup:
concurrency_available: int


@dataclass(frozen=True)
class ReqStrings:
    req_strings: tuple[str, ...]


@rule
async def get_req_strings(pex_reqs: PexRequirements) -> ReqStrings:
    addrs: list[Address] = []
    specs: list[str] = []
    req_strings: list[str] = []
    for req_str_or_addr in pex_reqs.req_strings_or_addrs:
        if isinstance(req_str_or_addr, Address):
            addrs.append(req_str_or_addr)
        else:
            assert isinstance(req_str_or_addr, str)
            # Require a `//` prefix, to distinguish address specs from
            # local or VCS requirements.
            if req_str_or_addr.startswith(os.path.sep * 2):
                specs.append(req_str_or_addr)
            else:
                req_strings.append(req_str_or_addr)
    if specs:
        addrs_from_specs = await Get(
            Addresses,
            UnparsedAddressInputs(
                specs,
                owning_address=None,
                description_of_origin=pex_reqs.description_of_origin,
            ),
        )
        addrs.extend(addrs_from_specs)
    if addrs:
        transitive_targets = await Get(TransitiveTargets, TransitiveTargetsRequest(addrs))
        req_strings.extend(
            PexRequirements.req_strings_from_requirement_fields(
                tgt[PythonRequirementsField]
                for tgt in transitive_targets.closure
                if tgt.has_field(PythonRequirementsField)
            )
        )
    return ReqStrings(tuple(sorted(req_strings)))


async def _setup_pex_requirements(
request: PexRequest, python_setup: PythonSetup
) -> _BuildPexRequirementsSetup:
@@ -509,16 +561,19 @@ async def _setup_pex_requirements(
[loaded_lockfile.lockfile_digest], argv, loaded_lockfile.requirement_estimate
)

assert isinstance(request.requirements, PexRequirements)
req_strings = (await Get(ReqStrings, PexRequirements, request.requirements)).req_strings

# TODO: This is not the best heuristic for available concurrency, since the
# requirements almost certainly have transitive deps which also need building, but it
# is better than using something hardcoded.
concurrency_available = len(request.requirements.req_strings)
concurrency_available = len(req_strings)

if isinstance(request.requirements.from_superset, Pex):
repository_pex = request.requirements.from_superset
return _BuildPexRequirementsSetup(
[repository_pex.digest],
[*request.requirements.req_strings, "--pex-repository", repository_pex.name],
[*req_strings, "--pex-repository", repository_pex.name],
concurrency_available,
)

@@ -528,15 +583,15 @@

# NB: This is also validated in the constructor.
assert loaded_lockfile.is_pex_native
if not request.requirements.req_strings:
if not req_strings:
return _BuildPexRequirementsSetup([], [], concurrency_available)

if loaded_lockfile.metadata:
validate_metadata(
loaded_lockfile.metadata,
request.interpreter_constraints,
loaded_lockfile.original_lockfile,
consumed_req_strings=request.requirements.req_strings,
consumed_req_strings=req_strings,
# Don't validate user requirements when subsetting a resolve, as Pex's
# validation during the subsetting is far more precise than our naive string
# comparison. For example, if a lockfile was generated with `foo==1.2.3`
@@ -550,7 +605,7 @@
return _BuildPexRequirementsSetup(
[loaded_lockfile.lockfile_digest],
[
*request.requirements.req_strings,
*req_strings,
"--lock",
loaded_lockfile.lockfile_path,
*pex_lock_resolver_args,
Expand All @@ -560,7 +615,7 @@ async def _setup_pex_requirements(

# We use pip to perform a normal resolve.
digests = []
argv = [*request.requirements.req_strings, *pip_resolver_args]
argv = [*req_strings, *pip_resolver_args]
if request.requirements.constraints_strings:
constraints_file = "__constraints.txt"
constraints_content = "\n".join(request.requirements.constraints_strings)
@@ -657,13 +712,18 @@ async def build_pex(
else:
output_directories = [request.output_filename]

req_strings = (
(await Get(ReqStrings, PexRequirements, request.requirements)).req_strings
if isinstance(request.requirements, PexRequirements)
else []
)
result = await Get(
ProcessResult,
PexCliProcess(
subcommand=(),
extra_args=argv,
additional_input_digest=merged_digest,
description=_build_pex_description(request, python_setup.resolves),
description=await _build_pex_description(request, req_strings, python_setup.resolves),
output_files=output_files,
output_directories=output_directories,
concurrency_available=requirements_setup.concurrency_available,
@@ -692,23 +752,25 @@
)


def _build_pex_description(request: PexRequest, resolve_to_lockfile: Mapping[str, str]) -> str:
async def _build_pex_description(
request: PexRequest, req_strings: Sequence[str], resolve_to_lockfile: Mapping[str, str]
) -> str:
if request.description:
return request.description

if isinstance(request.requirements, EntireLockfile):
lockfile = request.requirements.lockfile
desc_suffix = f"from {lockfile.url}"
else:
if not request.requirements.req_strings:
if not req_strings:
return f"Building {request.output_filename}"
elif isinstance(request.requirements.from_superset, Pex):
repo_pex = request.requirements.from_superset.name
return softwrap(
f"""
Extracting {pluralize(len(request.requirements.req_strings), 'requirement')}
Extracting {pluralize(len(req_strings), 'requirement')}
to build {request.output_filename} from {repo_pex}:
{', '.join(request.requirements.req_strings)}
{', '.join(req_strings)}
"""
)
elif isinstance(request.requirements.from_superset, Resolve):
Expand All @@ -718,16 +780,16 @@ def _build_pex_description(request: PexRequest, resolve_to_lockfile: Mapping[str
lockfile_path = resolve_to_lockfile.get(request.requirements.from_superset.name, "")
return softwrap(
f"""
Building {pluralize(len(request.requirements.req_strings), 'requirement')}
Building {pluralize(len(req_strings), 'requirement')}
for {request.output_filename} from the {lockfile_path} resolve:
{', '.join(request.requirements.req_strings)}
{', '.join(req_strings)}
"""
)
else:
desc_suffix = softwrap(
f"""
with {pluralize(len(request.requirements.req_strings), 'requirement')}:
{', '.join(request.requirements.req_strings)}
with {pluralize(len(req_strings), 'requirement')}:
{', '.join(req_strings)}
"""
)
return f"Building {request.output_filename} {desc_suffix}"
