
[internal] Add experimental per-tool lockfiles with Docformatter as an example #12346

Merged · 6 commits · Jul 14, 2021
1 change: 1 addition & 0 deletions pants.toml
@@ -77,6 +77,7 @@ interpreter_constraints = [">=3.7,<3.10"]

[docformatter]
args = ["--wrap-summaries=100", "--wrap-descriptions=100"]
experimental_lockfile = "src/python/pants/backend/python/lint/docformatter/lockfile.txt"
Member commented:

Bit of a tangent (this needs to be nested more deeply anyway, because we re-generate it when we change the default), but: there are lots of potential conventions to be formed, and something like 3rdparty/lock/${resolve_name}.*.txt is likely to make the most sense for end users.

For performance reasons, end users are really going to want to share resolves within the repo rather than lean into a pattern where every single binary has its own unique resolve. Since we're not anticipating "hundreds" of them (a handful, most likely... at most a dozen?), centralizing them seems likely to encourage that.
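A minimal sketch of the centralized-path convention this comment proposes (the helper name and the exact `.lock.txt` suffix are assumptions for illustration, not part of the PR):

```python
def conventional_lockfile_path(resolve_name: str) -> str:
    """Hypothetical helper: map a resolve name to the proposed
    3rdparty/lock/${resolve_name}.*.txt convention."""
    return f"3rdparty/lock/{resolve_name}.lock.txt"
```

For example, `conventional_lockfile_path("docformatter")` yields `3rdparty/lock/docformatter.lock.txt`, keeping all resolves in one directory so that sharing them is the path of least resistance.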


[flake8]
config = "build-support/flake8/.flake8"
143 changes: 135 additions & 8 deletions src/python/pants/backend/experimental/python/lockfile.py
@@ -4,18 +4,20 @@
from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import cast

from pants.backend.python.subsystems.python_tool_base import PythonToolBase
from pants.backend.python.target_types import ConsoleScript, PythonRequirementsField
from pants.backend.python.util_rules.interpreter_constraints import InterpreterConstraints
from pants.backend.python.util_rules.pex import PexRequest, PexRequirements, VenvPex, VenvPexProcess
from pants.engine.addresses import Addresses
from pants.engine.fs import CreateDigest, Digest, FileContent, Workspace
from pants.engine.fs import CreateDigest, Digest, FileContent, MergeDigests, Workspace
from pants.engine.goal import Goal, GoalSubsystem
from pants.engine.process import ProcessResult
from pants.engine.rules import Get, MultiGet, collect_rules, goal_rule
from pants.engine.rules import Get, MultiGet, collect_rules, goal_rule, rule
from pants.engine.target import TransitiveTargets, TransitiveTargetsRequest
from pants.engine.unions import UnionMembership, union
from pants.python.python_setup import PythonSetup
from pants.util.strutil import pluralize

@@ -46,6 +48,11 @@ def lockfile_dest(self) -> str:
return cast(str, self.options.lockfile_dest)


# --------------------------------------------------------------------------------------
# User lockfiles
# --------------------------------------------------------------------------------------


class LockSubsystem(GoalSubsystem):
name = "lock"
help = "Generate a lockfile."
@@ -104,8 +111,12 @@ async def generate_lockfile(
return LockGoal(exit_code=0)

input_requirements_get = Get(
Digest, CreateDigest([FileContent("requirements.in", "\n".join(reqs).encode())])
Digest, CreateDigest([FileContent("requirements.in", "\n".join(reqs.req_strings).encode())])
)
# TODO: Figure out which interpreter constraints to use...Likely get it from the
# transitive closure. When we're doing a single global lockfile, it's fine to do that,
# but we need to figure out how this will work with multiple resolves.
Comment on lines +116 to +118
Member commented:

The same set of @rules that requests a resolve will eventually also be triggering the side effect of creating the lockfile. It has all of the context needed: exactly which constraints/platforms to use, etc.

But that actually suggests an interesting middle ground. Perhaps what we could do (until that codepath can trigger its own side effect via #12014) would be to have the warning/error that we render when a lockfile is stale actually render the exact command to run to regenerate it. Something like:

[ERROR]: The resolve for ${named_resolve} (at $lockfile) was stale. To lock the requirements, please run:
    ./pants lock --interpreter-constraints="${constraints_requested_in_relative_codepath}" $lockfile

...and then when we're able to do it automatically via #12014, we can change the default from warn/error to "regenerate".
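The suggested message could be rendered by something like the following sketch (function name and message shape are assumptions mirroring the [ERROR] example above, not the eventual implementation):

```python
def stale_lockfile_message(resolve_name: str, lockfile: str, constraints: str) -> str:
    # Render the exact regeneration command so the user can copy-paste it,
    # as proposed in the comment above.
    return (
        f"The resolve for {resolve_name} (at {lockfile}) was stale. "
        "To lock the requirements, please run:\n"
        f'    ./pants lock --interpreter-constraints="{constraints}" {lockfile}'
    )
```

This keeps the "how do I fix it" knowledge in the codepath that already knows the constraints, rather than requiring a separate goal.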

Member commented:

...or it could even specify the entire set of input requirements as an argument...

Contributor (author) commented:

That makes sense. I'm envisioning the same thing.

stuhood (Member) commented on Jul 14, 2021:

It's not blocking feedback, but this would pretty fundamentally affect the design, and is the reason I didn't immediately shipit: with this design, you don't need a separate tool-lock command: instead, when you run pants without a lockfile specified, it will warn/error with the precise command to run to generate the lockfile that will make the warning go away.

If you think you want to land this as is and follow up, that's fine though.

Contributor (author) commented:

Ah to be clear, I want the tool-lock goal to go away and be unified with the lock goal. tool-lock is only to make forward progress.

interpreter_constraints = InterpreterConstraints(python_setup.interpreter_constraints)
# TODO(#12314): Figure out named_caches for pip-tools. The best would be to share the cache
# between Pex and Pip. Next best is a dedicated named_cache.
pip_compile_get = Get(
@@ -114,11 +125,11 @@
output_filename="pip_compile.pex",
internal_only=True,
requirements=PexRequirements(pip_tools_subsystem.all_requirements),
# TODO: Figure out which interpreter constraints to use...Likely get it from the
# transitive closure. When we're doing a single global lockfile, it's fine to do that,
# but we need to figure out how this will work with multiple resolves.
interpreter_constraints=InterpreterConstraints(python_setup.interpreter_constraints),
interpreter_constraints=interpreter_constraints,
main=pip_tools_subsystem.main,
description=(
f"Building pip_compile.pex with interpreter constraints: {interpreter_constraints}"
),
),
)
input_requirements, pip_compile = await MultiGet(input_requirements_get, pip_compile_get)
@@ -129,7 +140,8 @@ async def generate_lockfile(
VenvPexProcess(
pip_compile,
description=(
f"Generate lockfile for {pluralize(len(reqs), 'requirements')}: {', '.join(reqs)}"
f"Generate lockfile for {pluralize(len(reqs.req_strings), 'requirements')}: "
f"{', '.join(reqs.req_strings)}"
),
# TODO(#12314): Wire up all the pip options like indexes.
argv=[
@@ -152,5 +164,120 @@ async def generate_lockfile(
return LockGoal(exit_code=0)


# --------------------------------------------------------------------------------------
# Tool lockfiles
# --------------------------------------------------------------------------------------


@dataclass(frozen=True)
class PythonToolLockfile:
digest: Digest
tool_name: str
path: str


@union
class PythonToolLockfileSentinel:
pass


@dataclass(frozen=True)
class PythonToolLockfileRequest:
tool_name: str
lockfile_path: str
requirements: tuple[str, ...]
interpreter_constraints: tuple[str, ...]


# TODO(#12314): Unify this goal with `lock` once we figure out how to unify the semantics,
# particularly w/ CLI specs. This is a separate goal only to facilitate progress.
class ToolLockSubsystem(GoalSubsystem):
name = "tool-lock"
help = "Generate a lockfile for a Python tool."
required_union_implementations = (PythonToolLockfileSentinel,)


class ToolLockGoal(Goal):
subsystem_cls = ToolLockSubsystem


@rule
async def generate_tool_lockfile(
request: PythonToolLockfileRequest, pip_tools_subsystem: PipToolsSubsystem
) -> PythonToolLockfile:
input_requirements_get = Get(
Digest,
CreateDigest(
[
FileContent(
f"requirements_{request.tool_name}.in", "\n".join(request.requirements).encode()
)
]
),
)
interpreter_constraints = InterpreterConstraints(request.interpreter_constraints)
pip_compile_get = Get(
VenvPex,
PexRequest(
output_filename="pip_compile.pex",
internal_only=True,
requirements=PexRequirements(pip_tools_subsystem.all_requirements),
interpreter_constraints=interpreter_constraints,
main=pip_tools_subsystem.main,
description=(
f"Building pip_compile.pex with interpreter constraints: {interpreter_constraints}"
),
),
)
input_requirements, pip_compile = await MultiGet(input_requirements_get, pip_compile_get)

result = await Get(
ProcessResult,
VenvPexProcess(
pip_compile,
description=f"Generate lockfile for {request.tool_name}",
argv=[
f"requirements_{request.tool_name}.in",
"--generate-hashes",
f"--output-file={request.lockfile_path}",
# NB: This allows pinning setuptools et al, which we must do. This will become
# the default in a future version of pip-tools.
"--allow-unsafe",
],
input_digest=input_requirements,
output_files=(request.lockfile_path,),
),
)

return PythonToolLockfile(result.output_digest, request.tool_name, request.lockfile_path)
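The pip-compile invocation in this rule can be summarized as a standalone helper (a sketch with assumed names; the argv mirrors the `VenvPexProcess` above but is not part of the diff):

```python
def pip_compile_argv(tool_name: str, lockfile_path: str) -> list:
    # Same flags as the rule above: hashed pins for reproducibility, an
    # explicit output file, and --allow-unsafe so setuptools et al can be
    # pinned (the future pip-tools default).
    return [
        f"requirements_{tool_name}.in",
        "--generate-hashes",
        f"--output-file={lockfile_path}",
        "--allow-unsafe",
    ]
```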


@goal_rule
async def generate_all_tool_lockfiles(
workspace: Workspace,
union_membership: UnionMembership,
) -> ToolLockGoal:
# TODO(#12314): Add logic to inspect the Specs and generate for only relevant lockfiles. For
# now, we generate for all tools.
requests = await MultiGet(
Get(PythonToolLockfileRequest, PythonToolLockfileSentinel, sentinel())
for sentinel in union_membership.get(PythonToolLockfileSentinel)
)
if not requests:
return ToolLockGoal(exit_code=0)

results = await MultiGet(
Get(PythonToolLockfile, PythonToolLockfileRequest, req)
for req in requests
if req.lockfile_path not in {"<none>", "<default>"}
)
merged_digest = await Get(Digest, MergeDigests(res.digest for res in results))
workspace.write_digest(merged_digest)
for result in results:
logger.info(f"Wrote lockfile for {result.tool_name} to {result.path}")

return ToolLockGoal(exit_code=0)


def rules():
return collect_rules()
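The goal rule above generates lockfiles only for real paths, skipping the sentinel values. As a pure-function sketch (name assumed) of that filter:

```python
def lockfile_paths_to_generate(paths: list) -> list:
    # "<none>" (tool opted out) and "<default>" (lockfile shipped with Pants)
    # are sentinels, not workspace paths, so no lockfile is written for them.
    return [p for p in paths if p not in {"<none>", "<default>"}]
```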
2 changes: 1 addition & 1 deletion src/python/pants/backend/python/goals/setup_py.py
@@ -695,7 +695,7 @@ async def get_requirements(
for tgt in direct_deps_with_excl
if tgt.has_field(PythonRequirementsField)
)
req_strs = list(reqs)
req_strs = list(reqs.req_strings)

# Add the requirements on any exported targets on which we depend.
kwargs_for_exported_targets_we_depend_on = await MultiGet(
3 changes: 2 additions & 1 deletion src/python/pants/backend/python/lint/docformatter/BUILD
@@ -1,5 +1,6 @@
# Copyright 2020 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

python_library()
python_library(dependencies=[":lockfile"])
python_tests(name="tests", timeout=120)
resources(name="lockfile", sources=["lockfile.txt"])
12 changes: 12 additions & 0 deletions src/python/pants/backend/python/lint/docformatter/lockfile.txt
@@ -0,0 +1,12 @@
#
# This file is autogenerated by pip-compile with python 3.6
# To update, run:
#
# pip-compile --allow-unsafe --generate-hashes --output-file=src/python/pants/backend/python/lint/docformatter/lockfile.txt requirements_docformatter.in
#
docformatter==1.4 \
--hash=sha256:064e6d81f04ac96bc0d176cbaae953a0332482b22d3ad70d47c8a7f2732eef6f
# via -r requirements_docformatter.in
untokenize==0.1.1 \
--hash=sha256:3865dbbbb8efb4bb5eaa72f1be7f3e0be00ea8b7f125c69cbd1f5fda926f37a2
# via docformatter
4 changes: 2 additions & 2 deletions src/python/pants/backend/python/lint/docformatter/register.py
@@ -8,9 +8,9 @@
"""

from pants.backend.python.lint import python_fmt
from pants.backend.python.lint.docformatter import skip_field
from pants.backend.python.lint.docformatter import skip_field, subsystem
from pants.backend.python.lint.docformatter.rules import rules as docformatter_rules


def rules():
return (*docformatter_rules(), *python_fmt.rules(), *skip_field.rules())
return (*docformatter_rules(), *python_fmt.rules(), *skip_field.rules(), *subsystem.rules())
32 changes: 25 additions & 7 deletions src/python/pants/backend/python/lint/docformatter/rules.py
@@ -1,6 +1,7 @@
# Copyright 2020 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

import importlib.resources
from dataclasses import dataclass
from typing import Tuple

@@ -14,7 +15,7 @@
from pants.core.goals.fmt import FmtResult
from pants.core.goals.lint import LintRequest, LintResult, LintResults
from pants.core.util_rules.source_files import SourceFiles, SourceFilesRequest
from pants.engine.fs import Digest
from pants.engine.fs import Digest, FileContent
from pants.engine.process import FallibleProcessResult, Process, ProcessResult
from pants.engine.rules import Get, MultiGet, collect_rules, rule
from pants.engine.target import FieldSet, Target
@@ -58,23 +59,40 @@ def generate_args(

@rule(level=LogLevel.DEBUG)
async def setup_docformatter(setup_request: SetupRequest, docformatter: Docformatter) -> Setup:
docformatter_pex_request = Get(
if docformatter.lockfile == "<none>":
requirements = PexRequirements(docformatter.all_requirements)
elif docformatter.lockfile == "<default>":
requirements = PexRequirements(
file_content=FileContent(
"docformatter_default_lockfile.txt",
importlib.resources.read_binary(
"pants.backend.python.lint.docformatter", "lockfile.txt"
),
)
)
else:
requirements = PexRequirements(
file_path=docformatter.lockfile,
file_path_description_of_origin="the option `[docformatter].experimental_lockfile`",
)
Comment on lines +62 to +77
Member commented:

Rather than being something that a tool should explicitly request, this feels like an argument that should be passed down via PexRequest. Maybe a Resolve(name=.., file=.., default=..) parameter, which PexRequest would switch on behavior with? An end-user resolve would have no default, etc.

Contributor (author) commented:

Maybe! I want to be careful not to factor too much until we have better insight into what this will all look like.
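The three-way switch in `setup_docformatter` above can be sketched as a pure function with the default-lockfile loader injected (all names here are illustrative; the real code returns `PexRequirements` and reads the packaged resource via `importlib.resources`):

```python
from typing import Callable, Optional, Tuple

def choose_requirements_source(
    lockfile_option: str, load_default: Callable[[], bytes]
) -> Tuple[str, Optional[bytes]]:
    # Mirrors the branch above: "<none>" resolves from raw requirement
    # strings, "<default>" loads the lockfile packaged with Pants, and any
    # other value is treated as a user-provided file path.
    if lockfile_option == "<none>":
        return ("raw_requirements", None)
    if lockfile_option == "<default>":
        return ("default_lockfile", load_default())
    return ("user_lockfile", lockfile_option.encode())
```

Injecting the loader keeps the branching testable without packaging a real resource.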


docformatter_pex_get = Get(
VenvPex,
PexRequest(
output_filename="docformatter.pex",
internal_only=True,
requirements=PexRequirements(docformatter.all_requirements),
interpreter_constraints=InterpreterConstraints(docformatter.interpreter_constraints),
main=docformatter.main,
description="Build docformatter.pex",
requirements=requirements,
is_lockfile=docformatter.lockfile != "<none>",
),
)

source_files_request = Get(
source_files_get = Get(
SourceFiles,
SourceFilesRequest(field_set.sources for field_set in setup_request.request.field_sets),
)

source_files, docformatter_pex = await MultiGet(source_files_request, docformatter_pex_request)
source_files, docformatter_pex = await MultiGet(source_files_get, docformatter_pex_get)

source_files_snapshot = (
source_files.snapshot
50 changes: 50 additions & 0 deletions src/python/pants/backend/python/lint/docformatter/subsystem.py
@@ -3,8 +3,14 @@

from typing import Tuple, cast

from pants.backend.experimental.python.lockfile import (
PythonToolLockfileRequest,
PythonToolLockfileSentinel,
)
from pants.backend.python.subsystems.python_tool_base import PythonToolBase
from pants.backend.python.target_types import ConsoleScript
from pants.engine.rules import collect_rules, rule
from pants.engine.unions import UnionRule
from pants.option.custom_types import shell_str


@@ -38,6 +44,26 @@ def register_options(cls, register):
f'`--{cls.options_scope}-args="--wrap-summaries=100 --pre-summary-newline"`.'
),
)
register(
"--experimental-lockfile",
type=str,
default="<none>",
advanced=True,
help=(
"Path to a lockfile used for installing the tool.\n\n"
"Set to the string '<default>' to use a lockfile provided by "
Contributor (author) commented:

Should this be a different special string, like <builtin>?

"Pants, so long as you have not changed the `--version`, `--extra-requirements`, "
"and `--interpreter-constraints` options. See {} for the default lockfile "
Contributor (author) commented:

This will be a GitHub URL once I add a util to generate that in a follow up.

Member commented:

Another option would be to literally load the resource as the default value of a field. That would allow you to both set it and override it in the option, without putting it on disk...? Not sure if that's actually great, but.

Contributor (author) commented:

> and override it in the option

Yeah I thought about that, but didn't like it from an ergonomics perspective:

  • these files can be huge. No way you'll feasibly set it in CLI or env var, and setting in pants.toml is clunky.
  • Pollutes ./pants help because the value can be so big.

Instead, I think John's idea of embedding a resource is a good way to get it To Just Work by default, while still being flexible to overriding.

"contents.\n\n"
"Set to the string '<none>' to opt out of using a lockfile. We do not recommend "
"this, as lockfiles are essential for reproducible builds.\n\n"
"To use a custom lockfile, set this option to a file path relative to the build "
"root, then activate the backend_package `pants.backend.experimental.python` and "
"run `./pants tool-lock`.\n\n"
"This option is experimental and will likely change. It does not follow the normal "
"deprecation cycle."
),
)

@property
def skip(self) -> bool:
@@ -46,3 +72,27 @@ def skip(self) -> bool:
@property
def args(self) -> Tuple[str, ...]:
return tuple(self.options.args)

@property
def lockfile(self) -> str:
return cast(str, self.options.experimental_lockfile)


class DocformatterLockfileSentinel(PythonToolLockfileSentinel):
pass


@rule
def setup_lockfile_request(
_: DocformatterLockfileSentinel, docformatter: Docformatter
) -> PythonToolLockfileRequest:
return PythonToolLockfileRequest(
tool_name=docformatter.options_scope,
lockfile_path=docformatter.lockfile,
requirements=docformatter.all_requirements,
interpreter_constraints=docformatter.interpreter_constraints,
)


def rules():
return (*collect_rules(), UnionRule(PythonToolLockfileSentinel, DocformatterLockfileSentinel))
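The sentinel/`UnionRule` pattern used here can be illustrated outside the engine with a toy registry (entirely hypothetical names; the real engine dispatches through its rule graph, not a module-level list):

```python
class ToyLockfileSentinel:
    """Toy stand-in for the engine's @union base class."""

_union_members: list = []

def toy_union_rule(member: type) -> type:
    # Registering a subclass is roughly what UnionRule(sentinel, member)
    # accomplishes declaratively.
    _union_members.append(member)
    return member

@toy_union_rule
class ToyDocformatterSentinel(ToyLockfileSentinel):
    pass

def registered_sentinels() -> list:
    # The tool-lock goal iterates union members like this, requesting one
    # PythonToolLockfileRequest per registered tool.
    return [m.__name__ for m in _union_members]
```

Each tool backend contributes its own sentinel, so the goal discovers tools without hard-coding them.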
4 changes: 3 additions & 1 deletion src/python/pants/backend/python/lint/pylint/rules.py
@@ -133,7 +133,9 @@ async def pylint_lint_partition(partition: PylintPartition, pylint: Pylint) -> L
PexRequest(
output_filename="pylint.pex",
internal_only=True,
requirements=PexRequirements([*pylint.all_requirements, *plugin_requirements]),
requirements=PexRequirements(
[*pylint.all_requirements, *plugin_requirements.req_strings]
),
interpreter_constraints=partition.interpreter_constraints,
),
)