Export a lockfile in an ad-hoc JSON graph format. #2047

benjyw · 2023-02-05T00:43:15Z

It's useful to see the lockfile in graph form, for visualization, reasoning about how some transitive dep gets pulled in, etc.

This change emits a very simple JSON graph that other tools can process.

For example, this will turn a lockfile into a .png of a DOT graph:

echo "digraph {\
  $(pex lock export --format=graph lockfile.json  | \
  jq -r '.edges | to_entries[] | .key as $src | .value[] | "\"" + ($src) + "\" -> \"" + (.) + "\""') \
}"  | dot -Tpng

One use case is to allow Pants to display this data in the dependencies goal, which users have asked for, e.g., in pantsbuild/pants#12733

benjyw · 2023-02-05T00:52:52Z

This is a speculative change. I'm not convinced this captures all the nuances of lockfiles (e.g., right now it ignores extras, at the least it should probably list them in the vertex name). I want some feedback on the approach before beefing it up and writing tests.

jsirois

I really think Pants should define a model if not a format. Presumably, internally, it will map all resolve graph information into that model before rendering, allowing JVM, Python, etc backends to map to a model and the core graphing goal to do the rendering in 1 spot in a consistent way that even in the future allows for mixed-language graphs. Definition of that model would help ensure the Pex format defined here need not thrash, since Pants needs have been thought through and worked out as evidenced by the model.

jsirois · 2023-02-09T14:02:17Z

pex/cli/commands/lock.py

                    )
-                )
+        else:
+            vertices = set()


FYI: The current bit of Pex producing graphs is here: https://github.com/pantsbuild/pex/blob/97a2497e0938ece709310a03e7e41b5c26992952/pex/tools/commands/graph.py#L30-L83
That does take markers into account to good effect in the rendered graph, showing which requirements are not active in the present PEX being run by the present interpreter on the present machine. Of course the format is the dot adjacency list format and the side-band information about activation is stored in labels and colors.

If it's still the case that Pants doesn't want to define a format and Pex must my requirements are:

Embedding a version for the format: allow an escape hatch for backwards incompatible evolution (but with the intent to try hard to never actually use it)

Eventually merge the existing model / logic driving the tool example above with this: one kicks out dot, the other JSON.

jsirois · 2023-02-09T14:04:06Z

pex/resolve/locked_resolve.py

@@ -345,6 +345,7 @@ def create(
        target,  # type: Target
        direct_requirements,  # type: Iterable[Requirement]
        resolved_artifacts,  # type: Iterable[_ResolvedArtifact]
+        adjacency_list,  # type: Dict[Pin, Set[Pin]]


A LockedResolve already has all the information needed. There should be no need to add graph rendering-specific data or methods to it. I think this is just a sign a (stripped-down) model for exported graph data is needed.

If this is the case, we could derive the graph entirely in Pants, no? But it seemed to me (naively?) that this logic had to be here because request_resolve() is where we know who actually requires who, in the context of a target.

If this is the case, we could derive the graph entirely in Pants, no?

Yup. Just be ready to break / fix when Pex changes its private, undocumented format.

But it seemed to me (naively?) that this logic had to be here because request_resolve() is where we know who actually requires who, in the context of a target.

I think naively - maybe. So how will Pants use the graph info? If the lock is universal, which it is in Pants case, will ./pants dependencies ... include this info now and if so, what the heck will that pick for a target? Will it solve for a local interpreter that matches the constraints of all targets requested and use that if success and fail otherwise? If so and success, you already know the target up in Pants and can annotate the graph display with that info; i.e. this is the dependency graph when Python 3.7 is used, but the results may be different for 3.8 and 3.9 (and the repo is like Pants itself and supports >=3.7,<3.10).

I think Pants really ought to have what it wants to do sorted. The sorting can probably best be done today using the lock file json directly. Then, once sorted, a solid idea of the requirements for a stable PEX graph export format will be in hand and we can make sure it works for the not-pants cases, like non-universal locks with multiple resolves contained within and not just 1 universal resolve, etc.

jsirois · 2023-02-09T14:10:35Z

pex/cli/commands/lock.py

+                vertices.add(str(src_pin))
+                vertices.update(dst_pin_strs)
+                edges[str(src_pin)] = dst_pin_strs
+            graph = {


The single target selected above on 508-510 is controlled by command line flags like --python, --python-path, --interpreter-constraint, --platform and --complete-platform, but none of these need be present either. It seems like some node to store information about the rendered graph context would make sense; i.e.: for a universal lock, this would give the context for environment marker activation / deactivation assuming you include activation information in the graph and for a multi-lock lock file, it also indicates which lock is being talked about (say the CPython 3.7 locked resolve from a lock file containing 3 locks, 1 for 3.7, 1 for 3.8 and 1 for 3.9.

benjyw · 2023-02-10T05:20:06Z

Yeah, not a bad idea to make progress on Pants ingesting the data, to validate the format.

cosmicexplorer · 2023-07-24T18:44:23Z

Are you familiar with the pip install --report json output? I worked on that with @sbidoul (see pypa/pip#10748 and pypa/pip#10771). I have been planning to add that to pex for a while now.

Export a lockfile in an ad-hoc JSON graph format.

2271249

benjyw marked this pull request as draft February 5, 2023 00:43

benjyw requested a review from jsirois February 5, 2023 01:09

jsirois reviewed Feb 9, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Export a lockfile in an ad-hoc JSON graph format. #2047

Export a lockfile in an ad-hoc JSON graph format. #2047

benjyw commented Feb 5, 2023 •

edited

Loading

benjyw commented Feb 5, 2023

jsirois left a comment

jsirois Feb 9, 2023 •

edited

Loading

jsirois Feb 9, 2023

benjyw Feb 10, 2023

jsirois Feb 10, 2023 •

edited

Loading

jsirois Feb 9, 2023

benjyw commented Feb 10, 2023

cosmicexplorer commented Jul 24, 2023

Export a lockfile in an ad-hoc JSON graph format. #2047

Are you sure you want to change the base?

Export a lockfile in an ad-hoc JSON graph format. #2047

Conversation

benjyw commented Feb 5, 2023 • edited Loading

benjyw commented Feb 5, 2023

jsirois left a comment

Choose a reason for hiding this comment

jsirois Feb 9, 2023 • edited Loading

Choose a reason for hiding this comment

jsirois Feb 9, 2023

Choose a reason for hiding this comment

benjyw Feb 10, 2023

Choose a reason for hiding this comment

jsirois Feb 10, 2023 • edited Loading

Choose a reason for hiding this comment

jsirois Feb 9, 2023

Choose a reason for hiding this comment

benjyw commented Feb 10, 2023

cosmicexplorer commented Jul 24, 2023

benjyw commented Feb 5, 2023 •

edited

Loading

jsirois Feb 9, 2023 •

edited

Loading

jsirois Feb 10, 2023 •

edited

Loading