docs: export dependency graph as adjacency list
AlexTereshenkov committed Feb 17, 2024
1 parent d782c03 commit 752cfaf
Showing 1 changed file with 66 additions and 0 deletions.
66 changes: 66 additions & 0 deletions docs/docs/using-pants/project-introspection.mdx
@@ -127,6 +127,72 @@ To include the original target itself, use `--closed`:
helloworld/main.py:lib
```

## Export dependency graph

Both the `dependencies` and `dependents` goals accept a `--format` option that selects the output format.
With `--format=json`, the output is the [adjacency list](https://en.wikipedia.org/wiki/Adjacency_list)
of your dependency graph: each key is a target address, and its value lists that target's dependencies (or dependents):

```bash
$ pants dependencies --format=json \
helloworld/greet/greeting.py \
helloworld/translator/translator_test.py

{
    "helloworld/greet/greeting.py:lib": [
        "//:reqs#setuptools",
        "//:reqs#types-setuptools",
        "helloworld/greet:translations",
        "helloworld/translator/translator.py:lib"
    ],
    "helloworld/translator/translator_test.py:tests": [
        "//:reqs#pytest",
        "helloworld/translator/translator.py:lib"
    ]
}
```
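
Because the export is a plain mapping of addresses to lists, it is easy to reshape in a few lines of code. As a minimal sketch, the snippet below loads the sample output shown above and inverts it, turning the dependencies view into a dependents view. The `invert` helper is a hypothetical name used for illustration and is not part of Pants.

```python
import json
from collections import defaultdict

# Sample output copied from the `pants dependencies --format=json` example above.
exported = json.loads("""
{
    "helloworld/greet/greeting.py:lib": [
        "//:reqs#setuptools",
        "//:reqs#types-setuptools",
        "helloworld/greet:translations",
        "helloworld/translator/translator.py:lib"
    ],
    "helloworld/translator/translator_test.py:tests": [
        "//:reqs#pytest",
        "helloworld/translator/translator.py:lib"
    ]
}
""")


def invert(graph):
    """Turn a {target: [dependencies]} mapping into {dependency: [dependents]}."""
    inverted = defaultdict(list)
    for target, dependencies in graph.items():
        for dependency in dependencies:
            inverted[dependency].append(target)
    # Sort each list so the result is stable regardless of input order.
    return {address: sorted(targets) for address, targets in inverted.items()}


dependents = invert(exported)
print(dependents["helloworld/translator/translator.py:lib"])
```

For the sample data, this prints both targets that depend on `helloworld/translator/translator.py:lib`. Note that the inverted view only covers the targets present in the export, so it matches `pants dependents` only when the export spans the whole repository.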

You can analyze, visualize, or further process the exported data. A fairly straightforward `jq` query is
often enough, but for anything more complex, it may make sense to write a small program
to process the exported graph. For instance, you could:

* find tests with the most transitive dependencies

```bash
$ pants dependencies --transitive --filter-target-type=python_test --format=json :: \
| jq -r 'to_entries[] | "\(.key)\t\(.value | length)"' \
| sort -k2 -n
```

* find build targets that no one depends on

```bash
$ pants dependents --filter-target-type=resource --format=json :: \
| jq -r 'to_entries[] | select(.value | length == 0)'
```

* find project source files that the most tests transitively depend on

```python
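# depgraph.py
import json

# data.json holds the output of `pants dependents --transitive --format=json`.
with open("data.json") as fh:
    data = json.load(fh)

# For each source, count the transitive dependents that live under tests/.
for source, dependents in data.items():
    print(source, len([d for d in dependents if d.startswith("tests/")]))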
```

```bash
$ pants dependents --transitive --format=json cheeseshop:: > data.json
$ python3 depgraph.py | sort -k2 -n
```

For more sophisticated graph querying, you may want to look into graph libraries such as [`networkx`](https://networkx.org/).
In a larger repository, it may make sense to track the health of the dependency graph and use the output
of the graph export to identify segments that would benefit from refactoring.
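
Before adopting a graph library, a plain traversal with the standard library may already answer simple questions. The following is a minimal sketch, assuming `graph` has the same shape as the direct (non-transitive) JSON export. The sample addresses and the `transitive_dependencies` helper are hypothetical and are not part of Pants.

```python
from collections import deque

# Hypothetical direct-dependency adjacency list, shaped like the JSON export.
graph = {
    "app/main.py": ["app/greet.py", "//:reqs#requests"],
    "app/greet.py": ["app/translate.py"],
    "app/translate.py": [],
}


def transitive_dependencies(graph, start):
    """Breadth-first walk collecting every address reachable from `start`."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        address = queue.popleft()
        if address in seen:
            continue
        seen.add(address)
        # Addresses missing from the export (for example third-party
        # requirements) are treated as leaves.
        queue.extend(graph.get(address, []))
    # If the graph contains a cycle, do not count the start as its own dependency.
    seen.discard(start)
    return seen


for target in graph:
    print(target, len(transitive_dependencies(graph, target)))
```

Tracking these counts over time is one inexpensive way to notice targets whose transitive closure keeps growing.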

## `filedeps` - find which files a target owns

`filedeps` outputs all of the files belonging to a target, based on its `sources` field.