Aggregate Ruff configuration data.
See astral-sh/ruff#3365.
Do e.g. `pip install -e .` to install the package in a virtualenv.
Use the `[histogram]` extra to also get, well, histograms.
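For example, a minimal install into a fresh virtualenv might look like the sketch below; the virtualenv path is arbitrary, and the extra name comes from the text above.

```shell
# Create and activate a virtualenv, then install the package in editable mode.
python -m venv .venv
. .venv/bin/activate
pip install -e .

# Or include the extra to get histogram support as well.
pip install -e ".[histogram]"
```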
The major workflow is as follows (an end-to-end sketch appears after the list):
- Find files to scan.
  - `ruff-usage-aggregate scan-github-search` (or other data sources, to be implemented) to find possible candidate TOML files.
    - To use this, you'll need to set the `RUA_GITHUB_TOKEN` environment variable to a GitHub API token. You can also place it in a file called `.env` in the working directory.
    - It will output a `github_search_*` JSONL file that can be parsed later.
  - There is an "unofficial" suite of scraper scripts for the Ruff repository's GitHub dependents page in `aux/`; "unofficial" because it's not using the API and may break at any time. (You can still try `make scrape-dependents`.)
  - There's also a `data/known-github-tomls.jsonl` file in the repository, which contains a list of known TOML files.
  - You can use the `ruff-usage-aggregate combine` command to combine GitHub search files, CSV files, and JSONL files into a new `known-github-tomls.jsonl` file.
- Download the files.
  - Run e.g. `ruff-usage-aggregate download-tomls -o tomls/ < data/known-github-tomls.jsonl` to download TOML files to the `tomls/` directory.
- Aggregate data from downloaded files.
  - `ruff-usage-aggregate scan-tomls -i tomls -o json` will dump aggregate data to stdout in JSON format.
  - `ruff-usage-aggregate scan-tomls -i tomls -o markdown` will dump aggregate data to stdout in pre-formatted Markdown.
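Putting the steps together, an end-to-end run might look like the sketch below. The commands and flags are the ones shown above; the token value is a placeholder, and the exact arguments for `combine` are not covered here.

```shell
# Step 1: find candidate TOML files via the GitHub search API.
# The token may instead be placed in a .env file in the working directory.
export RUA_GITHUB_TOKEN=...  # replace with a GitHub API token
ruff-usage-aggregate scan-github-search
# (Optionally, `ruff-usage-aggregate combine` can merge the github_search_*
# output with CSV/JSONL files into a new known-github-tomls.jsonl.)

# Step 2: download the listed TOML files into tomls/.
ruff-usage-aggregate download-tomls -o tomls/ < data/known-github-tomls.jsonl

# Step 3: aggregate the downloaded files and dump results to stdout.
ruff-usage-aggregate scan-tomls -i tomls -o json      # JSON output
ruff-usage-aggregate scan-tomls -i tomls -o markdown  # Markdown output
```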
`ruff-usage-aggregate` is distributed under the terms of the MIT license.