kedro-inspect

Overview

The single objective of kedro-inspect is to decouple the representation of a Kedro pipeline from its implementation and execution. This is useful for inspecting the pipeline without having access to the Kedro project or setting up dependencies that are only needed when running the pipeline.

Once we isolate the pipeline representation, we can use it for various purposes, such as analysing its structure, documenting it, or sharing it with others.

This representation can be saved to a static file (e.g. JSON). The saved pipeline can then be visualised with the Kedro-Viz package, or with any other tool, written in any programming language, that can read the pipeline file format.

Inspection

The plan is to make inspection richer over time by adding more information to the pipeline representation, such as fine-grained type information or package dependencies per node.

This added information can be useful for various purposes, such as:

Generating documentation & schemas for the pipeline
Visualisation
Optimising pipeline execution
Generating a pipeline test suite
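As a rough illustration of the documentation use case, the sketch below renders one node representation as a short text summary. It assumes the node format shown in the Usage section; `SAMPLE_NODE` and `describe_node` are hypothetical names introduced here and are not part of kedro-inspect.

```python
# Hypothetical sample that mirrors the node format shown in the Usage section.
SAMPLE_NODE = {
    "name": "preprocess_companies_node",
    "inputs": "companies",
    "outputs": "preprocessed_companies",
    "function": {
        "func": "spaceflights_pandas.pipelines.data_processing.nodes.preprocess_companies",
        "parameters": [
            {
                "name": "companies",
                "kind": "POSITIONAL_OR_KEYWORD",
                "type_hint": "pandas.core.frame.DataFrame",
            }
        ],
        "return_value": "pandas.core.frame.DataFrame",
    },
}


def describe_node(node: dict) -> str:
    """Render one node representation as a short plain-text summary."""
    func = node["function"]
    # Build a signature-like string from the recorded parameter names and type hints.
    params = ", ".join(f"{p['name']}: {p['type_hint']}" for p in func["parameters"])
    lines = [
        node["name"],
        f"  function: {func['func']}({params}) -> {func['return_value']}",
        f"  inputs:   {node['inputs']}",
        f"  outputs:  {node['outputs']}",
    ]
    return "\n".join(lines)


print(describe_node(SAMPLE_NODE))
```

Because the summary is built only from the saved representation, this kind of script needs neither the Kedro project nor pandas to be installed.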

Comparison with current Kedro functionality

Kedro provides serialisation of the pipeline. The crucial difference is that kedro-inspect does not require the Kedro project, so it can be used without setting up the project or its dependencies.

Usage

usage: kedro-inspect [-h] [-p PIPELINE] [-o OUTPUT] [--indent INDENT] project_path

Inspect a Kedro pipeline.

positional arguments:
  project_path          path to the Kedro project

optional arguments:
  -h, --help            show this help message and exit
  -p PIPELINE, --pipeline PIPELINE
                        name of the pipeline to inspect (default: __default__)
  -o OUTPUT, --output OUTPUT
                        path to the output file (default: None)
  --indent INDENT       indentation for JSON output (default: None)

Running kedro-inspect on spaceflights-pandas, we get a list of representations of the nodes in the pipeline. For example, the first node is represented as follows:

"nodes": [
        {
            "name": "preprocess_companies_node",
            "tags": [],
            "confirms": [],
            "namespace": null,
            "inputs": "companies",
            "outputs": "preprocessed_companies",
            "function": {
                "func": "spaceflights_pandas.pipelines.data_processing.nodes.preprocess_companies",
                "parameters": [
                    {
                        "name": "companies",
                        "kind": "POSITIONAL_OR_KEYWORD",
                        "type_hint": "pandas.core.frame.DataFrame"
                    }
                ],
                "return_value": "pandas.core.frame.DataFrame"
            },
            "param_to_input": {
                "companies": [
                    "companies"
                ]
            }
        },
        ...
]
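Because the output is plain JSON, the pipeline structure can be analysed with the standard library alone. The sketch below derives node-to-node dependencies by matching each node's outputs to other nodes' inputs. It rests on several assumptions: that the file is a JSON object with a top-level "nodes" key, as the fragment above suggests; that "inputs" and "outputs" may be a string, a list, a mapping, or null; and that the second node in `RAW` is an illustrative entry rather than actual tool output. `as_names` and `node_edges` are hypothetical helpers, not part of kedro-inspect.

```python
import json

# Illustrative, trimmed-down output; only the fields used below are included.
RAW = """
{
  "nodes": [
    {"name": "preprocess_companies_node",
     "inputs": "companies",
     "outputs": "preprocessed_companies"},
    {"name": "create_model_input_table_node",
     "inputs": ["preprocessed_companies", "reviews"],
     "outputs": "model_input_table"}
  ]
}
"""


def as_names(value):
    """Normalise an inputs/outputs field to a list of dataset names (assumed shapes)."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        return list(value.values())
    return list(value)


def node_edges(pipeline):
    """Return (upstream_node, downstream_node, dataset) triples."""
    # Map every dataset to the node that produces it.
    producers = {}
    for node in pipeline["nodes"]:
        for dataset in as_names(node["outputs"]):
            producers[dataset] = node["name"]
    # A node depends on whichever node produces one of its inputs.
    edges = []
    for node in pipeline["nodes"]:
        for dataset in as_names(node["inputs"]):
            if dataset in producers:
                edges.append((producers[dataset], node["name"], dataset))
    return edges


pipeline = json.loads(RAW)
for upstream, downstream, dataset in node_edges(pipeline):
    print(f"{upstream} -> {downstream} (via {dataset})")
```

Datasets that no node produces, such as `companies` and `reviews` here, are treated as external inputs and yield no edge.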

AlpAribal/kedro-inspect
