Visualisation of Kedro Hooks #836

daBlesr · 2022-04-28T16:41:46Z

Description

I would like to get insight into which of our nodes/datasets have a hook implementation.

Context

I have written an implementation that reads a YAML file with Great Expectations configurations (not using Kedro-Great) for specific nodes and datasets. A Hook class reads this config file and executes the respective validation rules on the input/output dataframes, either failing the pipeline or emitting warnings. I'd like the Kedro visualisation to show which of the input/output datasets have Great Expectations rules applied to them.

Possible Implementation

I know too little about how kedro-viz works, so forgive my ignorance: export, per node and dataset, which hooks have an implementation, perhaps by overriding a method on a Hook class that returns a list of nodes/datasets with some metadata (the specific GE rules per node/dataset).

Alternative Implementation

A GE-specific implementation that reads the GE config file and appends the results to the saved JSON file. I could write this for myself, but it would not be generalisable.
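
A minimal sketch of this alternative, assuming the saved JSON has a top-level `nodes` list whose entries carry a `name` key (an assumption about the file layout, not a documented kedro-viz contract). The `annotate_nodes` helper, the `hooks` key, and the sample data are all hypothetical:

```python
import json
from typing import Any, Dict, List


def annotate_nodes(viz_data: Dict[str, Any], rules: Dict[str, List[str]]) -> Dict[str, Any]:
    """Return a copy of viz_data with a hypothetical "hooks" key on matching entries."""
    # Round-tripping through JSON is a cheap deep copy for JSON-compatible data.
    annotated = json.loads(json.dumps(viz_data))
    for entry in annotated.get("nodes", []):
        matched = rules.get(entry.get("name"))
        if matched:
            entry["hooks"] = {"great_expectations": matched}
    return annotated


# Stand-in for the saved JSON file; a real export contains many more fields.
viz_data = {
    "nodes": [
        {"name": "raw_orders", "type": "data"},
        {"name": "clean_orders", "type": "task"},
    ]
}
# Stand-in for the rules parsed from the GE config file.
rules = {"raw_orders": ["expect_column_values_to_not_be_null"]}

annotated = annotate_nodes(viz_data, rules)
print(json.dumps(annotated, indent=2))
```

Only entries with configured rules gain the extra key, and the original data is left untouched, so the annotation step could run as a post-processing pass over the saved file.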

Checklist

Append saved json file with some additional data on hooks x datasets/nodes
Visualise the datasets and nodes with a mark showing that they have Hook behaviour
Render the metadata for the dataset/node.

Possibly related: #194

tynandebold · 2022-05-04T15:31:59Z

@MerelTheisenQB do you have any thoughts here about this idea now that hooks are becoming more popular with the release of 0.18? Have you thought at all about what we could do with hooks on the Viz side?

cc @AntonyMilneQB @noklam

limdauto · 2022-05-04T15:42:11Z

This is going to be challenging. Hooks are not declarative. Their behaviour, e.g. which nodes they apply to, changes at runtime.

To be able to do this, we need runtime information, which isn't all bad. We can use this as an excuse to go realtime.

antonymilne · 2022-05-04T16:48:15Z

This is a really cool idea, but unfortunately I agree with @limdauto that it sounds very tricky to do. I don't even see how runtime information would help here, actually. For example, if I have a hook

def before_node_run(node: Node):
    if node.name == "blah":
        do_stuff()

then how do I detect that it's acting on node "blah" and no others? Would be really interested in understanding how you think it might work @limdauto.
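
To make the difficulty concrete, here is a small sketch that uses a stand-in `Node` dataclass rather than Kedro's real class, plus a hypothetical `CallRecorder` wrapper. Observing the calls from outside shows the hook being invoked for every node, so the call log alone cannot reveal that only "blah" is affected:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:  # stand-in for Kedro's Node class, for illustration only
    name: str


acted_on: List[str] = []


def do_stuff(node: Node) -> None:
    # Placeholder for the hook's real side effect.
    acted_on.append(node.name)


def before_node_run(node: Node) -> None:
    if node.name == "blah":
        do_stuff(node)


class CallRecorder:
    """Hypothetical wrapper that logs every node a hook is called with."""

    def __init__(self, hook: Callable[[Node], None]) -> None:
        self.hook = hook
        self.seen: List[str] = []

    def __call__(self, node: Node) -> None:
        self.seen.append(node.name)
        self.hook(node)


recorded = CallRecorder(before_node_run)
for name in ["blah", "foo", "bar"]:
    recorded(Node(name))

print(recorded.seen)  # every node name, because the hook is called for all of them
print(acted_on)       # only the node selected by the branch inside the hook
```

The information that matters lives in the branch inside the hook body, which is invisible to anything that only watches hook invocations.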

My first thought here is that this sort of customisation might somehow be enabled by something like #662. Here you would add some viz_widgets attribute to relevant entries in your data catalog (or maybe a new metadata.yml file that gets picked up by viz) saying which hooks are applied to which datasets. The advantage of this is that it's not GE hook-specific: you could use it to somehow inject custom metadata for any dataset. There are disadvantages too: you would need to maintain multiple yml files rather than having this work automatically in realtime, and it is unclear how to extend it to nodes.

daBlesr · 2022-05-04T17:11:40Z

An idea to support this feature would be to extend the Hook class with some visualisation specific logic. A draft of what this could look like:

class DataValidationHook():

    @hook_impl
    def before_node_run(self, inputs: Dict[str, Any]) -> None:
        ...

    @viz_impl
    def viz_behaviour(self) -> HookVizBehaviour:
        return DataValidationVizBehaviour()

class DataValidationVizBehaviour(HookVizBehaviour):

    @hook_impl
    def before_node_run(self, node: Node, inputs: Dict[str, Any]) -> None:
        if some_situation:
            self.add_node(node, meta=some_meta_info)
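
A self-contained version of the same idea, as a rough sketch only: `HookVizBehaviour`, `add_node`, and the `marked` attribute are hypothetical and do not exist in Kedro or kedro-viz, `Node` is a stand-in dataclass, and the decorators are left out so the snippet runs without Kedro installed.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Node:  # stand-in for Kedro's Node class, for illustration only
    name: str


class HookVizBehaviour:
    """Hypothetical base class that collects the nodes a hook reports acting on."""

    def __init__(self) -> None:
        self.marked: Dict[str, Dict[str, Any]] = {}

    def add_node(self, node: Node, meta: Dict[str, Any]) -> None:
        self.marked[node.name] = meta


class DataValidationVizBehaviour(HookVizBehaviour):
    def __init__(self, rules: Dict[str, List[str]]) -> None:
        super().__init__()
        self.rules = rules  # dataset name -> validation rule names

    def before_node_run(self, node: Node, inputs: Dict[str, Any]) -> None:
        # Report the node only when at least one of its inputs has rules configured.
        applied = {name: self.rules[name] for name in inputs if name in self.rules}
        if applied:
            self.add_node(node, meta={"rules": applied})


behaviour = DataValidationVizBehaviour(
    {"raw_orders": ["expect_column_values_to_not_be_null"]}
)
behaviour.before_node_run(Node("clean_orders"), {"raw_orders": object()})
behaviour.before_node_run(Node("train_model"), {"features": object()})
print(behaviour.marked)
```

Here the hook author states explicitly which nodes are affected, so nothing has to be inferred from the hook body; the trade-off is that the viz behaviour has to be kept in sync with the real hook by hand.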

tynandebold · 2022-10-31T10:57:20Z

Hey @daBlesr, curious what your main use case is for this? I see from your example code you're doing something with data validation and perhaps want to see the outcome of that. Is there anything else you're doing that you'd like to see visualized?

daBlesr added the Issue: Feature Request label Apr 28, 2022

tynandebold added the Design: Research label May 4, 2022

tynandebold closed this as completed Jan 16, 2023
