
[KED-1951] Backend to send modular pipelines to kedro viz #394

Merged — 19 commits, Mar 22, 2021

Conversation

@merelcht (Member) commented Mar 12, 2021

Description

As part of the work on showing modular pipelines in viz.

Development notes

Added modular pipelines as a top-level key as well as to individual nodes (both task nodes and parameter/dataset nodes).

Merging this into the feature branch feature/modular-pipelines-complete, where the FE changes will also be merged. Then we'll do a full check to make sure it all works before creating a final PR to main.

QA notes

Updated the test setup and animals.mock.json to contain modular pipeline data.

FYI: JS tests will fail now that I've updated the test data. @studioswong will address these tests once this PR has been merged to the main feature branch feature/modular-pipelines-complete.

Todo:

  • Handle modular pipelines for parameter nodes. Will be done in a separate PR.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

Legal notice

  • I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":

  • I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.

  • I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorised to submit this contribution on behalf of the original creator(s) or their licensees.

  • I certify that the use of this contribution as authorised by the Apache 2.0 license does not violate the intellectual property rights of anyone else.

@idanov idanov changed the title Backend to send modular pipelines to kedro viz [KED-1951] Backend to send modular pipelines to kedro viz Mar 12, 2021
@merelcht merelcht requested a review from idanov March 12, 2021 10:19
@idanov (Member) left a comment


Currently the code combines all modular pipelines from all registered pipelines. However for the proper functioning of the frontend, we need the modular pipelines only from the currently selected registered pipeline.

@merelcht merelcht requested a review from idanov March 15, 2021 15:17
@limdauto (Collaborator) left a comment


A few comments. I think we are nearly there. Could we add a unit test for _expand_namespace to document its behaviour, please? It would also be helpful to extract the logic that prettifies a modular pipeline name into a function, so it can be unit tested as well.

@idanov (Member) left a comment


Glad to see we're getting there and almost ready! There are two main things to address here:

  • In the current implementation, the membership of a dataset is determined by the nodes using it, which is incorrect. Modular pipeline membership should be determined entirely from the name of the dataset; otherwise it will produce incorrect results.
  • Currently the modular pipelines for the whole pipeline do not include all expanded pipelines. I'm not sure whether that was added to the requirements in the ticket, and we can probably get away without it for now, but the most correct definition of the top-level modular_pipelines field is a superset of the modular_pipelines fields of all the nodes in the pipeline. That's actually a very easy way to implement it as well, so it may be worth doing it this way. It could be in a separate PR though, since it's not critical :)
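The first point — deriving a dataset's membership from its own name rather than from the nodes that use it — could be sketched roughly as follows. The function name, signature, and the validity check against a known set of modular pipelines are hypothetical illustrations, not the PR's actual code:

```python
def dataset_modular_pipelines(dataset_name: str, valid_modular_pipelines: set) -> list:
    """Derive modular pipeline membership from the dataset's own name.

    A dataset named "pipeline1.data_engineering.model_input" belongs to every
    namespace prefix of its name that is a known modular pipeline. Checking
    against `valid_modular_pipelines` guards against datasets that merely
    contain dots (e.g. "foo.bar") without foo being a modular pipeline.
    """
    parts = dataset_name.split(".")[:-1]  # drop the dataset's own short name
    prefixes = [".".join(parts[: i + 1]) for i in range(len(parts))]
    return [p for p in prefixes if p in valid_modular_pipelines]
```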

Well done 👏 on actually prompting a refactoring for the code here and highlighting how confusing it is already.

@merelcht (Member Author) replied:

Glad to see we're getting there and almost ready! There are two main things to address here:

  • In the current implementation, the membership of a dataset is determined by the nodes using it, which is incorrect. Modular pipeline membership should be determined entirely from the name of the dataset; otherwise it will produce incorrect results.

I'm definitely missing something here. How do you get the modular pipeline from the name of the dataset? How does a user add that? 🤯

  • Currently the modular pipelines for the whole pipeline do not include all expanded pipelines. I'm not sure whether that was added to the requirements in the ticket, and we can probably get away without it for now, but the most correct definition of the top-level modular_pipelines field is a superset of the modular_pipelines fields of all the nodes in the pipeline. That's actually a very easy way to implement it as well, so it may be worth doing it this way. It could be in a separate PR though, since it's not critical :)

I might be misunderstanding something, but the modular pipelines added for the full pipeline do include all expanded pipelines. If you look at the animals.mock.json file you'll see that spread around the nodes we have ["pipeline1", "pipeline1.data_engineering"] and ["pipeline2", "pipeline2.data_science"], and for the full pipeline we have:

"modular_pipelines": [
    {
      "id": "pipeline1",
      "name": "Pipeline1"
    },
    {
      "id": "pipeline1.data_engineering",
      "name": "Data Engineering"
    },
    {
      "id": "pipeline2",
      "name": "Pipeline2"
    },
    {
      "id": "pipeline2.data_science",
      "name": "Data Science"
    }
  ],
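The "superset of all nodes" definition discussed above could be computed as the union of the per-node modular_pipelines entries. This is a hedged sketch: the helper names and the prettifying rule (last namespace segment, underscores to spaces, title-cased) are assumptions for illustration, not the PR's actual implementation:

```python
def _prettify(modular_pipeline_id: str) -> str:
    # Hypothetical prettifier: last namespace segment, title-cased,
    # e.g. "pipeline1.data_engineering" -> "Data Engineering".
    return modular_pipeline_id.split(".")[-1].replace("_", " ").title()


def collect_modular_pipelines(nodes: list) -> list:
    """Build the top-level "modular_pipelines" field as the union of every
    node's "modular_pipelines" list, in the id/name shape used by
    animals.mock.json."""
    ids = sorted({mp for node in nodes for mp in node.get("modular_pipelines", [])})
    return [{"id": mp, "name": _prettify(mp)} for mp in ids]
```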

"id": "pipeline2.data_science",
"name": "Data Science"
@merelcht (Member Author) commented:

@limdauto Added pipeline2.data_science and pipeline1.data_science to show the "nameclash" case.

def get_pipelines():
    return {
        "de": de_pipeline(),
        "ds": ds_pipeline(),
        "pre_ds": pipeline(pre_ds_pipeline(), namespace="pipeline1.data_science"),
        "ds": pipeline(ds_pipeline(), namespace="pipeline2.data_science"),
@merelcht (Member Author) commented:

Changed the setup to include modular pipelines properly.

@merelcht merelcht requested review from limdauto and idanov March 19, 2021 11:44
@limdauto (Collaborator) left a comment

Apart from one last comment around dataset namespace, this is it :D Thank you so much for pushing through!

"pipelines": [pipeline_key],
"modular_pipelines": dataset_modular_pipelines,
@limdauto (Collaborator) commented:

I think we discussed that people can technically name their dataset foo.bar with foo and bar not being modular pipelines, so we need to check whether they are in the valid modular pipelines list (which comes from the nodes).

@merelcht (Member Author) commented:

Ah okay, thanks for flagging that! I had that in mind for parameters, but will check for it for datasets as well 👍

@merelcht (Member Author) commented Mar 19, 2021:

Done in 50ca16c
Removed this again as agreed in standup.

@idanov (Member) left a comment

Well done 🎉 Super curious to see this served to Viz. I think we can start experimenting even before we have the parameters work done.

@merelcht merelcht merged commit 2b817a6 into feature/modular-pipelines-complete Mar 22, 2021
@merelcht merelcht deleted the modular-pipelines-api branch March 22, 2021 10:44
studioswong added a commit that referenced this pull request Apr 16, 2021
…oncept (#421)

* [KED-1951] Backend to send modular pipelines to kedro viz (#394)

* WIP add modular pipelines

* Expose modular pipelines and add testing data

* Lint

* undo push of package-lock

* Revert package lock

* Fix lint

* Return modular_pipelines in pipeline data endpoint for selected pipeline + update test data

* Address comments on PR

* Cleanup and lint

* Add modular pipelines to datasets and parameter nodes. Some refactoring for clarity

* Temporarily skip js tests to make sure all python stuff works

* Put back JS tests for CI

* First iteration of addressing comments on PR

* Correctly deduce modular pipeline from dataset

* Add all modular pipelines for all nodes

* Check that dataset namespace is actually a modular pipeline

* Undo check if namespace is modular pipeline

* updated FE tests to match new animals dataset (#401)

* Add modular pipelines to parameter nodes (#402)

* WIP add modular pipelines

* Expose modular pipelines and add testing data

* Lint

* undo push of package-lock

* Revert package lock

* Revert package lock

* Fix lint

* Return modular_pipelines in pipeline data endpoint for selected pipeline + update test data

* Address comments on PR

* Cleanup and lint

* Add modular pipelines to datasets and parameter nodes. Some refactoring for clarity

* Temporarily skip js tests to make sure all python stuff works

* Put back JS tests for CI

* First iteration of addressing comments on PR

* Correctly deduce modular pipeline from dataset

* Add all modular pipelines for all nodes

* Check that dataset namespace is actually a modular pipeline

* Undo check if namespace is modular pipeline

* Add modular pipelines for parameter nodes

* Verify if modular pipelines listed are actual modular pipelines

* Temporarily disable JS tests to make sure other steps pass

* Put back JS tests, all other checks pass ✅

* Update package/kedro_viz/server.py

Co-authored-by: Ivan Danov <[email protected]>

* Address review comments

* Treat dataset node modular pipelines the same as task node modular pipelines.

Co-authored-by: Ivan Danov <[email protected]>

* Send correct format for modular pipelines from /pipelines endpoint (#408)

* Modular pipeline tags concept (#410)

* set up store functions for incoming modular pipeline data

* added additional test for modular pipeline

* set up flag for modular pipeline

* set up selector to get list of modular pipelines and nodes

* add check for node modular pipeline data in selector

* set up modular pipeline on sidebar

* refactor node-list to enable change of both modular pipeline and tags

* further setup reducer and node selection

* added item label check

* hide modular pipeline features behind a flag

* fix failing tests and set up new data set

* added tests for modular pipeline actions

* further revisions

* enable indented styling for lazy list

* update readme to explain modular pipeline and tag filtering behaviour

* Fix pylint

* updates as per PR comments

* further adjustments per PR comments

* update tests to reflect latest PR changes

* refactor getItemLabel in node-list-row-list

* fix spelling in random-data

* further refactoring of getItemLabel

Co-authored-by: Merel Theisen <[email protected]>

* quick fix to ensure selector works for pipelines with no defined modular pipelines

* delete unneeded selector

* delete unneeded selector

* Bugfix: Ensure JSON backwards-compatibility

The application should still work without throwing an error, even when "modular_pipelines" is not present in the JSON dataset

Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Ivan Danov <[email protected]>
Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Richard Westenra <[email protected]>
@studioswong studioswong mentioned this pull request Apr 16, 2021