[KED-1951] Backend to send modular pipelines to kedro viz #394
Conversation
Currently the code combines all modular pipelines from all registered pipelines. However, for the frontend to function properly, we need the modular pipelines only from the currently selected registered pipeline.
A few comments. I think we are nearly there. Could we add a unit test for _expand_namespace to document its behaviour, please? Also, if we pull the logic that prettifies a modular pipeline name into a function and unit test it as well, that'd be helpful.
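To make the request concrete, here is a sketch of what those two helpers and their tests might look like. The names _expand_namespace and the prettifying behaviour are assumptions inferred from this conversation (e.g. "pipeline2.data_science" rendering as "Data Science" later in the thread), not the actual kedro-viz implementation:

```python
def _expand_namespace(namespace):
    """Expand a dot-separated namespace into all of its prefixes,
    e.g. "a.b.c" -> ["a", "a.b", "a.b.c"]. Assumed behaviour."""
    if not namespace:
        return []
    parts = namespace.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


def _pretty_name(name):
    """Turn a modular pipeline id like "pipeline2.data_science" into a
    display name like "Data Science": take the last dot-separated
    segment, replace underscores with spaces, and title-case it."""
    last_segment = name.split(".")[-1]
    return last_segment.replace("_", " ").title()


# Unit-test style checks documenting the assumed behaviour:
assert _expand_namespace("pipeline1.data_science") == [
    "pipeline1",
    "pipeline1.data_science",
]
assert _expand_namespace("") == []
assert _pretty_name("pipeline2.data_science") == "Data Science"
```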
Glad to see we're getting there and almost ready! There are two main things to address here:
- In the current implementation, the membership of a dataset is determined by the nodes using it, which is incorrect. Modular pipeline membership is determined entirely by the name of the dataset; otherwise it will produce incorrect results.
- Currently the modular pipelines for the whole pipeline do not include all expanded pipelines. I'm not sure whether that was added to the requirement in the ticket, and we can probably get away without it for now, but the most correct definition of the top-level modular_pipelines field is that it should be a superset of the modular_pipelines fields of all the nodes in the pipeline. That's actually a very easy way to implement it as well, so it may be worth looking into doing it this way. It could be in a separate PR though, since it's not critical :)
Well done 👏 on prompting a refactoring of the code here and highlighting how confusing it already is.
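The "superset" definition suggested above can be sketched as a simple union over the per-node fields. The node dictionaries below are illustrative, not the actual kedro-viz data model:

```python
def all_modular_pipelines(nodes):
    """Derive the top-level "modular_pipelines" field as the union of
    the "modular_pipelines" lists of every node in the pipeline."""
    collected = set()
    for node in nodes:
        collected.update(node.get("modular_pipelines", []))
    return sorted(collected)


# Hypothetical nodes carrying per-node modular pipeline memberships:
nodes = [
    {"id": "a", "modular_pipelines": ["pipeline1", "pipeline1.data_science"]},
    {"id": "b", "modular_pipelines": ["pipeline2.data_science"]},
]
assert all_modular_pipelines(nodes) == [
    "pipeline1",
    "pipeline1.data_science",
    "pipeline2.data_science",
]
```

By construction, the result is a superset of each node's own list, which is exactly the property the review comment asks for.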
I'm definitely missing something here. How do you get the modular pipeline from the name of the dataset? How does a user add that? 🤯
I might be misunderstanding something, but the modular pipelines added for the full pipeline do include all expanded pipelines. If you look at the quoted snippet:

    "source": "e27376a9",
    "target": "1769e230"
    ...
    "id": "pipeline2.data_science",
    "name": "Data Science"
@limdauto Added pipeline2.data_science and pipeline1.data_science to show the "name clash" case.
    def get_pipelines():
        return {
            "de": de_pipeline(),
            "pre_ds": pipeline(pre_ds_pipeline(), namespace="pipeline1.data_science"),
            "ds": pipeline(ds_pipeline(), namespace="pipeline2.data_science"),
        }
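For context, wrapping a pipeline with a namespace prefixes its node and dataset names with that namespace, which is what later lets the backend infer modular pipeline membership from names. A minimal illustration of the renaming (this is not kedro's implementation, which also supports input/output mapping):

```python
def apply_namespace(name, namespace):
    """Illustrative sketch: prefix a dataset/node name with a
    dot-separated namespace, as a namespaced modular pipeline would."""
    return f"{namespace}.{name}" if namespace else name


assert apply_namespace("model_input", "pipeline1.data_science") == (
    "pipeline1.data_science.model_input"
)
assert apply_namespace("model_input", "") == "model_input"
```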
Changed the setup to include modular pipelines properly.
Apart from one last comment around dataset namespace, this is it :D Thank you so much for pushing through!
    "pipelines": [pipeline_key],
    "modular_pipelines": dataset_modular_pipelines,
I think we discussed that people can technically name their dataset foo.bar with foo and bar not being modular pipelines, so we need to check whether they are in the valid modular pipelines list (which comes from the nodes).
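That validation could look something like the sketch below: derive candidate namespaces from the dataset name, then keep only those that actually appear as modular pipelines on nodes. Function and variable names here are hypothetical, not the kedro-viz code:

```python
def dataset_modular_pipelines(dataset_name, valid_modular_pipelines):
    """Derive modular pipeline membership from a dataset name such as
    "pipeline1.data_science.model_input", keeping only namespaces that
    are real modular pipelines (i.e. also attached to nodes)."""
    parts = dataset_name.split(".")[:-1]  # drop the dataset's own name
    candidates = [".".join(parts[: i + 1]) for i in range(len(parts))]
    return [ns for ns in candidates if ns in valid_modular_pipelines]


valid = {"pipeline1", "pipeline1.data_science"}
assert dataset_modular_pipelines(
    "pipeline1.data_science.model_input", valid
) == ["pipeline1", "pipeline1.data_science"]
# A dataset merely named "foo.bar", where "foo" is not a modular
# pipeline, yields no memberships:
assert dataset_modular_pipelines("foo.bar", valid) == []
```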
Ah okay, thanks for flagging that! I had that in mind for parameters, but will check for it for datasets as well 👍
Done in 50ca16c
Removed this again as agreed in standup.
Well done 🎉 Super curious to see this served to Viz. I think we can start experimenting even before the parameters work is done.
…oncept (#421)

* [KED-1951] Backend to send modular pipelines to kedro viz (#394)
* WIP add modular pipelines
* Expose modular pipelines and add testing data
* Lint
* undo push of package-lock
* Revert package lock
* Fix lint
* Return modular_pipelines in pipeline data endpoint for selected pipeline + update test data
* Address comments on PR
* Cleanup and lint
* Add modular pipelines to datasets and parameter nodes. Some refactoring for clarity
* Temporarily skip JS tests to make sure all Python stuff works
* Put back JS tests for CI
* First iteration of addressing comments on PR
* Correctly deduce modular pipeline from dataset
* Add all modular pipelines for all nodes
* Check that dataset namespace is actually a modular pipeline
* Undo check if namespace is modular pipeline
* updated FE tests to match new animals dataset (#401)
* Add modular pipelines to parameter nodes (#402)
* WIP add modular pipelines
* Expose modular pipelines and add testing data
* Lint
* undo push of package-lock
* Revert package lock
* Revert package lock
* Fix lint
* Return modular_pipelines in pipeline data endpoint for selected pipeline + update test data
* Address comments on PR
* Cleanup and lint
* Add modular pipelines to datasets and parameter nodes. Some refactoring for clarity
* Temporarily skip JS tests to make sure all Python stuff works
* Put back JS tests for CI
* First iteration of addressing comments on PR
* Correctly deduce modular pipeline from dataset
* Add all modular pipelines for all nodes
* Check that dataset namespace is actually a modular pipeline
* Undo check if namespace is modular pipeline
* Add modular pipelines for parameter nodes
* Verify if modular pipelines listed are actual modular pipelines
* Temporarily disable JS tests to make sure other steps pass
* Put back JS tests, all other checks pass ✅
* Update package/kedro_viz/server.py (Co-authored-by: Ivan Danov <[email protected]>)
* Address review comments
* Treat dataset node modular pipelines the same as task node modular pipelines (Co-authored-by: Ivan Danov <[email protected]>)
* Send correct format for modular pipelines from /pipelines endpoint (#408)
* Modular pipeline tags concept (#410)
* set up store functions for incoming modular pipeline data
* added additional test for modular pipeline
* set up flag for modular pipeline
* set up selector to get list of modular pipelines and nodes
* add check for node modular pipeline data in selector
* set up modular pipeline on sidebar
* refactor node-list to enable change of both modular pipeline and tags
* further setup of reducer and node selection
* added item label check
* hide modular pipeline features behind a flag
* fix failing tests and set up new data set
* added tests for modular pipeline actions
* further revisions
* enable indented styling for lazy list
* update readme to explain modular pipeline and tag filtering behaviour
* Fix pylint
* updates as per PR comments
* further adjustments per PR comments
* update tests to reflect latest PR changes
* refactor getItemLabel in node-list-row-list
* fix spelling in random-data
* further refactoring of getItemLabel (Co-authored-by: Merel Theisen <[email protected]>)
* quick fix to ensure selector works for pipelines with no defined modular pipelines
* delete unneeded selector
* delete unneeded selector
* Bugfix: Ensure JSON backwards-compatibility. The application should still work without throwing an error, even when "modular_pipelines" is not present in the JSON dataset

Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Ivan Danov <[email protected]>
Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Richard Westenra <[email protected]>
Description
This PR is part of the work on showing modular pipelines in Viz.
Development notes
Added modular pipelines as a top-level key as well as to individual nodes (both task nodes and parameter/dataset nodes).
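A hypothetical excerpt of what the pipeline endpoint response might look like after this change: "modular_pipelines" appears once at the top level and once per node. The ids mirror values quoted elsewhere in this thread; the exact schema is an assumption:

```python
# Sketch of the assumed response shape, not the actual kedro-viz schema.
response = {
    "nodes": [
        {
            "id": "e27376a9",
            "name": "Data Science",
            "type": "task",
            "modular_pipelines": ["pipeline2.data_science"],
        },
    ],
    # Top-level field: union of the per-node memberships.
    "modular_pipelines": [
        {"id": "pipeline2.data_science", "name": "Data Science"},
    ],
}

assert "modular_pipelines" in response
assert all("modular_pipelines" in node for node in response["nodes"])
```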
Merging it into the feature branch feature/modular-pipelines-complete, where the FE changes will be merged as well. Then we'll do a full check and make sure it all works before creating a final PR to main.

QA notes
Update the test setup and animals.mock.json to contain modular pipeline data.

FYI: JS tests will fail now that I've updated the test data. @studioswong will address these tests once this PR has been merged to the main feature branch feature/modular-pipelines-complete.
Todo:
Checklist
RELEASE.md file

Legal notice
I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":
I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorised to submit this contribution on behalf of the original creator(s) or their licensees.
I certify that the use of this contribution as authorised by the Apache 2.0 license does not violate the intellectual property rights of anyone else.