Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modular pipeline tags concept #410

Conversation

studioswong
Copy link
Contributor

@studioswong studioswong commented Mar 31, 2021

Description

related PR: https://jira.quantumblack.com/browse/KED-2520

This PR is the main development work for the initial MVP for the modular pipeline feature, in allowing the user to select the display of nodes on the flowchart by modular pipelines using a similar interaction as the tag filters on the current sidebar. (see below)

image

Development notes

This PR contains the following main pieces of work:

  1. Initial set up of the modular pipeline selectors to format the data to be consumed for the sidebar
  2. Addition of a series of actions related to toggle the selection of modular pipeline items ( I have set them up in a consistent fashion with the tag actions for code consistency), along with the setup of tests
  3. Set up a new 'modularPipeline' type within the items array used in node-list to allow refactor work of all components related to the sidebar ( e.g node-list, node-list-group, etc)
  4. Refactoring of a series of functions related to the tag filtering logic to cater for the new 'modularPipeline' item type. This includes adding a series of functions to cater for the toggling of modular pipeline items, as well as refactoring the main function that determines the filtering logic, onTagItemChange, to onCategoryItemChange to allow for more category item types beyond tags.
  5. Adding a new display logic for the name of modular pipelines under node-list-row-list to represent the nested relationship of pipelines by indentation
  6. Incorporating modularPipeline type items within the search results so users can search by modular pipelines from the search bar
  7. Addition of a series of modular pipeline selectors used within the node-list component under node-list-items.js
  8. Refactoring of tests to cater for the new moularPipeline item types to test the modular pipeline filter functionality ( all related tests for the filtering by modular pipelines sits under node-list-items.test.js)
  9. Addition of the 'modularpipeline' flag, where the modular pipeline field will only be exposed via enabling this flag
  10. I have also refactored the sections selector to a simple function that takes in the value of the flag to return different object in determining the fields displayed on the sidebar.
  11. Update the demo test dataset and the random-data generator to include the new modular pipeline related fields

QA notes

This feature needs to be tested alongside the new changes to the Kedro-viz BE under the feature/modular-pipelines-complete branch. I also recommend testing this with the new Optimus AI reference project.

If you are testing with the Optimus AI project, please note that currently there are some errors with the display of layers on the flowchart until this ticket is merged. Please test with the layers turned off for the flowchart.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

Legal notice

  • I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":

  • I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.

  • I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorised to submit this contribution on behalf of the original creator(s) or their licensees.

  • I certify that the use of this contribution as authorised by the Apache 2.0 license does not violate the intellectual property rights of anyone else.

@studioswong
Copy link
Contributor Author

it looks like the pylint tests are currently failing on the CI - will resolve this with the core team in the meantime
image

Copy link
Contributor

@bru5 bru5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! This works really nicely.

Looks very good overall, I've still some more testing to do on it but for now I've left a few suggestions.

* @return {array} List of sections
*/
export const getSections = createSelector(() =>
Object.keys(sidebar).map((name) => ({
export const getSections = (flag) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be clearer to name the flag as something like modularPipelineFlag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already elaborated that this refers to the modular pipeline flag in the comments, not sure of the need to repeat this in the naming and opt for a much longer variable name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure it's good that it's documented, but yeah I also think in this case a more verbose name is a good idea, since it's a specific flag the function is looking for.

export const getSections = createSelector(() =>
Object.keys(sidebar).map((name) => ({
export const getSections = (flag) => {
const sidebarObject = sidebar(flag);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think sidebarObject is probably the sidebarConfig here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure however it's just a difference of preference in naming?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah in this case the name can be more specific since we know it's a config object, right?

(allNodes, modularPipelines) => {
return modularPipelines.map((modularPipeline) => {
let nodes = [];
Object.keys(allNodes).map((key) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since the return result of map isn't used here and the loop uses side effects it might be best to convert this into a for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment, though I am not entirely sure of the advantages of converting this to a for loop given that in our codebase ( and generally for most JS development nowadays) we often favor using map? This is a pttern that is used in our other selectors as well.

personally I also think using a map allows the code to be much cleaner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented on this one too below and suggested using forEach. map is typically used to assign the result of a loop to a variable (e.g. const result = array.map(d => d.foo)), which this bit of code doesn't do. Hence, using forEach or for is clearer to a reader, as it doesn't imply that you intended to return the iterated array.

Copy link
Contributor

@bru5 bru5 Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah map and friends return an array themselves already. Looking closer at what this does though you're right it's a sound idea in this case to convert this to a functional style though instead, but what you need here looks like filter rather than map e.g. something like:

  const nodes = Object.keys(allNodes)
        .filter(key => Boolean(allNodes[key]) && allNodes[key].includes(modularPipeline));

Copy link
Contributor

@richardwestenra richardwestenra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Overall this is great work! I've included a lot of optional suggestions, feel free to include or ignore. The most important things I've marked with a ⚠️ emoji. Thanks! :)

src/config.js Outdated
},
};

export const sidebar = (flag) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(style nitpick) I'm going to add the first two of these points to the style guide:

  1. In general, we try to keep the config file to just pure constant values (e.g. strings, numbers and object literals), and put utility functions in other locations, e.g. utils. @bru5 has pulled me up on this before, rightly so. If this needs to be a function, I'd recommend moving the function part of it there.

  2. If this is a function, I'd recommend renaming it to be a verb (e.g. getSidebarConfig), so it's more clear.

  3. Because the only difference between these groups is Categories.ModularPipelines, I'd probably just set that value with a ternary instead. That way, it's easier to see the differences between them and there's less repetition:

export const sidebar = (flag) => ({
  Categories: {
    Tags: 'tag',
    ModularPipelines: flag ? 'modularPipeline' : undefined,
  },
  Elements: {
    Nodes: 'task',
    Datasets: 'data',
    Parameters: 'parameters',
  },
})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion Richard! I definitely had intended and would've set it up in your suggested way but unfortunately it would not work with the existing sidebar setup ( having a field as undefined would cause errors in the sidebar).

If course we could refactor the sidebar to change how we accept the sidebar config object, but given that hiding this behind a flag is a temporary measure ( and will subject to iteration in the next round) I had decided to keep it as is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah a static config is definitely preferred generally, so would be good to add support for this case at the source if practical.

Flags do make this tricky though so if that's not straightforward then maybe this pattern gets closer to the above suggestion e.g.

export const sidebar = (flags) => ({
  Categories: {
    Tags: 'tag',
    ...(flags.modularPipeline && { 
      ModularPipelines: 'modularPipeline' 
    }),
  },
  Elements: {
    Nodes: 'task',
    Datasets: 'data',
    Parameters: 'parameters',
  },
})

@@ -108,6 +108,7 @@ The following flags are available to toggle experimental features:
- `lazy` - From release v3.8.0. Improved sidebar performance. (default `false`)
- `oldgraph` - From release v3.8.0. Display old version of graph (dagre algorithm) without improved graphing algorithm. (default `false`)
- `sizewarning` - From release v3.9.1. Show a warning before rendering very large graphs. (default `true`)
- `modularpipeline` - From release v3.11.0. Enables filtering of nodes by modular pipelines. Note that selecting both modular pipeline and tag filters will only return nodes that belongs to both categories. (default `false`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) - while i remember, do you still think we should change this to modular? is it not part of this PR?

*/
export const getModularPipelineNodes = createSelector(
[getNodesModularPipelines, getModularPipelineIDs],
(allNodes, modularPipelines) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding the variable name allNodes a bit confusing here - this is the state prop for the modular pipelines that correspond to certain nodes, right?

Is it possible that there might be a node that has a modular pipeline but which isn't included in this list? If so then using this prop in this way could lead to potential bugs.

Copy link
Contributor

@richardwestenra richardwestenra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work :)

@studioswong studioswong merged commit a17ca31 into feature/modular-pipelines-complete Apr 15, 2021
@studioswong studioswong deleted the feature/KED-2436-modular_pipeline_tags_concept branch April 15, 2021 11:41
export const getSections = createSelector(
(state) => sidebar(state.flags.modularpipeline),
(sidebarObject) =>
Object.keys(sidebarObject).map((name) => ({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth naming sidebarObject as sidebarConfig here too, since the sidebar function generates an object from the config file?

Comment on lines +37 to +59
/**
* returns an array of modular pipelines with the corresponding
* nodes for each modular pipeline
*/
export const getModularPipelineNodes = createSelector(
[getNodesModularPipelines, getModularPipelineIDs],
(allNodesModularPipelines, modularPipelines) => {
return modularPipelines.map((modularPipeline) => {
let nodes = [];
Object.keys(allNodesModularPipelines).forEach((key) => {
if (
allNodesModularPipelines[key] &&
allNodesModularPipelines[key].includes(modularPipeline)
)
nodes.push(key);
});
return {
modularPipeline,
nodes,
};
});
}
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@studioswong Hey I merged this branch into the one I'm working on, and realised that this selector isn't actually being used anywhere? Anyway I've deleted it on my branch.

studioswong added a commit that referenced this pull request Apr 16, 2021
…oncept (#421)

* [KED-1951] Backend to send modular pipelines to kedro viz (#394)

* WIP add modular pipelines

* Expose modular pipelines and add testing data

* Lint

* undo push of package-lock

* Revert package lock

* Fix lint

* Return modular_pipelines in pipeline data endpoint for selected pipeline + update test data

* Address comments on PR

* Cleanup and lint

* Add modular pipelines to datasets and parameter nodes. Some refactoring for clarity

* Temporarily skip js tests to make sure all python stuff works

* Put back JS tests for CI

* First iteration of addressing comments on PR

* Correctly deduce modular pipeline from dataset

* Add all modular pipelines for all nodes

* Check that dataset namespace is actually a modular pipeline

* Undo check if namespace is modular pipeline

* updated FE tests to match new animals dataset (#401)

* Add modular pipelines to parameter nodes (#402)

* WIP add modular pipelines

* Expose modular pipelines and add testing data

* Lint

* undo push of package-lock

* Revert package lock

* Revert package lock

* Fix lint

* Return modular_pipelines in pipeline data endpoint for selected pipeline + update test data

* Address comments on PR

* Cleanup and lint

* Add modular pipelines to datasets and parameter nodes. Some refactoring for clarity

* Temporarily skip js tests to make sure all python stuff works

* Put back JS tests for CI

* First iteration of addressing comments on PR

* Correctly deduce modular pipeline from dataset

* Add all modular pipelines for all nodes

* Check that dataset namespace is actually a modular pipeline

* Undo check if namespace is modular pipeline

* Add modular pipelines for parameter nodes

* Verify if modular pipelines listed are actual modular piplines

* Temporarily disable JS tests to make sure other steps pass

* Put back JS tests, all other checks pass ✅

* Update package/kedro_viz/server.py

Co-authored-by: Ivan Danov <[email protected]>

* Address review comments

* Treat dataset node modular pipelines the same as task node modular pipelines.

Co-authored-by: Ivan Danov <[email protected]>

* Send correct format for modular pipelines from /pipelines endpoint (#408)

* Modular pipeline tags concept (#410)

* set up store functions for incoming modular pipeline data

* added additional test for modular pipeline

* set up flag for modular pipeline

* set up selector to get list of modular pipelines and nodes

* add ncheck for node modular pipeline data in selector

* set up modular pipeline on sidebar

* refactor node-list to enable change of both modular pipeline and tags

* further setup reducer and node selection

* added item label check

* hide modular pipeline features behind a flag

* fix failing tests and set up new data set

* added tests for modular pipeline actions

* further revisions

* enable indented styling for lazy list

* update readme to explain modular pipeline and tag filtering behaviour

* Fix pylint

* updates as per PR comments

* further adjustments per PR comments

* update tests to reflect latest PR changes

* refactor getItemLabel in node-list-row-list

* fix spelling in random-data

* further refactoring of getItemLabel

Co-authored-by: Merel Theisen <[email protected]>

* quick fix to ensure selector works for pipelines with no defined modular pipelines

* delete unneeded selector

* delete unneeded selector

* Bugfix: Ensure JSON backwards-compatibility

The application should still work without throwing an error, even when "modular_pipelines" is not present in the JSON dataset

Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Ivan Danov <[email protected]>
Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Richard Westenra <[email protected]>
@studioswong studioswong mentioned this pull request Apr 16, 2021
1 task
@studioswong studioswong self-assigned this Jun 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants