Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: show matplotlib dataset pngs #887

Merged
merged 21 commits into from
Jun 8, 2022

Conversation

tynandebold
Copy link
Member

@tynandebold tynandebold commented Jun 6, 2022

Description

Resolves #783.

Massive thank you to @rashidakanchwala for helping so much with this one.

Development notes

There are a lot of changes here, and that's because I've renamed several files to adopt a more generic approach to the modal formerly named PlotlyModal. That's now simply MetadataModal, since we may want to put other things in there.

QA notes

I've added a Matplotlib Image node to the demo project, so you'll be able to see the implementation if you open Gitpod and take a look at the preview link.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

Signed-off-by: Antony Milne <[email protected]>
@antonymilne
Copy link
Contributor

I just tried this out on gitpod but it got stuck installing requirements with

ERROR: Cannot install matplotlib==3.5.0 and matplotlib~=3.5.2 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested matplotlib~=3.5.2
    The user requested matplotlib==3.5.0

So I pushed 3c99820 which should fix this.

As far as I can tell there's no reason that matplotlib should be a Python requirement of the kedro-viz package, is there? In fact I'm not sure whether we really need these either... 🤔 But that's a separate issue.

plotly>=4.0
pandas>=0.24

Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github is doing something weird so submitting this before it loses my comments... review in progress.

package/kedro_viz/models/graph.py Outdated Show resolved Hide resolved
package/kedro_viz/models/graph.py Outdated Show resolved Hide resolved
@@ -384,9 +384,8 @@ def test_partitioned_data_node_metadata(self):
data_node_metadata = DataNodeMetadata(data_node=data_node)
assert data_node_metadata.filepath == "partitioned/"

@patch("builtins.__import__", side_effect=import_mock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this removed? Seems to me that we should be able to run the tests without plotly installed (related to the point of removing plotly from requirements.txt).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. @rashidakanchwala? Should I add it back in?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I removed it, because it wasn't actually being used. :S

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is used even though patched_import is not explicitly mentioned in the test. When you @patch it decorates the test function, and doing so also passes on an argument to the test function.

But it turns out it's not needed for another reason: because plotly is a requirement of kedro-viz.

src/components/metadata-modal/metadata-modal.js Outdated Show resolved Hide resolved
src/components/metadata-modal/metadata-modal.scss Outdated Show resolved Hide resolved
Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great and all seems to work perfectly!

I will check the thing about requirements. Other than that, I just have minor comments and suggestions really - let me know if you want to discuss anything further!

<div className="pipeline-plot-modal__bottom">
<button
className="pipeline-plot-modal__collapse-plot"
onClick={onCollapsePlotClick}
>
<CollapseIcon className="pipeline-plot-modal__collapse-plot-icon"></CollapseIcon>
<span className="pipeline-plot-modal__collapse-plot-text">
Collapse Plotly Visualization
{metadata?.plot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, but this feels a bit weird to me:

  • here we're doing "if metadata has plot then do plotly; else do matplotlib"
  • in metadata.js we're doing "if metadata hasPlot then do plotly; if metadata hasImage then do matplotlib". This seems logically the same as graph.py with is_plot_node and is_image_node
  • in config.js we have shortTypeMapping which looks like it could also be used to implement the "what type of node is this?" logic

Obviously, the logic here works fine because we only two options that generate the modal, so it's safe to assume that if it's not a plotly then it's matplotlib. But it just feels a little hard to follow to me for anyone reading the codebase afresh. It would be nice if we had some consistent way of handling the "what type of node is this?" logic. Ideally I think we would just use the node's dataset_type property to switch between different behaviours rather than looking to see whether a node contains image or plot.

Possibly this is a premature optimisation since we don't have more than two special cases yet, and it would undoubtedly require a bigger rewrite than the scope of this PR. But it might at least be worth seeing if the logic in this file can be made consistent with the metadata.js logic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that a node with a dataset_type of plotly or matplotlib won't contain and image or a plot? If it is, I think we'll still need to check for either image or plot. In fact, this did happen to me yesterday. I added in my new node but didn't do kedro run, so the image wasn't there but the node on the flowchart was present, as was the metadata panel.

I agree with everything else. The logic isn't identical and should be. I'll update it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that is a very good point. If you haven't done a kedro run yet then plot/image won't be defined.

In this case I guess there's two bits of logic we need:

  • something to check the dataset_type (probably using shortTypeMapping? Something like switch shortTypeMapping[dataset_type] or whatever)
  • something to check whether the dataset actually has image or plot defined

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

For now, I've added in these two lines, which is what's done in the metadata file, as you called out above:

const hasPlot = Boolean(metadata?.plot);
const hasImage = Boolean(metadata?.image);

Hopefully that makes things a little more clear. Do you think it's enough? I'm wary of over-optimizing further at this point.

Copy link
Contributor

@antonymilne antonymilne Jun 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that's perfect, no need to do any more for now. My only suggestion would be to change this

if (!visible.metadataModal || (!metadata?.plot && !metadata?.image)) {

to use the new hasPlot and hasImage variables. Same goes for the stuff on L65-67.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, curious: is there any reason these explicitly cast to Boolean? Seems like some things in metadata.js are and some aren't:

  const hasCode = Boolean(metadata?.code);
  const isTranscoded = Boolean(metadata?.originalType);
  const showCodePanel = visible && visibleCode && hasCode;
  const showCodeSwitch = hasCode;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Added those two to this, as suggested:

if (!visible.metadataModal || (!metadata?.plot && !metadata?.image)) {

The first two are cast as such because they're being converted from strings. The last two don't need it, since they're created from booleans they'll be that type, too.

package/tests/test_models/test_graph/test_graph_nodes.py Outdated Show resolved Hide resolved
Signed-off-by: Tynan DeBold <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
Signed-off-by: Antony Milne <[email protected]>
@antonymilne
Copy link
Contributor

OK, tests should all be fixed now 🤞

In response to my previous question about requirements, pandas and plotly should remain as core kedro-viz requirements because they're needed for the plot shown in the metadata panel for MetricsDataSet. See #554 (comment). matplotlib shouldn't be there though. The right way to get the tests working without matplotlib installed is using the import_mock.

@rashidakanchwala
Copy link
Contributor

OK, tests should all be fixed now 🤞

In response to my previous question about requirements, pandas and plotly should remain as core kedro-viz requirements because they're needed for the plot shown in the metadata panel for MetricsDataSet. See #554 (comment). matplotlib shouldn't be there though. The right way to get the tests working without matplotlib installed is using the import_mock.

I am confused. react-plotly.js is what is used for the plot shown in the FE.

@antonymilne
Copy link
Contributor

@rashidakanchwala yep, but on the backend there's also some stuff done with pandas and plotly to generate the metrics plots: https://github.com/kedro-org/kedro-viz/blob/main/package/kedro_viz/models/graph.py#L668.

Copy link
Contributor

@rashidakanchwala rashidakanchwala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work on your python PR :) looks great!

just nit on the json file having a newline.

also, please if you get a chance to update the release docs with the missing stuff shared on slack :)

Signed-off-by: Tynan DeBold <[email protected]>
@tynandebold tynandebold merged commit 845353d into main Jun 8, 2022
@tynandebold tynandebold deleted the feature/show-matplotlib-dataset-pngs branch June 8, 2022 08:57
cvetankanechevska pushed a commit that referenced this pull request Jun 9, 2022
* displaying the pngs in the metadata panel

Signed-off-by: Tynan DeBold <[email protected]>

* renaming of files, etc; expand matplotlib img

Signed-off-by: Tynan DeBold <[email protected]>

* fix comment in type

Signed-off-by: Tynan DeBold <[email protected]>

* format fix

Signed-off-by: Tynan DeBold <[email protected]>

* fix import

Signed-off-by: Tynan DeBold <[email protected]>

* another format fix

Signed-off-by: Tynan DeBold <[email protected]>

* added matplotlib to reqs

* Added tests

* fix height issue; add matplotlib node to demo project

Signed-off-by: Tynan DeBold <[email protected]>

* update release.md

Signed-off-by: Tynan DeBold <[email protected]>

* Change requirements setup

Signed-off-by: Antony Milne <[email protected]>

* pr-review fixes

Signed-off-by: Tynan DeBold <[email protected]>

* Fix tests

Signed-off-by: Antony Milne <[email protected]>

* Fix tests

Signed-off-by: Antony Milne <[email protected]>

* cleanup

Signed-off-by: Tynan DeBold <[email protected]>

* update release notes

Signed-off-by: Tynan DeBold <[email protected]>

Co-authored-by: Rashida Kanchwala <[email protected]>
Co-authored-by: Antony Milne <[email protected]>
Signed-off-by: Cvetanka <[email protected]>
@tynandebold tynandebold mentioned this pull request Jun 16, 2022
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Show Matplotlib dataset pngs in the metadata panel
3 participants