Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram autobin #3044

Merged
merged 10 commits into from
Oct 4, 2018
Merged

Histogram autobin #3044

merged 10 commits into from
Oct 4, 2018

Conversation

alexcjohnson
Copy link
Collaborator

Closes #3001

A thorough update of our autobinning algorithms for 1D and 2D histograms:

  • Allows partial bin specification: each attribute (size, start, and end) can be provided or omitted independently, and autobin figures out the rest.
  • Removes the autobin(x|y) attributes, but maintains their behavior in both newPlot/react and restyle.
  • Forces stacked/grouped histograms to match size and have compatible start values.
    • The first explicit size applies to all traces in the group
    • The first explicit start is used precisely, all other traces we use either the explicit start or that trace's minimum data value but shift it so the bin edges match.
    • 2D histograms have no cross-trace coupling, only 1D.
  • Notice that I made no changes to any of the mocks or baselines. Changes are limited to more complex cases, particularly concerning multiple grouped autobinned traces.

I also fixed a couple of small issues that cropped up along the way:

  • I let cleanDate behave more like coerce by silently returning the default value on undefined input, but still generating a log error message for invalid dates and adding log errors for non-finite numbers. c5cb42c
  • We had a funny issue where legend.traceorder would lose its reversed default when you hid all traces in the legend... so clicking the same spot after that would re-show a different trace than the one you intended! To fix this I had supplyDefaults run supplyLayoutDefaults for all trace modules, not just visible ones. I suppose this could be overkill as it will include visible: false as well as legendonly, but that seems to me better than what we have now. e066bb9

cc @etpinard @antoinerg

@etpinard etpinard added this to the v1.42.0 milestone Sep 25, 2018
@etpinard etpinard added bug something broken feature something new status: reviewable labels Sep 25, 2018
Copy link
Contributor

@etpinard etpinard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this worked out pretty well 💪 ⛵ 🍸

I really like the new-found use of _module.cleanData. It's nice way to confine somewhat-hacky logic to a small scope. Going forward, it might be the solution to other "missing fields in fullData" bugs cc #2834.

I found one thing that's probably worth adding to the PR description for future reference: how auto-binning now considers sample data from multiple traces.

Now, I'll probably need to go through Histogram.calc again to make full sense of it, but here are a few blocking comments in the meantime:

  • We should ask RCE team to QA this thing (cc @nicolaskruchten )
  • We should test this in the old toolpanel
  • We should append the (x|y)bins* attribute descriptions. Mainly, histogram and histogram2d should not longer use the same description (as histogram has cross-trace coupling). Moreover, we should make sure _deprecated attributes still show up on https://plot.ly/javascript/reference/
  • Jasmine test coverage looks solid, but we should add at least one image mock
  • Get Plotly.validate to work with (x|y)bins* attributes

src/plots/plots.js Show resolved Hide resolved
src/traces/histogram2dcontour/attributes.js Outdated Show resolved Hide resolved
src/plot_api/plot_api.js Show resolved Hide resolved
src/traces/bar/layout_defaults.js Show resolved Hide resolved

handleBinDefaults(traceIn, traceOut, coerce, [sampleLetter]);
// Note: bin defaults are now handled in Histogram.cleanData
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's probably bad news for Plotly.validate.

We'll need to either bring back the coerce calls here or make plot_api/validate.js call _module.cleanData. Come to think of it, the latter sounds like the "correct" thing to do with or w/o this new logic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_module.cleanData is in supplyDefaults so this doesn't cause problems. I'll add a test though to 🔒 it.

Given its location in the pipeline - and the uses we're accreting for it - cleanData is probably not a great name, perhaps we should change it to _module.crossTraceDefaults?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_module.cleanData is in supplyDefaults so this doesn't cause problems.

Oh right, I got confused with the only cleanData. Great!

is probably not a great name, perhaps we should change it to _module.crossTraceDefaults?

This would be great! No need to do in this PR of course. A new issue or a comment in #749 would be good enough for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 autobin/bins in validate -> 76eee4b

cleanData -> crossTraceDefaults -> 5d930ce

src/traces/histogram/clean_data.js Outdated Show resolved Hide resolved
src/traces/histogram/clean_data.js Outdated Show resolved Hide resolved
src/traces/histogram/calc.js Show resolved Hide resolved
src/traces/histogram/calc.js Show resolved Hide resolved
src/traces/histogram/calc.js Show resolved Hide resolved
@nicolaskruchten
Copy link
Contributor

I'd be happy to give this guy an RCE QA pass when it's ready!

@nicolaskruchten
Copy link
Contributor

Should we wait till it hits master and do it as part of pre-release QA? or want us to do it pre-dancer?

@alexcjohnson
Copy link
Collaborator Author

@nicolaskruchten pre-dancer might be nice, as it's going to need some changes in RCE - ie in order to re-enable autobin after this PR we either need a special button to delete the (x|y)bins container or (maybe even better) the "clear this field" option we've been discussing wanting to add to all fields - that way the user could use the full new functionality of auto-determining start, end, and size all separately.

Note that even if (as @etpinard suggested above) I put autobin(x|y) back into the schema, it still won't show up in fullData (if you went in and inserted it in the trace anyway or used it in restyle we do preserve the previous behavior... but it's just a compatibility shim, we immediately delete it from the trace)

@etpinard
Copy link
Contributor

@nicolaskruchten pre-dancer might be nice,

+1 Pre-dancer would be best.

@alexcjohnson
Copy link
Collaborator Author

We should test this in the old toolpanel

I assume we have an autobin toggle there, it's going to need to change to something like what I described in #3044 (comment) (probably the simpler option is better for toolpanel)

We should append the (x|y)bins* attribute descriptions. Mainly, histogram and histogram2d should not longer use the same description (as histogram has cross-trace coupling). Moreover, we should make sure _deprecated attributes still show up on https://plot.ly/javascript/reference/

improved the attribute descriptions - and moved them out of _deprecated - in 76eee4b
filed issue for reference docs plotly/documentation#1104

Jasmine test coverage looks solid, but we should add at least one image mock

good call, I also used that to ensure the subplot logic works right -> 7909f53

Get Plotly.validate to work with (x|y)bins* attributes

part of 76eee4b

@@ -438,17 +438,17 @@ plots.supplyDefaults = function(gd, opts) {
plots.supplyLayoutModuleDefaults(newLayout, newFullLayout, newFullData, gd._transitionData);

// Special cases that introduce interactions between traces.
// This is after relinkPrivateKeys so we can use those in cleanData
// This is after relinkPrivateKeys so we can use those in crossTraceDefaults
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grepping for cleanData in streambed, I found:

image

@VeraZab does this ring a bell? Is that line still getting used?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 157351 - looks like it's pulling in a whole plotly.js bundle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good 👁️ , false alarm.

@etpinard
Copy link
Contributor

All my comments have been addressed 🥇

Hopefully the RCE will be happy with the changes too 🙏

@alexcjohnson
Copy link
Collaborator Author

https://github.com/plotly/streambed/pull/11589 adapted the toolpanel for this change. @nicolaskruchten @VeraZab have you gotten a chance to look at integrating it with RCE, per #3044 (comment)?

@nicolaskruchten
Copy link
Contributor

I have not, but I can do it today :)

@nicolaskruchten
Copy link
Contributor

OK, here's a WIP ... behaving not great: https://github.com/plotly/react-chart-editor/tree/autobin ... that branch will stay up to date with this one, so just rm -rf node_modules && yarn install to resync, then yarn start to run the dev app which includes a JSON inspector and some sample data. tips.total_bill has a nice distribution :)

@alexcjohnson
Copy link
Collaborator Author

After we got it plumbed up to the correct code, @nicolaskruchten assures me that RCE is in fact happy with these changes - they will not have the bandwidth to add reset to auto values right away, but that's a general feature in the queue to support for all fields so it won't be missing for too long.

@etpinard AFAICT this is now ready to go. Anything else?

@etpinard
Copy link
Contributor

etpinard commented Oct 4, 2018

Anything else?

Nothing else. 💃

@alexcjohnson alexcjohnson merged commit 30ed4a4 into master Oct 4, 2018
@alexcjohnson alexcjohnson deleted the histogram-autobin branch October 4, 2018 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something broken feature something new
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reimagining histogram autobin
3 participants