Thinking about improving data updates #647
Comments
@monfera Thanks for the feedback. I'd vote 👎 for bringing in any functional reactive programming library. Like you point out, our update system is in dire need of a refactor, but an in-house solution seems best in terms of scaling and maintenance. We already have decent building block namely […]
Beat me to it. In any case: while I agree that we should improve our data model, I'm not sure that using something like MobX is the right choice. Reactive programming is great for UI and scenarios where updates don't require immense calculation and/or are direct in their codepaths, but for our uses, I imagine we would quickly find ourselves patching things to fit our use case, where sometimes the data needs to be transformed by A, then B, then A again, before we can render. What these libraries provide also loses value when not working with in-memory state. There is plenty of plotly.js state wrapped up in the SVG DOM, so until we separate that out, the transition would be quite rocky. I hate to reinvent the wheel, but I'm of the opinion that what we may need is closer to a tractor tread. I'd advocate instead for creating a strict pattern for updates that works for us, and it very likely will be more of a puppetmaster pattern.
Whether we expect it from our own utilities/patterns or an external library, do we have roughly similar notions in mind about the needs?
Things like the existing codebase, test case coverage, documentation, examples etc. incorporate a lot of work already spent, and lessons learnt. Which is why I'd like to learn more about Puppetmaster (is it this one?). Also, what do you mean by tractor tread in this context?
@mdtusz on second thought, you more likely mean Puppet (vs Chef), the idempotence concept etc.
I wasn't really referring to Chef/Puppet at all - those aren't really relevant here. I meant more just a pattern where some section of code is in charge of orchestrating our update operations - albeit in a cleaner and more organized way than we currently do. Perhaps using the term puppetmaster was misleading.
Closing it in favor of #648.
tl;dr
There's more and more code that couples the aspect of plotting logic with the aspect of incrementally propagating changes, e.g. see all the things going on in `Plotly.restyle`. It would be good to discuss ways to improve on the situation. Manual code leads to a tangle, and some small, simple library focused on change propagation, e.g. MobX, would be worth looking into.
Plotting turns a stream of user intent into a stream of side effects such as DOM updates
Plotting can be conceived of as a black box:
- input: `Plotly.plot`, `Plotly.restyle`, `Plotly.relayout`, animation-inducing user calls, as well as DOM events such as `window.resize` and `mousedown`
The use of 'stream' highlights the fact that user pointer operations, restyle/relayout, animation etc. generally make plotting a temporal process, rather than something that can be modeled as a function with some input JSON and an output SVG, even if some of the uses are as simple as this special case.
Plotting logic is a directed acyclic graph of computation nodes
We have multiple pieces of input (e.g. `data[0].x`) on the input side and DOM mutating calls as the output. However, there's complex calculation in the middle that can be thought of as a DAG. For example,
- the `x` vector serves as the basis for calculating a `[min, max]` domain that will determine the bounds of the X axis
- the `x` vector is also trivially input to scatter point positions; however, a `scale` transform converts domain values to e.g. pixel coordinates
- […] `x` vector
- […] `x` vector is; maybe defaulting from scatterplot to a density plot at some threshold
All such calculations themselves can be input to downstream calculations.
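As a rough illustration of the first two dependencies listed above, the sketch below wires three computation nodes together by hand. The helper names (`extent`, `linearScale`, `pointPositions`) and the pixel range are assumptions made for this example; they are not plotly.js functions.

```javascript
// Hypothetical helpers for illustration only; not part of plotly.js.

// Node 1: the [min, max] domain derived from the x vector.
function extent(values) {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return [min, max];
}

// Node 2: a linear scale that maps the domain onto a pixel range.
function linearScale([d0, d1], [r0, r1]) {
  const span = d1 - d0 || 1; // guard against a degenerate domain
  return (v) => r0 + ((v - d0) / span) * (r1 - r0);
}

// Node 3: point positions depend on both the x vector and the scale.
function pointPositions(values, scale) {
  return values.map(scale);
}

// Edges of the DAG: x -> domain -> scale -> positions, and x -> positions.
const x = [2, 4, 10];
const domain = extent(x);
const scale = linearScale(domain, [0, 400]); // assumed 400px wide plot area
const positions = pointPositions(x, scale);
```

Here every node is recomputed on each run, which is exactly the naive approach; the point is only to make the dependency edges visible.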
Plotting needs to be economical
While it would be possible to make a single function whose inputs are `{domRoot, userIntentHistory}`, it's impractical: response times with a naive implementation would be too high (keeping `userIntentHistory` around has only a modest impact on size). There's no way to recompute everything from scratch and expect a 60FPS frame rate when turning a WebGL plot or animating something.
This means that there needs to be some kind of caching, and therefore state management. The sole purpose of maintaining state is caching (besides this, we may retain `userIntentHistory` to allow time travel, and of course the output streams are linked to calls that modify the DOM).
Means of reducing recomputation costs
Ideally we'd like to
- […] `x` it needs to lead to an increased visible X axis domain, provided it's set to automatic. However, if the newly inserted value is inside the bounds, there's no need to recalculate anything that depends only on the `[min, max]` domain. Sure, sometimes there's no harm due to speed of recalculation or lack of need for speed, but there are cases when it's useful to be fairly granular about recalculations due to some specific performance need. Solving these specific performance needs one by one, without a formal change propagation approach, is brittle.
- […] `[min, max]` bounds, as opposed to inserting it in the preexisting large vector and applying the vector `extent` calculation. Similarly, many types of aggregates can be calculated on-line as well as in batch, for example mean, variance and standard deviation.
Some possible tools
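Before looking at libraries, the incremental bounds check and on-line aggregates described above can be sketched by hand. The class below is a hypothetical illustration (the name `RunningStats` and its methods are made up for this example). It uses Welford's on-line algorithm for mean and variance, which is one common choice and not something the discussion prescribes.

```javascript
// Hypothetical sketch; names are illustrative and not part of plotly.js.
class RunningStats {
  constructor() {
    this.count = 0;
    this.mean = 0;
    this.m2 = 0; // running sum of squared deviations from the mean
    this.min = Infinity;
    this.max = -Infinity;
  }

  // Folds one new value into the aggregates without revisiting old values.
  // Returns true only when the [min, max] bounds changed, i.e. when
  // anything that depends solely on the domain needs recalculation.
  push(value) {
    this.count += 1;
    const delta = value - this.mean;
    this.mean += delta / this.count;
    this.m2 += delta * (value - this.mean);

    let boundsChanged = false;
    if (value < this.min) { this.min = value; boundsChanged = true; }
    if (value > this.max) { this.max = value; boundsChanged = true; }
    return boundsChanged;
  }

  // Population variance; NaN until at least one value has been pushed.
  get variance() {
    return this.count > 0 ? this.m2 / this.count : NaN;
  }

  get standardDeviation() {
    return Math.sqrt(this.variance);
  }
}

const stats = new RunningStats();
const boundsChanges = [2, 10, 4].map((v) => stats.push(v));
```

In this usage the third value, 4, falls inside the existing bounds, so `push` returns `false` for it and a caller could skip any work that depends only on the axis domain.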
Handwritten userland JavaScript isn't well suited to managing a dependency graph, because given enough nodes and optimization rounds, there will inevitably be cache invalidation issues and, potentially, memory leaks. Keeping things consistent and in sync is also a challenge, especially in the presence of asynchronous events. Most importantly, coupling the plot logic aspect with the incremental recalculation aspect makes both aspects hard to decipher, debug and further develop.
There are a lot of tools that provide some kind of framework for calculating and propagating values that can change over time, responding to input, inspired by Functional Reactive Programming. Without endorsing any of these excellent libraries (xstream, most.js etc.) perhaps MobX would feel closest to the current architecture in that it gives you objects that have properties acting like calculated spreadsheet cells, and as @etpinard suggested in the 2.0 wishlist, object-oriented, but investigation would be needed to see how it fits. All these libs are around 10k compressed.
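To make the "calculated spreadsheet cells" idea concrete, here is a minimal hand-rolled sketch. It is not the MobX API and does not reflect how MobX is implemented; `cell` and `computed` are hypothetical names, and the pull-based, version-checking design is an assumption chosen to keep the example short.

```javascript
// Hypothetical sketch of spreadsheet-like cells; NOT the MobX API.

// An input cell: holds a value and bumps a version counter on change.
function cell(initial) {
  let value = initial;
  let version = 0;
  return {
    get: () => value,
    set(next) {
      if (next !== value) { value = next; version += 1; }
    },
    version: () => version,
  };
}

// A computed cell: re-evaluates lazily, and only when an input version
// changed. Its own version changes only if the result actually differs,
// so unchanged results do not trigger downstream recalculation.
function computed(inputs, fn) {
  let cached;
  let seen = null; // input versions at the last evaluation
  let version = 0;
  let evaluations = 0;
  const refresh = () => {
    const current = inputs.map((input) => input.version());
    if (seen === null || current.some((v, k) => v !== seen[k])) {
      const next = fn(...inputs.map((input) => input.get()));
      evaluations += 1;
      if (seen === null || next !== cached) version += 1;
      cached = next;
      seen = current;
    }
  };
  return {
    get() { refresh(); return cached; },
    version() { refresh(); return version; },
    evaluations: () => evaluations,
  };
}

const xValues = cell([2, 4, 10]);
const xMax = computed([xValues], (xs) => Math.max(...xs));
const axisLabel = computed([xMax], (max) => `0 to ${max}`);

const firstLabel = axisLabel.get();
xValues.set([2, 4, 10, 7]); // new point inside the existing bounds
const secondLabel = axisLabel.get();
const labelEvaluationsAfterInsideInsert = axisLabel.evaluations();
xValues.set([2, 4, 10, 12]); // new point beyond the old maximum
const thirdLabel = axisLabel.get();
```

With this design the label is not re-evaluated after the in-bounds insert, because `xMax` recomputes to the same value and keeps its version. A real library would additionally handle dependency tracking, batching and disposal, which this sketch leaves out.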
History
We've touched on related topics in the past; a few inspirations:
- `scatter3d` snail trail lines [WIP] #617 (comment)