-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter-groupby interaction #1892
Conversation
@alexcjohnson @etpinard I added some more challenging tests for transforms. The actual change to groupby was smaller than I expected (and deserves a close look). I implemented recursive transform application to lock down the way transforms are recursively expanded and iterated, but I wasn't actually able to come up with a test case that it solves, so I removed it. It might be worth keeping in mind though since I suspect it might be a more robust/extensible way to lock down application of transforms. |
src/transforms/groupby.js
Outdated
// destination. This function does not break the array reference | ||
// connection between the split transforms it creates. That's handled in | ||
// initialize, which creates a new empty array for each arrayAttr. | ||
function cloneTransforms(newTrace) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting fix 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rreusser is that something we should do for all transforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh hah - of course, because the original Lib.extendDeepNoArrays({}, trace)
doesn't dive into the transforms
array. Do we need a Lib.extendDeepOnlyTheseArrays({}, trace, ['transforms'])
? 🙈
is that something we should do for all transforms?
This only applies to transforms that generate multiple traces, right? I do kind of like the idea of recursive transforms, but until we do something like that, that might be more inherently robust, I think we should manage this case-by-case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So… arrayAttrs
does pick up on data_array arrays nested inside transforms. The particular challenge is executing a transform after another has created expanded traces. Because if the first transform splits it into multiple traces, then each expanded trace needs its own copy of the other transforms since they, for example, have different split target
or groups
data. By expanding deep with no arrays and then cloning the data arrays (which are in arrayAttrs
, they all have their own copy that can be filtered appropriately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed a bit of subtlety. Let me check on the identity of the split transforms
. Unless I'm mistaken, something somewhere is cloning those… Let me dig that up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@etpinard I wondered the same. The downfall of transforms at the moment is that there are too many corner cases and different things transforms might need to do to lock them down to a completely airtight state such that they can only do what they need to do. The result is that there are some patterns like this that should not be a concern of individual transforms at all but which might need to be applied to them all separately. Or if a clear pattern arises, then it can be abstracted. At the moment, the sample size is just a bit to small for me to see overarching patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if a clear pattern arises, then it can be abstracted. At the moment, the sample size is just a bit to small for me to see overarching patterns.
Very good answer. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexcjohnson Never mind. I read my code more carefully. I'm just manually transferring the transforms. 👍
src/transforms/groupby.js
Outdated
if(groups[j] !== groupName) continue; | ||
|
||
arrayAttrs.forEach(pasteArray.bind(0, newTrace, trace, j)); | ||
arrayAttrs.forEach(pasteArray.bind(null, newTrace, trace, j)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't new here - except that the change you made reminded me how bind
always confuses me - but it occurs to me that there's no reason to use it at all if we unwind the forEach
to a for
. Also pasteArray
is a bit inefficient in the way it calls Lib.nestedProperty(newTrace, a)
twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 🎉 🎉 Glad to change 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexcjohnson @etpinard I've refactored… actually a fair amount of groupby. It was mostly trivial reordering and inlining in order to cut down on function calls and slice
and concat
operations, just to streamline things a bit function-call and memory-wise.
Oh, and also it was iterating over all points for each group, which is unnecessary. Now it just iterates over all points once and uses an index to pick the right expanded trace destination.
@monfera's groupby tests saved me here. Refactored, optimized, and fixed some test cases. I'm content. 👍 |
src/transforms/groupby.js
Outdated
|
||
// Initialize empty arrays for the arrayAttrs, to be split in the next step | ||
for(j = 0; j < arrayAttrs.length; j++) { | ||
Lib.nestedProperty(newTrace, arrayAttrs[j]).set([]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice - what you're doing here is pretty similar in fact to what I'm doing with the aggregate transform 🎉
I suspect you could speed it up even a bit more by hanging onto these arrays as you create them, and avoid the inner-loop Lib.nestedProperty
calls below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was bugging me, though I know the effect is small in the grand scheme of things. You know exactly what you'll need, so I just indexed them directly and then looked them up by group index. It's still lots of nestedProperty instantiation, but unless we get fancy (read: fragile), I think this is somewhat minimal for the task at hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
src/transforms/groupby.js
Outdated
|
||
for(i = 0; i < groupNames.length; i++) { | ||
groupName = groupNames[i]; | ||
groupIndex[groupName] = i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is tricky. It's actually the array index that I'm indexing. So it should be called indexIndex
😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indexLookup
. Better, maybe. A variable by any other name would smell… as… hmm… 🌹
Looks great! 💃 |
I'll defer to someone else on whether this is a bugfix or a feature. Since the output for interactions was simply incorrect/nonsensical before, I'd tend to consider this a bugfix. |
Yeah. This is a 🪲 fix for sure. |
This PR adds a failing test for filter + groupby. The goal is to rework this into a PR that solves the issue. That may or may not be possible due to the way one is a calcTransform and the other is a transform. I'll dig in and see if I can figure out what it would take to fix this.
See: #1843