-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make groupby transform-like op order match original data order #8720
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #8720 +/- ##
================================================
- Coverage 10.67% 10.61% -0.07%
================================================
Files 110 116 +6
Lines 18271 19003 +732
================================================
+ Hits 1951 2017 +66
- Misses 16320 16986 +666
Continue to review full report at Codecov.
|
Table(value_columns._data), periods, fill_value | ||
) | ||
result = self.obj.__class__._from_table(result) | ||
result = self._mimic_pandas_order(result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does a multi-column sort, which we can avoid by appending a column 0...N
to the dataframe before the groupby and then sorting by that single column later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catched up offline, this is certainly a good optimization to our current approach. To achieve this we would require libcudf to perform a "no-op" on the sequence column. However a "no-op" wouldn't fit in our current libcudf aggregation framework because they are required to be binary (reduction) ops.
We discussed alternatives but settled upon it's best to just merge what we have so far and raise an issue to track the optimization thoughts with more people joining the dicussion.
@beckernick The issue(#8714) this PR is fixing was scoped to |
Going to go ahead and move it. |
rerun tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending merge conflicts
@gpucibot merge |
Closes #8714
This PR makes transform-like ops return results with orders matching that of inputs. For example:
groupby.shift
This would affect
groupby.scan
andgroupby.shift
.