-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[window] Add GpuWindowExec requiredChildOrdering #645
Conversation
WindowExec has checks for child ordering and distribution. This commit adds requiredChildOrdering and requiredChildDistribution to GpuWindowExec. Signed-off-by: Mithun RK <[email protected]>
0655ff7
to
99c67db
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM based on comparing to the Spark source code. This seems consistent.
The changes LGTM! Curious if there is a way to test this change? |
Haven't a Scooby Doo, truth be told. :] This change is to assuage concerns that our Exec doesn't share the signature/preconditions of As an aside, there's one more change coming to this PR. |
The main reason to do this is for the optimizations that we have where we could remove a sort, like with operations where we can do a hash aggregate instead of a sort aggregate or where we might replace a sort-merge join with a hash based join. In those cases the output of the join/aggregate operation will not match with how the plan was originally setup, and if spark optimized out a sort, because it was relying on the output of one of the operators already being sorted, then without this patch we will not add it back in properly. I don't know how often in the real world this would happen, but it is a possibility. |
P.S I can look into concocting a test. This change declares what the expected ordering of its inputs are. So this is an assertion check against losing that ordering. |
Hey, thanks Bobby. (Missed your reply.) Thanks for confirming. Based on how we're currently replacing ops/execs, I think we might not run into that scenario (of losing the ordering). Unless AQE changes things in some way. |
Makes sense, we have such code in Partitioning and other places too but I was mainly curious what this would really change as far as functionality goes, so thanks for the explanation Bobby, Mithun. :) |
4e3395e
to
99c67db
Compare
Chaps, apologies for the churn on this PR, during which I seemed to have dismissed approvals from @kuhushukla and @andygrove. I intended to push a workaround for #218 to this PR. It slipped my mind that we will be squashing commits. I will raise a separate PR to address #218 for the |
seems fine, I do worry a little about the Databricks code since we had different output columns there. I would think this is ok since its basically just a pass through. |
Yeah, I did consider that. I think we should be in the clear, since this is the input ordering. |
build |
Thanks for the reviews, @andygrove, @kuhushukla, @revans2, @tgravescs! |
WindowExec has checks for child ordering and distribution. This commit adds requiredChildOrdering and requiredChildDistribution to GpuWindowExec. Signed-off-by: Mithun RK <[email protected]>
WindowExec has checks for child ordering and distribution. This commit adds requiredChildOrdering and requiredChildDistribution to GpuWindowExec. Signed-off-by: Mithun RK <[email protected]>
Closes #486.
WindowExec has checks for child ordering and distribution.
This commit adds requiredChildOrdering and requiredChildDistribution
to GpuWindowExec.
Signed-off-by: Mithun RK [email protected]