-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support split non-AST-able join condition for BroadcastNestedLoopJoin #9635
Conversation
Signed-off-by: Ferdinand Xu <[email protected]>
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this looks really good. My main concerns are
- this is only for broadcast nested loop joins. I really would like to see this apply to conditional equi-joins too, as I think that is going to be more common than worst case broadcast nested loop join.
- I would like to see some integration tests, not just unit tests. The unit tests are okay tests, but the integration tests run in more places and on more versions of Spark than the unit tests do.
- Could we add in some tests that deal with higher order functions? We had some issues with higher order functions for tiered project. I don't think there are any problems here, but I would feel a lot better if we actually tested with them in the conditional part of some of these joins.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/AstUtil.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsMeta.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/AstUtil.scala
Outdated
Show resolved
Hide resolved
Still failed some AQE cases. Will take care of them before addressing comments. |
build |
build |
Thanks @revans2 for the reviews. For 1 together with fallback message improvement, I filed a follow-up umbrella issue in #9661. For 2 and 3, updated a few more integration tests.
|
|
||
// 2nd step to replace expression pushing down to child plans in depth first fashion | ||
(condition.map( | ||
_.convertToGpu().mapChildren( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be fine, but I am not 100% sure that convertToGpu
will be idempotent. It might be better to just convert it to the GPU up front and cache it instead of recalling it inside of splitNonAstInternal
. To make that work we probably would have to zip the GPU expression and along with condition.childExprs
. I don't think this is a blocker for the patch to go in. Just being overly cautious about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it seems not very straight forward to eliminate non-idempotent risk here. convertToGpu
only happens when we need to replace it with a new GpuAlias node. We can probably cache those converted expression but in the end (L100), the root BaseExprMeta
will do a real convertToGpu
which converts the entire tree. It seems we can hardly reuse the cache here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine for now, so lets file a follow on issue to look into it. I see a few ways to fix it, but none of them are that simple, and may have their own hidden pitfalls.
build |
|
||
// 2nd step to replace expression pushing down to child plans in depth first fashion | ||
(condition.map( | ||
_.convertToGpu().mapChildren( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine for now, so lets file a follow on issue to look into it. I see a few ways to fix it, but none of them are that simple, and may have their own hidden pitfalls.
I filed #9680 as a follow on issue for this. |
Please remember to build against Databricks by adding [databricks] to the title |
…LoopJoin (NVIDIA#9635)" This reverts commit c20a843.
…LoopJoin (NVIDIA#9635)" This reverts commit c20a843. Signed-off-by: Ferdinand Xu <[email protected]>
…LoopJoin (#9635)" (#9695) This reverts commit c20a843. Signed-off-by: Ferdinand Xu <[email protected]>
…NVIDIA#9635) Signed-off-by: Ferdinand Xu <[email protected]>
…databricks] (#9702) * Update Databricks GpuBroadcastNestedLoopJoinExec for AST splitting change Signed-off-by: Jason Lowe <[email protected]> * Hot fix Signed-off-by: Ferdinand Xu <[email protected]> * Support split non-AST-able join condition for BroadcastNestedLoopJoin (#9635) Signed-off-by: Ferdinand Xu <[email protected]> * Address comments --------- Signed-off-by: Jason Lowe <[email protected]> Signed-off-by: Ferdinand Xu <[email protected]> Co-authored-by: Jason Lowe <[email protected]>
This is to fix #8832. This PR addressed the case that non-ast-able join condition can be extracted and pushed down to join's child nodes.
It's needed to have two follow-ups: 1. code refactor to decouple broadcast batch build logic and nested loop join implementation; 2. bump up cudf version and update related changes in Plugin to address changes from rapidsai/cudf#13072 tracked in #8157