Preliminary support for keeping broadcast exchanges on GPU when AQE is enabled #448
build
Note that this is going to conflict with #442 as well.
build
Just one question/nit, but it looks good to me despite it.
extensions: SparkSessionExtensions,
ruleBuilder: SparkSession => Rule[SparkPlan]): Unit = {
// not supported in 3.0.0 but it doesn't matter because AdaptiveSparkPlanExec in 3.0.0 will
// never allow us to replace an Exchange node, so they just stay on CPU
Should we throw an exception here to be sure of that assumption?
Well, this code will still be called when we run with 3.0.0, but we can't inject the rule because that feature isn't available in 3.0.0.
When the plugin runs against 3.0.0 with AQE on, our optimizer rules will only get applied to the children of any exchange nodes.
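To make the 3.0.0 behavior concrete, here is a minimal toy sketch of it. The `Plan`, `Scan`, `Project`, and `Exchange` types and the `canReplaceExchange` flag are hypothetical stand-ins invented for illustration; they are not Spark or plugin classes, and the real replacement logic is more involved.

```scala
// Hypothetical stand-ins for plan nodes; not Spark or plugin types.
sealed trait Plan
case class Scan(onGpu: Boolean = false) extends Plan
case class Project(child: Plan, onGpu: Boolean = false) extends Plan
case class Exchange(child: Plan, onGpu: Boolean = false) extends Plan

object AqeModel {
  // Move every operator to the GPU. When canReplaceExchange is false
  // (modelling 3.0.0 with AQE on), Exchange nodes stay on the CPU while
  // their children are still converted.
  def toGpu(plan: Plan, canReplaceExchange: Boolean): Plan = plan match {
    case Scan(_) =>
      Scan(onGpu = true)
    case Project(child, _) =>
      Project(toGpu(child, canReplaceExchange), onGpu = true)
    case Exchange(child, _) =>
      Exchange(toGpu(child, canReplaceExchange), onGpu = canReplaceExchange)
  }
}
```

Calling `AqeModel.toGpu(Project(Exchange(Scan())), canReplaceExchange = false)` leaves only the `Exchange` on the CPU, which is the situation described for 3.0.0.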
Signed-off-by: spark-rapids automation <[email protected]>
This PR adds support for injecting a query stage preparation rule for Spark versions 3.0.1 and 3.1.0 to tag the SparkPlan with any reasons that operators cannot be supported on the GPU. It also updates the GpuOverrides checks for BroadcastExchangeExec to check for any tagged reasons when new query stages are created (when AQE is enabled).
Note that these changes won't have any effect on functionality yet but will be leveraged once SPARK-32332 is merged.
The TPCH integration tests run with AQE both on and off and exercise this new code path.
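The tagging idea in the description can be sketched roughly as follows. This is a self-contained toy model under stated assumptions: `Node`, `ReasonsKey`, `tagPlan`, and `canRunOnGpu` are hypothetical names made up for this example and do not correspond to the plugin's actual code or to Spark's tagging API.

```scala
import scala.collection.mutable

// Hypothetical plan node that carries mutable tags; not a Spark class.
final class Node(val name: String, val children: Seq[Node]) {
  val tags: mutable.Map[String, Set[String]] = mutable.Map.empty
}

object TaggingSketch {
  // Hypothetical tag key under which "cannot run on GPU" reasons are stored.
  val ReasonsKey = "gpuUnsupportedReasons"

  // Preparation step: walk the whole plan and record a reason on every
  // operator whose name appears in the `unsupported` map.
  def tagPlan(plan: Node, unsupported: Map[String, String]): Unit = {
    unsupported.get(plan.name).foreach { reason =>
      val existing = plan.tags.getOrElse(ReasonsKey, Set.empty[String])
      plan.tags(ReasonsKey) = existing + reason
    }
    plan.children.foreach(tagPlan(_, unsupported))
  }

  // Later check, made when a new query stage is created: an operator may
  // go to the GPU only if no reasons were tagged on it.
  def canRunOnGpu(node: Node): Boolean =
    node.tags.getOrElse(ReasonsKey, Set.empty[String]).isEmpty
}
```

Because the reasons live on the plan nodes themselves, the later check does not need to see the whole query again; it only reads what the preparation step recorded.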