Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preliminary support for keeping broadcast exchanges on GPU when AQE is enabled #448

Merged
merged 6 commits into from
Jul 31, 2020

Conversation

andygrove
Copy link
Contributor

@andygrove andygrove commented Jul 28, 2020

This PR adds support for injecting a query stage preparation rule for Spark versions 3.0.1 and 3.1.0 to tag the SparkPlan with any reasons that operators cannot be supported on the GPU. It also updates the GpuOverrides checks for BroadcastExchangeExec to check for any tagged reasons when new query stages are created (when AQE is enabled).

Note that these changes won't have any effect on functionality yet but will be leveraged once SPARK-32332 is merged.

The TPCH integration tests run with AQE on and off and cover test cases that test this new code path.

@andygrove andygrove requested a review from tgravescs July 28, 2020 16:17
@andygrove andygrove added this to the Jul 20 - Jul 31 milestone Jul 28, 2020
@andygrove andygrove added the performance A performance related task/issue label Jul 28, 2020
@andygrove andygrove changed the title Aqe tag broadcast Preliminary support for keeping broadcast exchanges on GPU when AQE is enabled Jul 28, 2020
@andygrove andygrove requested a review from nartal1 July 28, 2020 16:19
@andygrove
Copy link
Contributor Author

build

@tgravescs
Copy link
Collaborator

notes this is going to conflict with #442 as well

@andygrove
Copy link
Contributor Author

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question/nit, but it looks good to me despite it.

extensions: SparkSessionExtensions,
ruleBuilder: SparkSession => Rule[SparkPlan]): Unit = {
// not supported in 3.0.0 but it doesn't matter because AdaptiveSparkPlanExec in 3.0.0 will
// never allow us to replace an Exchange node, so they just stay on CPU
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we throw an exception here to be sure of that assumption?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this code will still be called when we run with 3.0.0 but we can't inject the rule, because that feature isn't available in 3.0.0.

When the plugin runs against 3.0.0 with AQE on, our optimizer rules will only get applied to the children of any exchange nodes.

@andygrove andygrove merged commit 5d7f6b9 into NVIDIA:branch-0.2 Jul 31, 2020
@andygrove andygrove deleted the aqe-tag-broadcast branch July 31, 2020 13:37
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
pxLi pushed a commit to pxLi/spark-rapids that referenced this pull request May 12, 2022
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
Signed-off-by: spark-rapids automation <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance A performance related task/issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants