Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Advertise CPU sort order and partitioning expressions to Catalyst [databricks] #3719

Merged
merged 18 commits into from
Oct 4, 2021

Conversation

jlowe
Copy link
Contributor

@jlowe jlowe commented Sep 30, 2021

Fixes #3713

Changes the GPU nodes in the Catalyst plan to advertise CPU SortOrder and Partitioning classes and child expressions. This helps the Spark code better recognize that portions of the plan are still compatible from a sorting and partitioning perspective even after the GPU has updated the plan. Spark has a number of places in the code where it is matching against specific Spark CPU partition classes or checking if CPU sort/partition expressions match, Returning the CPU expressions from the GPU nodes' outputPartitioning, outputOrdering, requiredChildDistribution and requiredChildOrderingmethods helps us pass these checks in Apache Spark code.

This is accomplished by passing both the GPU and CPU expressions to exec nodes that need to override their output ordering or partitioning. It uses the GPU expressions during computation but "advertises" the CPU expressions from the Catalyst methods to query the node's intentions.

@jlowe jlowe self-assigned this Sep 30, 2021
@jlowe
Copy link
Contributor Author

jlowe commented Sep 30, 2021

This depends on #3691. Marking as a draft until the dependency is merged.

@jlowe
Copy link
Contributor Author

jlowe commented Sep 30, 2021

build

@jlowe jlowe marked this pull request as ready for review September 30, 2021 22:02
@jlowe
Copy link
Contributor Author

jlowe commented Sep 30, 2021

build

@sameerz sameerz added the bug Something isn't working label Oct 1, 2021
@sameerz sameerz added this to the Sep 27 - Oct 1 milestone Oct 1, 2021
Copy link
Collaborator

@abellina abellina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this makes sense, I had one comment on whether we wanted to fix or add more explicit asserts if the gpu order was complex (required an eval). I am not sure what the outcome of the failure is here => failed query, or less performant.

@tgravescs
Copy link
Collaborator

build

@jlowe
Copy link
Contributor Author

jlowe commented Oct 1, 2021

build

@jlowe
Copy link
Contributor Author

jlowe commented Oct 1, 2021

build

@jlowe
Copy link
Contributor Author

jlowe commented Oct 1, 2021

Ran into a couple of issues with passing the CPU expressions as part of the regular arguments to the case classes:

  • These arguments would appear in the plan, being mostly redundant with the GPU form of the expressions
  • The assertOnGpu code would "find" these expressions and assert that they're not on the GPU

By passing them as a separate parameter list to the case classes, they no longer appear in the expressions list for these classes. That avoids the CPU expressions appearing in the plan explain output and being seen by generic tree traversal code.

@jlowe
Copy link
Contributor Author

jlowe commented Oct 1, 2021

build

@sameerz sameerz modified the milestones: Sep 27 - Oct 1, Oct 4 - Oct 15 Oct 4, 2021
@tgravescs
Copy link
Collaborator

db 8.2 failures:
` pyspark.sql.utils.IllegalArgumentException: The expression knownfloatingpointnormalized(normalizenanandzero(a#133227)) AS a#133227 is not columnar class org.apache.spark.sql.catalyst.expressions.Alias

`

18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[Left-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[Left-Double][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftSemi-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftSemi-Double][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftAnti-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_right_broadcast_nested_loop_join_condition_missing[LeftAnti-Double][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_left_broadcast_nested_loop_join_condition_missing[Right-Float][IGNORE_ORDER({'local': True}), INCOMPAT]
18:34:16  FAILED ../../src/main/python/join_test.py::test_left_broadcast_nested_loop_join_condition_missing[Right-Double][IGNORE_ORDER({'local': True}), INCOMPAT]

@jlowe
Copy link
Contributor Author

jlowe commented Oct 4, 2021

build

revans2
revans2 previously approved these changes Oct 4, 2021
@jlowe
Copy link
Contributor Author

jlowe commented Oct 4, 2021

build

@jlowe jlowe merged commit e811543 into NVIDIA:branch-21.10 Oct 4, 2021
@jlowe jlowe deleted the fix-aqe-shuffle-coalesce branch October 4, 2021 20:53
cpuLeftKeys,
cpuRightKeys) {

override def otherCopyArgs: Seq[AnyRef] = cpuLeftKeys :: cpuRightKeys :: Nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits:

prefer Seq.empty to Nil.

could also just have cpuLeftKeys ++ cpuRightKeys ?

child: SparkPlan)(
override val cpuPartitionSpec: Seq[Expression]) extends GpuWindowInPandasExecBase {

override def otherCopyArgs: Seq[AnyRef] = cpuPartitionSpec :: Nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why :: Nil

@@ -78,6 +78,9 @@ case class GpuFileSourceScanExec(

override def otherCopyArgs: Seq[AnyRef] = Seq(rapidsConf)

// All expressions are filter expressions used on the CPU.
override def gpuExpressions: Seq[Expression] = Nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Seq.empty

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] AQE shuffle coalesce optimization is broken with Spark 3.2
6 participants