Use stable grouping set symbol orderings #18721
d4681c2 to 2b4af81
lgtm
@@ -1887,7 +1886,7 @@ public PhysicalOperation visitGroupId(GroupIdNode node, LocalExecutionPlanContex

int outputChannel = 0;

for (Symbol output : node.getGroupingSets().stream().flatMap(Collection::stream).collect(Collectors.toSet())) {
would it be enough to say toImmutableSet here?
At this particular usage site it would be, but the same logic for producing distinct grouping set symbols is used in a few other places inside GroupIdNode, so I think it makes sense to expose it as a public method.
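A minimal sketch of what such a shared helper could look like, assuming symbols are modeled as plain strings and using only the JDK. The class and method names (GroupingSetSymbols, distinctSymbols) are hypothetical and are not taken from the PR; the point is only that collecting into an order-preserving set keeps each symbol once, in first-encounter order.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the shared helper discussed above; not Trino code.
public class GroupingSetSymbols
{
    // Flattens the grouping sets and keeps each symbol once, in first-encounter order.
    public static Set<String> distinctSymbols(List<List<String>> groupingSets)
    {
        return groupingSets.stream()
                .flatMap(Collection::stream)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args)
    {
        List<List<String>> groupingSets = List.of(
                List.<String>of(),
                List.of("shippriority"),
                List.of("shippriority", "custkey"));
        System.out.println(distinctSymbols(groupingSets)); // prints [shippriority, custkey]
    }
}
```

Every caller that iterates over the result then sees the same order, whereas Collectors.toSet() makes no guarantee about the type or iteration order of the set it returns.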
i think it would be nice to separate "fix stability" from "reduce code duplication". maybe two commits?
Unfortunately all of these changes are necessary to avoid the usage of unstable-ordering Set implementations, so breaking the commit into two would just introduce temporary changes that would immediately be deleted by the code de-duplication refactor. I think it's simpler to leave it as a single commit.
core/trino-main/src/main/java/io/trino/sql/planner/plan/GroupIdNode.java
Previously, GroupIdNode grouping set and output symbol orderings were potentially unstable for the same logical sub-plan due to the use of set constructions that were not order-preserving. While this did not affect correctness, it could result in different symbol orderings within grouping sets in differing branches of UNION ALL operations with the same query on both sides. For example, a query like:

(SELECT shippriority, custkey, sum(totalprice) FROM orders GROUP BY ROLLUP (shippriority, custkey))
UNION ALL
(SELECT shippriority, custkey, sum(totalprice) FROM orders GROUP BY ROLLUP (shippriority, custkey))

could result in logical plans with GroupIdNode grouping sets of either:

1. [[],["tpch:shippriority$gid"],["tpch:shippriority$gid","tpch:custkey$gid"]]
2. [[],["tpch:shippriority$gid"],["tpch:custkey$gid","tpch:shippriority$gid"]]

This does not affect correctness, but it makes the logical plan harder than necessary for a human to interpret.
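To illustrate why iteration order matters for output symbols as well, here is a small sketch, assuming symbols are plain strings, that assigns output channels by iterating over a set of grouping symbols. The names StableChannels and assignChannels are hypothetical and this is not the Trino implementation. With an order-preserving set the assignment follows the declared column order; with a HashSet the JDK leaves the iteration order unspecified, so only the set of symbols is guaranteed to match.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only; names are hypothetical and this is not the Trino implementation.
public class StableChannels
{
    // Assigns consecutive channel numbers in the iteration order of the given set.
    public static Map<String, Integer> assignChannels(Set<String> symbols)
    {
        Map<String, Integer> channels = new LinkedHashMap<>();
        int outputChannel = 0;
        for (String symbol : symbols) {
            channels.put(symbol, outputChannel++);
        }
        return channels;
    }

    public static void main(String[] args)
    {
        List<String> rollupColumns = List.of("shippriority", "custkey");

        // Order-preserving: channels follow the declared column order.
        Map<String, Integer> stable = assignChannels(new LinkedHashSet<>(rollupColumns));
        System.out.println(stable); // prints {shippriority=0, custkey=1}

        // HashSet: same symbols, but the iteration order is unspecified by the JDK.
        Map<String, Integer> unspecified = assignChannels(new HashSet<>(rollupColumns));
        System.out.println(unspecified.keySet().equals(stable.keySet())); // prints true
    }
}
```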
2b4af81 to b4f278a
Description
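Previously, GroupIdNode grouping set and output symbol orderings were potentially unstable for the same logical sub-plan due to the use of set constructions that were not order-preserving. While this did not affect correctness, it could result in different symbol orderings within grouping sets in differing branches of UNION ALL operations with the same query on both sides. For example, a query like:

(SELECT shippriority, custkey, sum(totalprice) FROM orders GROUP BY ROLLUP (shippriority, custkey))
UNION ALL
(SELECT shippriority, custkey, sum(totalprice) FROM orders GROUP BY ROLLUP (shippriority, custkey))

could result in logical plans with GroupIdNode grouping sets of either:

1. [[],["tpch:shippriority$gid"],["tpch:shippriority$gid","tpch:custkey$gid"]]
2. [[],["tpch:shippriority$gid"],["tpch:custkey$gid","tpch:shippriority$gid"]]

This variation could make the logical plan harder than necessary for a human to interpret.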
Release notes
(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: