Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-22.2: xform: use ordering from LIMIT as a hint for streaming group-by #94603

Merged
merged 2 commits into from
Jan 5, 2023

Conversation

msirek
Copy link
Contributor

@msirek msirek commented Jan 3, 2023

Backport 1/1 commits from #93858.

/cc @cockroachdb/release


Fixes #93410

A query with a grouped aggregation, a LIMIT and an ORDER BY may not always explore the best-cost query plan.

Due to the existence of unique constraints on a table, the set of grouping columns may be reduced during normalization via rule ReduceGroupingCols such that it no longer includes columns present in the ORDER BY clause. This eliminates possibly cheap plans from consideration, for example, if the input to the aggregation is a lookup join, it may be cheaper to sort the input to the lookup join on the ORDER BY columns if they overlap with the grouping columns, so that a streaming group-by with no TopK operator can be used, and a full scan of the inputs to the join is avoided.

This fix adds a new exploration rule which ensures that a grouped aggregation with a LIMIT and ORDER BY clause considers using streaming group-by with no TopK when possible.

Release note (bug fix): This patch fixes join queries involving tables with unique constraints using LIMIT, GROUP BY and ORDER BY clauses to ensure the optimizer considers streaming group-by with no TopK operation, when possible. This is often the most efficient query plan.

Release justification: low risk fix for missing exploration of streaming group-by, disabled by default


sql: add optimizer_use_limit_ordering_for_streaming_group_by session setting

This commit adds the optimizer_use_limit_ordering_for_streaming_group_by
session setting that defaults to false. When true, an exploration
rule which uses the ordering specified in a
SELECT ... GROUP BY ... ORDER BY ... LIMIT n; statement as an
interesting ordering to require from the input to the group-by
expression, possibly eliminating a top-k operation. When false, the
exploration rule is disabled. This setting allows backport of the new
exploration rule without impacting query plans of running clusters.

Release note: None

@msirek msirek requested a review from a team as a code owner January 3, 2023 06:53
@msirek msirek requested a review from cucaroach January 3, 2023 06:53
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@msirek msirek added the do-not-merge bors won't merge a PR with this label. label Jan 3, 2023
@msirek msirek marked this pull request as draft January 3, 2023 06:53
@msirek msirek force-pushed the backport22.2-93858 branch from 1f154d4 to f7a9ff4 Compare January 3, 2023 22:19
@msirek msirek removed the do-not-merge bors won't merge a PR with this label. label Jan 3, 2023
@msirek msirek marked this pull request as ready for review January 3, 2023 22:22
@msirek msirek changed the title [DNM] release-22.2: xform: use ordering from LIMIT as a hint for streaming group-by release-22.2: xform: use ordering from LIMIT as a hint for streaming group-by Jan 3, 2023
@msirek msirek force-pushed the backport22.2-93858 branch from f7a9ff4 to f8033dc Compare January 3, 2023 22:37
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r1, 10 of 10 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @rharding6373)

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a pressing motivation for backporting this rule to v22.2? I think we should probably gate it behind a session setting to avoid changing customers' query plans in a patch release.

FYI, you can backport commits from two PRs with the backport tool: backport 12345 98765 -r 22.2 (not sure if that's what you did, but looks like you did this more manually based on the PR description).

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @rharding6373)

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a pressing motivation for backporting this rule to v22.2?

This is being backported per request from @Leeeeeeeroy-Jenkins for issue https://github.com/cockroachlabs/support/issues/1949

I think we should probably gate it behind a session setting to avoid changing customers' query plans in a patch release.

It is gated behind session setting optimizer_use_limit_ordering_for_streaming_group_by in this PR.

FYI, you can backport commits from two PRs with the backport tool...

Thanks

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @rharding6373)

@mgartner
Copy link
Collaborator

mgartner commented Jan 4, 2023

I think we should probably gate it behind a session setting to avoid changing customers' query plans in a patch release.

I just saw #94672. Can you include that commit in this backport?

@mgartner
Copy link
Collaborator

mgartner commented Jan 4, 2023

It is gated behind session setting optimizer_use_limit_ordering_for_streaming_group_by in this PR.

Oh I see, sorry!

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented the session setting first on 22.2, that's why you don't see a backport message of the 2nd commit.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @rharding6373)

Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @cucaroach)

Mark Sirek added 2 commits January 4, 2023 20:46
Fixes cockroachdb#93410

A query with a grouped aggregation, a LIMIT and an ORDER BY
may not always explore the best-cost query plan.

Due to the existence of unique constraints on a table, the set of
grouping columns may be reduced during normalization via rule
ReduceGroupingCols such that it no longer includes columns
present in the ORDER BY clause. This eliminates possibly cheap
plans from consideration, for example, if the input to the
aggregation is a lookup join, it may be cheaper to sort the
input to the lookup join on the ORDER BY columns if they overlap
with the grouping columns, so that a streaming group-by with no
TopK operator can be used, and a full scan of the inputs to
the join is avoided.

This fix adds a new exploration rule which ensures that a grouped
aggregation with a LIMIT and ORDER BY clause considers using
streaming group-by with no TopK when possible.

Release note (bug fix): This patch fixes join queries involving
tables with unique constraints using LIMIT, GROUP BY and ORDER BY
clauses to ensure the optimizer considers streaming group-by with
no TopK operation, when possible. This is often the most efficient
query plan.
…setting

This commit adds the `optimizer_use_limit_ordering_for_streaming_group_by`
session setting that defaults to `false`. When `true`, an exploration
rule which uses the ordering specified in a
`SELECT ... GROUP BY ... ORDER BY ... LIMIT n;` statement as an
interesting ordering to require from the input to the group-by
expression, possibly eliminating a top-k operation.  When `false`, the
exploration rule is disabled. This setting allows backport of the new
exploration rule without impacting query plans of running clusters.

Release note: None
@msirek msirek force-pushed the backport22.2-93858 branch from f8033dc to d102acb Compare January 5, 2023 05:28
Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @cucaroach and @DrewKimball)

@msirek msirek merged commit 55469ba into cockroachdb:release-22.2 Jan 5, 2023
@msirek msirek deleted the backport22.2-93858 branch January 5, 2023 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants