xform: use ordering from LIMIT as a hint for streaming group-by #93858
Conversation
Force-pushed 626b197 to 2dc0c0e
This fix causes a plan change in TPC-H Q18. The old plan takes approximately 4.1s (varying between 4.0s and 4.2s). The new plan with the fix takes approximately 2.5s (usually below this value). Q18:

```sql
SELECT
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice,
sum(l_quantity)
FROM
customer,
orders,
lineitem
WHERE
o_orderkey IN (
SELECT
l_orderkey
FROM
lineitem
GROUP BY
l_orderkey HAVING
sum(l_quantity) > 300
)
AND c_custkey = o_custkey
AND o_orderkey = l_orderkey
GROUP BY
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice
ORDER BY
o_totalprice DESC,
o_orderdate
LIMIT 100;
```

Old plan:

```
distribution: local
vectorized: true
• top-k
│ estimated row count: 100
│ order: -any_not_null,+any_not_null
│ k: 100
│
└── • group (hash)
│ estimated row count: 499,392
│ group by: o_orderkey
│
└── • hash join
│ estimated row count: 2,016,361
│ equality: (o_custkey) = (c_custkey)
│ right cols are key
│
├── • merge join
│ │ estimated row count: 2,000,405
│ │ equality: (l_orderkey) = (o_orderkey)
│ │ right cols are key
│ │
│ ├── • scan
│ │ estimated row count: 6,001,215 (100% of the table; stats collected 8 minutes ago)
│ │ table: lineitem@primary
│ │ spans: FULL SCAN
│ │
│ └── • merge join (semi)
│ │ estimated row count: 509,090
│ │ equality: (o_orderkey) = (l_orderkey)
│ │ left cols are key
│ │ right cols are key
│ │
│ ├── • scan
│ │ estimated row count: 1,500,000 (100% of the table; stats collected 8 minutes ago)
│ │ table: orders@primary
│ │ spans: FULL SCAN
│ │
│ └── • filter
│ │ estimated row count: 509,090
│ │ filter: sum > 300.0
│ │
│ └── • group (streaming)
│ │ estimated row count: 1,527,270
│ │ group by: l_orderkey
│ │ ordered: +l_orderkey
│ │
│ └── • scan
│ estimated row count: 6,001,215 (100% of the table; stats collected 8 minutes ago)
│ table: lineitem@primary
│ spans: FULL SCAN
│
└── • scan
estimated row count: 150,000 (100% of the table; stats collected 9 minutes ago)
table: customer@primary
spans: FULL SCAN
```

New plan:

```
distribution: local
vectorized: true
• limit
│ count: 100
│
└── • group (partial streaming)
│ estimated row count: 499,392
│ group by: o_orderkey, o_totalprice, o_orderdate
│ ordered: -o_totalprice,+o_orderdate
│
└── • lookup join
│ estimated row count: 2,016,361
│ table: lineitem@primary
│ equality: (o_orderkey) = (l_orderkey)
│
└── • lookup join
│ estimated row count: 513,151
│ table: customer@primary
│ equality: (o_custkey) = (c_custkey)
│ equality cols are key
│
└── • sort
│ estimated row count: 509,090
│ order: -o_totalprice,+o_orderdate
│
└── • merge join (semi)
│ estimated row count: 509,090
│ equality: (o_orderkey) = (l_orderkey)
│ left cols are key
│ right cols are key
│
├── • scan
│ estimated row count: 1,500,000 (100% of the table; stats collected 1 minute ago)
│ table: orders@primary
│ spans: FULL SCAN
│
└── • filter
│ estimated row count: 509,090
│ filter: sum > 300.0
│
└── • group (streaming)
│ estimated row count: 1,527,270
│ group by: l_orderkey
│ ordered: +l_orderkey
│
└── • scan
estimated row count: 6,001,215 (100% of the table; stats collected 56 seconds ago)
table: lineitem@primary
spans: FULL SCAN
```
Force-pushed 2dc0c0e to a2875b3
Nice fix! That new plan looks really nice, almost no blocking operators. That being said, are you sure we can't do this with a little less work by improving the ordering propagation in ordering/group_by.go? I haven't thought it through too deeply yet, but maybe in groupByCanProvideOrdering the CanProjectCols check should use the closure of the grouping columns instead of just the grouping columns themselves; this would include any grouping columns that got removed (a sketch of this idea follows below). And of course there would have to be changes to groupByBuildChildReqOrdering and groupByBuildProvided as well.
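A minimal Go sketch of this closure idea, assuming CockroachDB's opt and props packages; the function name canProvideOrderingWithClosure is illustrative, not the actual groupByCanProvideOrdering:

```go
package xform

import (
	"github.com/cockroachdb/cockroach/pkg/sql/opt"
	"github.com/cockroachdb/cockroach/pkg/sql/opt/props"
)

// canProvideOrderingWithClosure sketches the suggested check: instead of
// testing the required ordering against the reduced grouping columns, test
// it against their functional-dependency closure, which re-admits grouping
// columns that ReduceGroupingCols removed.
func canProvideOrderingWithClosure(
	fds *props.FuncDepSet, groupingCols opt.ColSet, required *props.OrderingChoice,
) bool {
	closure := fds.ComputeClosure(groupingCols)
	return required.CanProjectCols(closure)
}
```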
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373)
Interesting idea. I looked into it a bit. It looks like groupByBuildChildReqOrdering only supports finding a required ordering which intersects with the internal ordering of the GroupByExpr:

```go
result = result.Intersection(&groupBy.Ordering)
```

These orderings are generated in GenerateStreamingGroupBy, which calls DeriveInterestingOrderings, which looks at indexes or other operations that result in an ordering. The purpose seems to be to reuse an ordering which is already there and just happens to be beneficial, instead of looking at the exact ordering required by the parent. So, even if groupByBuildChildReqOrdering is taught to find more orderings compatible with a given grouping, we'd still rely on DeriveInterestingOrderings to build the groupBy.Ordering that we want. The current fix uses a direct hint from the required ordering in place of DeriveInterestingOrderings, so it may introduce a sort operation instead of relying on indexes.
Maybe your idea works differently. Perhaps you could say some more about it in case I didn't get the gist of it.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373)
For the case when no internal ordering is specified (which I think is the one we care about), I think the original/canonical group-by should have an empty ordering. And every ordering intersects with the empty ordering, so that check will always succeed for the canonical group-by with no internal ordering. Once the ordering gets propagated to the input, I'd expect DeriveInterestingOrderings to pick it up and generate the streaming group-by.
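A small sketch of the intersection behavior being discussed, under the assumption that props.OrderingChoice.Intersection works as described; buildChildReq is an illustrative name, not the real groupByBuildChildReqOrdering:

```go
package xform

import "github.com/cockroachdb/cockroach/pkg/sql/opt/props"

// buildChildReq illustrates why an empty internal ordering never blocks the
// check: intersecting with the unconstrained ordering leaves the required
// ordering unchanged, so it passes straight through to the input.
func buildChildReq(required, internal *props.OrderingChoice) props.OrderingChoice {
	if internal.Any() {
		// Canonical group-by with no internal ordering: every ordering
		// intersects with the empty ordering.
		return required.Copy()
	}
	// Otherwise the child requirement must satisfy both orderings at once.
	return required.Intersection(internal)
}
```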
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373)
I see. o.shouldExplore only triggers exploration when the required ordering is empty. I made a quick change to groupByBuildChildReqOrdering but did not see a new ordering when GenerateStreamingGroupBy is called. Anyway, these are some good ideas; maybe you could dump them in an issue. For now, the limited scope of this rewrite rule is more targeted, which should make it safer than a more general change that would get exercised by many more queries, even if it means the fix requires more lines of code.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373)
Hm, yeah, it's not as simple to make it work as I thought. Had a couple nits, but this looks good.
Reviewed 9 of 9 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @msirek and @rharding6373)
pkg/sql/opt/xform/groupby_funcs.go
line 201 at r1 (raw file):
```go
}
limitExpr, ok := limitRel.(*memo.LimitExpr)
if !ok {
```

[nit] These three conditions are all guaranteed by the optgen code, right? I think it's OK to remove them. Also, it should be possible to change the type of limitRel to *memo.LimitExpr to avoid the assertion.
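A sketch of the suggested change, assuming the memo.LimitExpr API; the helper name is illustrative. The ordering check itself remains, since a Limit is not guaranteed to carry an ordering:

```go
package xform

import (
	"github.com/cockroachdb/cockroach/pkg/sql/opt/memo"
	"github.com/cockroachdb/cockroach/pkg/sql/opt/props"
)

// orderingFromLimit takes *memo.LimitExpr directly, so no type assertion is
// needed. The ordering check stays: a Limit may have no ordering to hint with.
func orderingFromLimit(limit *memo.LimitExpr) (props.OrderingChoice, bool) {
	if limit.Ordering.Any() {
		return props.OrderingChoice{}, false
	}
	return limit.Ordering, true
}
```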
pkg/sql/opt/xform/groupby_funcs.go
line 220 at r1 (raw file):
```go
	return
}
groupingCols = groupingCols.Union(orderingColsInClosure)
```

[nit] I think all of the above logic could be pulled out into the optgen match pattern, which we generally prefer over custom logic (though ComputeClosure needs to be added to CustomFuncs; a sketch of such a method follows below).
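What "adding ComputeClosure to CustomFuncs" could look like, as a minimal hedged sketch; the real method, if added, might take its inputs differently:

```go
package xform

import (
	"github.com/cockroachdb/cockroach/pkg/sql/opt"
	"github.com/cockroachdb/cockroach/pkg/sql/opt/props"
)

// ComputeClosure would expose functional-dependency closure computation to
// optgen match patterns by thinly wrapping props.FuncDepSet.ComputeClosure.
func (c *CustomFuncs) ComputeClosure(fds *props.FuncDepSet, cols opt.ColSet) opt.ColSet {
	return fds.ComputeClosure(cols)
}
```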
pkg/sql/opt/xform/groupby_funcs.go
line 247 at r1 (raw file):
```go
// construction. We are just adding back in any ordering columns which overlap
// with grouping columns in order to generate a better plan.
disabledRules.Add(int(opt.ReduceGroupingCols))
```

Instead of doing this, maybe it would be better to just construct the group-by without going through the factory, like you're doing with the limit below? It seems like we just want a copy of the matched expression with a couple of grouping columns added.
Impressive! LGTM once Drew's comments are addressed.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @msirek)
Force-pushed a2875b3 to dbfdba1
Thanks. I found out that any added grouping columns have to be removed from the Aggregations, because execbuilder logic expects a given column to be in the grouping columns or the aggregations, but not both (see the sketch below). I won't merge for a day or so in case you have any more comments.
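A minimal sketch of that adjustment, assuming memo.AggregationsExpr and illustrative names:

```go
package xform

import (
	"github.com/cockroachdb/cockroach/pkg/sql/opt"
	"github.com/cockroachdb/cockroach/pkg/sql/opt/memo"
)

// dropAggsForGroupingCols removes aggregations whose output column was just
// promoted to a grouping column, since execbuilder expects each column to be
// in the grouping columns or the aggregations, but not both.
func dropAggsForGroupingCols(
	aggs memo.AggregationsExpr, newGroupingCols opt.ColSet,
) memo.AggregationsExpr {
	result := make(memo.AggregationsExpr, 0, len(aggs))
	for i := range aggs {
		if !newGroupingCols.Contains(aggs[i].Col) {
			result = append(result, aggs[i])
		}
	}
	return result
}
```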
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball)
pkg/sql/opt/xform/groupby_funcs.go
line 201 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] These three conditions are all guaranteed by the optgen code, right? I think it's OK to remove them. Also, it should be possible to change the type of limitRel to *memo.LimitExpr to avoid the assertion.
Made the changes, except it's not guaranteed that there is an ordering in the limit expression, so I left that check in.
pkg/sql/opt/xform/groupby_funcs.go
line 220 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] I think all of the above logic could be pulled out into the optgen match pattern, which we generally prefer over custom logic (though ComputeClosure needs to be added to CustomFuncs).

Added function GroupingColsClosureOverlappingOrdering, called from the match pattern.
pkg/sql/opt/xform/groupby_funcs.go
line 247 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Instead of doing this, maybe it would be better to just construct the groupby without going through the factory, like you're doing with the limit below? Since, it seems like we just want a copy of the matched expression with a couple grouping columns added.
I discovered previously that this won't work. The input to LimitExpr is already marked as fullyOptimized, so adding a new GroupByExpr to that memo group has no effect: we just end up picking the previously found best-cost expression instead of the new expression. I think this is the reason why we always see the pattern of constructing everything below the top-level expression, with only that top expression added to a memo group, because that group is actively being explored.
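A hedged sketch of that construction pattern; AddLimitToGroup is part of the memo's generated API, and the surrounding names (grp, newGroupBy, and so on) are illustrative:

```go
package xform

import "github.com/cockroachdb/cockroach/pkg/sql/opt/memo"

// addHintedLimit shows the shape of the pattern: the subtree below the
// top-level expression is freshly constructed (so it is costed anew), and
// only the top-level Limit is interned into the group being explored.
func (c *CustomFuncs) addHintedLimit(
	newGroupBy memo.RelExpr, limit *memo.LimitExpr, grp memo.RelExpr,
) {
	newLimit := &memo.LimitExpr{
		Input:    newGroupBy, // fresh subtree, not an already-optimized group
		Limit:    limit.Limit,
		Ordering: limit.Ordering,
	}
	c.e.mem.AddLimitToGroup(newLimit, grp)
}
```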
Reviewed all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @msirek)
pkg/sql/opt/xform/groupby_funcs.go
line 247 at r1 (raw file):
Previously, msirek (Mark Sirek) wrote…
I discovered previously that this won't work. The input to LimitExpr is already marked as fullyOptimized, so adding a new GroupByExpr to that memo group has no effect: we just end up picking the previously found best-cost expression instead of the new expression. I think this is the reason why we always see the pattern of constructing everything below the top-level expression, with only that top expression added to a memo group, because that group is actively being explored.
I see, that makes sense. Thanks for explaining.
TFTRs!
bors r=DrewKimball,rharding6373
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball)
Build failed (retrying...)
Force-pushed dbfdba1 to fdaaadd
Canceled.
Force-pushed fdaaadd to 04eb457
Very cool! Sorry for the late drive-by - I left a few optional nits.
Reviewed 1 of 7 files at r2, 1 of 2 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @msirek)
pkg/sql/opt/xform/groupby_funcs.go
line 221 at r4 (raw file):
```go
// `GroupingColsClosureOverlappingOrdering`, which also produces the
// `newOrdering`. Argument `private` is expected to be a canonical group-by.
func (c *CustomFuncs) GenerateStreamingGroupByLimitOrderingHint(
```

nit: "hint" may be confusing because hints usually mean explicit query hints; what is the intended meaning here?
pkg/sql/opt/xform/groupby_funcs.go
line 281 at r4 (raw file):
```go
).(memo.RelExpr)
}

var disabledRules intsets.Fast
```

nit: FYI, you can create this set in one line with intsets.MakeFast(int(opt.ReduceGroupingCols)).
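For illustration, the two equivalent forms side by side, assuming the intsets package from pkg/util/intsets:

```go
package xform

import (
	"github.com/cockroachdb/cockroach/pkg/sql/opt"
	"github.com/cockroachdb/cockroach/pkg/util/intsets"
)

// disabledRulesTwoLine builds the set the long way: declare, then Add.
func disabledRulesTwoLine() intsets.Fast {
	var disabledRules intsets.Fast
	disabledRules.Add(int(opt.ReduceGroupingCols))
	return disabledRules
}

// disabledRulesOneLine is the reviewer's suggested one-liner.
func disabledRulesOneLine() intsets.Fast {
	return intsets.MakeFast(int(opt.ReduceGroupingCols))
}
```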
pkg/sql/opt/xform/rules/limit.opt
line 190 at r4 (raw file):
```
# GenerateStreamingGroupByLimitOrderingHint generates streaming group-by and
# distinct-on aggregations with an ordering matching the ordering specified in
# the Limit Op. The goal is to eliminate the need for a TopK operation.
```

nit: an example here might be helpful, for example a diagram showing a before and after subtree. It's also not obvious to me why eliminating the TopK is a good thing; it's for cases when a plan with a TopK is actually more expensive than a Sort, is that correct?
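For reference, a schematic before/after subtree of the kind the rule comment could include, distilled from the Q18 plans earlier in this conversation (column names elided):

```
Before (TopK over hash group-by):         After (limit over streaming group-by):

  top-k (k=100, order on ORDER BY cols)     limit (count=100)
   └── group (hash)                          └── group (streaming)
        └── unordered input                       └── input sorted on ORDER BY cols
```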
pkg/sql/opt/xform/testdata/rules/limit
line 2582 at r4 (raw file):
```
└── a:1

# Regression Test for #93410
```

nit: I'd label and organize this as we do for tests of other rules, since it's not really a regression; it's a new feature. Like:

```
# --------------------------------------------------
# GenerateStreamingGroupByLimitOrderingHint
# --------------------------------------------------
```
pkg/sql/opt/xform/testdata/rules/limit
line 2607 at r4 (raw file):
```sql
INDEX t93410_col1_col2_col13_col15_idx (col1 ASC, col2 ASC, col13 ASC, col15 ASC),
UNIQUE INDEX t93410_col1_col2_col5_col9_key (col1 ASC, col2 ASC, col5 ASC, col9 ASC),
UNIQUE INDEX t93410_col1_col2_col5_key (col1 ASC, col2 ASC, col5 ASC)
```

It's a lot of work to simplify these test cases that were derived from TPC benchmarks, but we might be thankful for that later when we come back to update the rule. One strategy that can be effective is to write minimal, "unit-y" tests in the xform tests that exercise all the code paths of the rule, and then a "regression" test as an execbuilder test to ensure that the specific case you're trying to fix is covered. I know I'm being nitpicky here, so it's up to you whether you want to do that or not.
pkg/sql/opt/xform/testdata/rules/limit
line 3046 at r4 (raw file):
```
└── 20

# End Regression Test for #93410
```

Is it possible to add tests for the other non-matching cases of the rule, like a negative LIMIT, or when the new aggregate output columns don't match the LIMIT's ORDER BY columns?
bors retry

Build succeeded.

If this PR is backported, be sure to backport #94112 as well.
@msirek, in which version will this fix be available?
@ikawalec This improvement will be included in v23.1.0, which is scheduled to be released in May.
@ikawalec The soonest this fix will be available is towards the end of January in version 22.2.3.
The setting may also be enabled for all users in a given ROLE via ALTER ROLE.
Thanks for the update @msirek. I will give it a try once it's released.
Fixes #93410
A query with a grouped aggregation, a LIMIT and an ORDER BY may not always explore the best-cost query plan.
Due to the existence of unique constraints on a table, the set of grouping columns may be reduced during normalization via the rule ReduceGroupingCols such that it no longer includes columns present in the ORDER BY clause. This eliminates possibly cheap plans from consideration. For example, if the input to the aggregation is a lookup join, it may be cheaper to sort the input to the lookup join on the ORDER BY columns when they overlap with the grouping columns, so that a streaming group-by with no TopK operator can be used and a full scan of the inputs to the join is avoided.
This fix adds a new exploration rule which ensures that a grouped aggregation with a LIMIT and ORDER BY clause considers using streaming group-by with no TopK when possible.
Release note (bug fix): For join queries on tables with unique constraints that use LIMIT, GROUP BY, and ORDER BY clauses, the optimizer now considers a streaming group-by with no TopK operation when possible. This is often the most efficient query plan.
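A condensed sketch of the rule's overall flow, with hypothetical names; the actual implementation lives in pkg/sql/opt/xform/groupby_funcs.go and rules/limit.opt:

```go
package xform

import (
	"github.com/cockroachdb/cockroach/pkg/sql/opt"
	"github.com/cockroachdb/cockroach/pkg/sql/opt/memo"
	"github.com/cockroachdb/cockroach/pkg/sql/opt/props"
)

// tryStreamingGroupByWithLimitHint sketches the exploration step: if the
// LIMIT's ORDER BY columns fall inside the functional-dependency closure of
// the (reduced) grouping columns, re-add them to the grouping columns and
// use the required ordering as the group-by's internal ordering, so a
// streaming group-by can replace the TopK.
func tryStreamingGroupByWithLimitHint(
	groupBy *memo.GroupByExpr, limit *memo.LimitExpr,
) (newGroupingCols opt.ColSet, newOrdering props.OrderingChoice, ok bool) {
	fds := &groupBy.Relational().FuncDeps
	closure := fds.ComputeClosure(groupBy.GroupingCols)
	if !limit.Ordering.CanProjectCols(closure) {
		// The ordering references columns outside the closure; no hint.
		return opt.ColSet{}, props.OrderingChoice{}, false
	}
	newGroupingCols = groupBy.GroupingCols.Union(limit.Ordering.ColSet())
	newOrdering = limit.Ordering.Copy()
	return newGroupingCols, newOrdering, true
}
```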