
sql: fix the subquery limit optimization #34801

Merged 1 commit into cockroachdb:master on Feb 13, 2019

Conversation


@knz knz commented Feb 11, 2019

Found this while investigating #32054.

A while ago the HP was equipped with an optimization: when a subquery
is planned for EXISTS or "max 1 row" (scalar context), a LIMIT is
applied on its data source. This ensures that the data source does not
fetch more rows than strictly necessary to determine the subquery
result:

  • for EXISTS, only 0 or 1 rows are needed to decide the boolean;
  • for scalar contexts, only 0, 1, or 2 rows are needed to decide the
    outcome: 0 or 2 rows yield an error, exactly 1 yields a valid result.

This optimization was temporarily broken for the scalar case when
`max1row` was introduced (when local exec was subsumed by distsql):
the limit remained "on top" of `max1row` and was not propagated down.
This patch places it "under" `max1row` so it gets propagated again.

Release note (performance improvement): subqueries used with EXISTS or
as a scalar value now avoid fetching more rows than needed to decide
the outcome.

@knz knz requested review from a team February 11, 2019 22:19
@cockroach-teamcity (Member)

This change is Reviewable


knz commented Feb 11, 2019

@RaduBerinde @andy-kimball can you make a suggestion as to how/where I can introduce the same limit in the CBO? I kinda half-guess this can be introduced super-early (as soon as optbuilder). Is my guess correct?

@RaduBerinde (Member)

Yeah, I would try to do it in optbuilder, right before ConstructMax1Row

@knz knz force-pushed the 20190211-sq-limit branch from 0f248e8 to 58c4cac on February 12, 2019 00:29
@knz knz requested a review from a team as a code owner February 12, 2019 00:29
@knz knz force-pushed the 20190211-sq-limit branch from 58c4cac to eb29d13 on February 12, 2019 10:19

knz commented Feb 12, 2019

@RaduBerinde @andy-kimball I have introduced the rules, with a property to prevent double application as you recommended.

I am happy to see that the limit propagates, however I found this interesting case:

CREATE TABLE abc (a INT PRIMARY KEY, b INT, c INT);

EXPLAIN (VERBOSE) SELECT * FROM abc WHERE a = (SELECT max(a) FROM abc WHERE EXISTS(SELECT * FROM abc WHERE c=a+3));
----
[...]
 ├── subquery        ·             ·                                                                            (a, b, c)  ·
 │    │              id            @S1                                                                          ·          ·
 │    │              original sql  EXISTS (SELECT * FROM abc WHERE c = (a + 3))                                 ·          ·
 │    │              exec mode     exists                                                                       ·          ·
 │    └── limit      ·             ·                                                                            (a, b, c)  ·
 │         │         count         1                                                                            ·          ·
 │         └── scan  ·             ·                                                                            (a, b, c)  ·
 │                   table         abc@primary                                                                  ·          ·
 │                   spans         ALL                                                                          ·          ·
 │                   filter        c = (a + 3)                                                                  ·          ·
[...]

Notice how the limit does not get propagated down in that case.

I think the reason is the ordering of the rules:

  1. the initial form of the tree is exists -> select -> scan
  2. the new rule applies, and a limit gets introduced: exists -> limit -> select -> scan
  3. the limit elimination rules apply, but in the general case a limit cannot be pushed down through a select
  4. the select elimination rules apply (filter push down)
  5. the select is eliminated and the limit becomes adjacent to a scan, but at that point the limit rules have already run and are not run again.

Of course I can't simply swap the select and limit elimination rules, because then we run into the opposite problem.

What do you think? Would this warrant moving the limit and select elimination to xforms?

@knz knz force-pushed the 20190211-sq-limit branch from eb29d13 to 14cce3d on February 12, 2019 10:35
@RaduBerinde (Member)

There are existing cases like that already (see the limit file in the execbuilder testdata). The reason is that opt doesn't have a combined scan+select operator. The filter gets pushed into the scanNode during execbuild. We could probably have special code to push the limit in too, but there is little benefit - it's all the same once it gets converted to distsql.


knz commented Feb 12, 2019

We could probably have special code to push the limit in too, but there is little benefit - it's all the same when it gets converted to distsql.

Does the physical planner also push limits down to scans?

@andy-kimball andy-kimball (Contributor) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, and @RaduBerinde)


pkg/sql/opt/norm/custom_funcs.go, line 1256 at r1 (raw file):

// applying twice.
func (c *CustomFuncs) SetSubqueryLimited(sub *memo.SubqueryPrivate) *memo.SubqueryPrivate {
	sub.SubqueryLimited = true

We never mutate private structs once they've been constructed, because it interferes with interning. It's similar to updating an object's key after putting it into a hash table. opt trees must always be immutable after construction.

Instead, follow this pattern (used elsewhere):

func (c *CustomFuncs) MakeLimitedSubquery(sub *memo.SubqueryPrivate) *memo.SubqueryPrivate {
	newSub := *sub
	newSub.SubqueryLimited = true
	return &newSub
}

pkg/sql/opt/norm/rules/limit.opt, line 16 at r1 (raw file):

# operator is supported - in that case it can be definitely worthwhile
# pushing down a LIMIT 2 to limit the amount of work done on every row.)
[Max1RowLimitScan, Normalize]

I think this rule is unnecessary and we should remove it. If there is more than 1 row, then it's an error, so there's no need to optimize that code path. In the non-error case, the returned result will always have at most 1 row. Adding a Limit may even slow down execution a bit.


pkg/sql/opt/norm/rules/scalar.opt, line 178 at r1 (raw file):

# operator is supported - in that case it can be definitely worthwhile
# pushing down a LIMIT 1 to limit the amount of work done on every row.)
[ExistsLimit, Normalize]

NIT: Call this rule IntroduceExistsLimit, or similar, since all our rule names begin with a verb.

Also, can you add commentary explaining how/why you're marking the private to avoid double pushdown?


pkg/sql/opt/norm/rules/scalar.opt, line 183 at r1 (raw file):

    $subqueryPrivate:* & ^(IsSubqueryLimited $subqueryPrivate))
=>
((OpName)

NIT: Our convention is to only use (OpName) when there's more than one choice. So this should just be Exists.


pkg/sql/opt/ops/scalar.opt, line 40 at r1 (raw file):

[Private]
define SubqueryPrivate {
	OriginalExpr Subquery

NIT: Do you mind converting spaces to tabs in this declaration since you're touching this anyway?


pkg/sql/opt/ops/scalar.opt, line 51 at r1 (raw file):

	Cmp Operator

    SubqueryLimited bool

This could use a comment, since it's not obvious what it's for. This establishes a new pattern that we want to have documented.

NIT: Call this WasLimited, and we can use that naming as a convention if we follow this pattern elsewhere.

@andy-kimball andy-kimball (Contributor) left a comment

Thanks for diving into this. I'm coming around to the idea of using the "mark bit" to avoid double pushdown. In more complex cases, RuleProps are needed instead, but in simple cases like this, where we always want to push down and always want to do it exactly once, this is an easy way to do that.


@RaduBerinde (Member)

does the physical planner also push limits down to scans?

The physical planner effectively pushes limits into any processor (all processors support Limit as part of post-processing).

@knz knz force-pushed the 20190211-sq-limit branch from 14cce3d to 9ae1253 on February 12, 2019 16:11
@knz knz (Contributor Author) left a comment



pkg/sql/opt/norm/custom_funcs.go, line 1256 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

We never mutate private structs once they've been constructed, because it interferes with interning. It's similar to updating an object's key after putting it into a hash table. opt trees must always be immutable after construction.

Instead, follow this pattern (used elsewhere):

func (c *CustomFuncs) MakeLimitedSubquery(sub *memo.SubqueryPrivate) *memo.SubqueryPrivate {
	newSub := *sub
	newSub.SubqueryLimited = true
	return &newSub
}

Done.


pkg/sql/opt/norm/rules/limit.opt, line 16 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I think this rule is unnecessary and we should remove. If there is more than 1 row, then it's an error. So there's no need to optimize that code path. In the non-error case, the returned result will always have at most 1 row. Adding a Limit may even slow down execution a bit.

Done.


pkg/sql/opt/norm/rules/scalar.opt, line 183 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

NIT: Our convention is to only use (OpName) when there's more than one choice. So this should just be Exists.

Done.


pkg/sql/opt/ops/scalar.opt, line 40 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

NIT: Do you mind converting spaces to tabs in this declaration since you're touching this anyway?

Done.


pkg/sql/opt/ops/scalar.opt, line 51 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

This could use comment, since it's not obvious what it's for. This establishes a new pattern that we want to have documented.

NIT: Call this WasLimited, and we can use that naming as a convention if we follow this pattern elsewhere.

Done.

@knz knz force-pushed the 20190211-sq-limit branch 2 times, most recently from 36c9c00 to 71a772e on February 12, 2019 16:15
@andy-kimball andy-kimball (Contributor) left a comment



pkg/sql/opt/norm/rules/limit.opt, line 16 at r1 (raw file):

Previously, knz (kena) wrote…

Done.

NIT: You can just revert all changes to this file.


pkg/sql/opt/norm/rules/scalar.opt, line 183 at r1 (raw file):

Previously, knz (kena) wrote…

Done.

NIT: The trailing paren should go on a separate line. Don't worry about this if you're already passing CI, though.


pkg/sql/opt/ops/scalar.opt, line 51 at r1 (raw file):

Previously, knz (kena) wrote…

Done.

NIT: ndicates => indicates

@knz knz force-pushed the 20190211-sq-limit branch from 71a772e to eab6b3f on February 13, 2019 09:45
@knz knz (Contributor Author) left a comment



pkg/sql/opt/norm/rules/limit.opt, line 16 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

NIT: You can just revert all changes to this file.

Done.


pkg/sql/opt/ops/scalar.opt, line 51 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

NIT: ndicates => indicates

Done.

@knz knz force-pushed the 20190211-sq-limit branch from eab6b3f to 74c4d7d on February 13, 2019 09:47
@knz knz (Contributor Author) left a comment



pkg/sql/opt/norm/rules/scalar.opt, line 183 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

NIT: The trailing paren should go on a separate line. Don't worry about this if you're already passing CI, though.

Done.

@knz knz force-pushed the 20190211-sq-limit branch from 74c4d7d to c399c88 on February 13, 2019 09:48

knz commented Feb 13, 2019

TFYRs!

bors r+


craig bot commented Feb 13, 2019

Build failed

@knz knz force-pushed the 20190211-sq-limit branch from c399c88 to 462d3aa on February 13, 2019 10:41

knz commented Feb 13, 2019

That was a merge skew with a concurrently introduced test. Retrying.

bors r+

craig bot pushed a commit that referenced this pull request Feb 13, 2019
34801: sql: fix the subquery limit optimization r=knz a=knz

Co-authored-by: Raphael 'kena' Poss <[email protected]>

craig bot commented Feb 13, 2019

Build succeeded

@craig craig bot merged commit 462d3aa into cockroachdb:master Feb 13, 2019
@knz knz deleted the 20190211-sq-limit branch February 14, 2019 10:15