opt, sql: fix type inference of TypeCheck for subqueries #37578

Merged 1 commit on May 20, 2019

@rytaft rytaft commented May 17, 2019

Prior to this commit, the optimizer was not correctly inferring the types of
columns in subqueries for expressions of the form `scalar IN (subquery)`.
This was due to two problems which have now been fixed:

1. The subquery was built as a relational expression before the desired types
   were known. Now the subquery build is delayed until `TypeCheck` is called for
   the first time.

2. For subqueries on the right side of an `IN` expression, the desired type
   passed into `TypeCheck` was `AnyTuple`. This resulted in an error later on in
   `typeCheckSubqueryWithIn`, which checks to make sure the type of the subquery
   is `tuple{T}` where `T` is the type of the left expression. Now the desired
   type passed into `TypeCheck` is `tuple{T}`.
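
To make the two fixes concrete, here is a minimal, self-contained sketch of the idea. It is not the CockroachDB code: `Type`, `Subquery`, `MakeTuple`, and `typeCheckIn` are hypothetical toy stand-ins, and the subquery is assumed to project a single untyped placeholder column (as in `SELECT $1`). It only illustrates why deferring the build until the first type check, and passing `tuple{T}` instead of a wildcard tuple, lets the column type be inferred.

```go
package main

import (
	"fmt"
	"strings"
)

// Type is a toy stand-in for a SQL type (hypothetical, not types.T).
type Type struct {
	Name     string
	Contents []Type // element types; empty for non-tuples and for the wildcard tuple
}

var (
	Int     = Type{Name: "int"}
	Unknown = Type{Name: "unknown"}
	// AnyTuple mimics a wildcard tuple that carries no element types.
	AnyTuple = Type{Name: "tuple"}
)

// MakeTuple builds tuple{T...} from the given element types.
func MakeTuple(contents ...Type) Type {
	return Type{Name: "tuple", Contents: contents}
}

func (t Type) String() string {
	if len(t.Contents) == 0 {
		return t.Name
	}
	parts := make([]string, len(t.Contents))
	for i, c := range t.Contents {
		parts[i] = c.String()
	}
	return t.Name + "{" + strings.Join(parts, ", ") + "}"
}

// Subquery is a hypothetical node for a subquery that projects one
// untyped placeholder column.
type Subquery struct {
	built   bool
	colType Type
}

// TypeCheck builds the subquery lazily, on the first call, so that the
// desired type is available when the placeholder column is typed.
func (s *Subquery) TypeCheck(desired Type) Type {
	if !s.built {
		s.colType = Unknown
		if len(desired.Contents) == 1 {
			s.colType = desired.Contents[0] // infer the column type from tuple{T}
		}
		s.built = true
	}
	return MakeTuple(s.colType)
}

// typeCheckIn checks `left IN (subquery)`: the subquery must have type
// tuple{T}, where T is the type of the left expression.
func typeCheckIn(left Type, sub *Subquery, desired Type) error {
	got, want := sub.TypeCheck(desired), MakeTuple(left)
	if got.String() != want.String() {
		return fmt.Errorf("unsupported comparison: %s IN %s", left, got)
	}
	return nil
}

func main() {
	// Old behavior: a wildcard tuple gives the subquery nothing to infer from.
	fmt.Println(typeCheckIn(Int, &Subquery{}, AnyTuple))
	// New behavior: tuple{T} is passed down, so the column is typed as T.
	fmt.Println(typeCheckIn(Int, &Subquery{}, MakeTuple(Int)))
}
```

In this sketch the first call returns an error because the column stays `unknown`, while the second succeeds because the desired `tuple{int}` reaches the subquery before it is built.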

Note that this commit only fixes type inference for the optimizer. It is still
broken in the heuristic planner.

Fixes #37263
Fixes #14554

Release note (bug fix): Fixed type inference of columns in subqueries for
some expressions of the form `scalar IN (subquery)`.

@rytaft rytaft requested a review from a team as a code owner May 17, 2019 21:30
@rytaft rytaft requested a review from a team May 17, 2019 21:30


@RaduBerinde RaduBerinde left a comment

Nice fix! :lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, @RaduBerinde, and @rytaft)


pkg/sql/logictest/testdata/logic_test/subquery-opt, line 21 at r1 (raw file):

query ITIIIIIIT
SELECT t.oid, t.typname, t.typsend, t.typreceive, t.typoutput, t.typinput,
	       t.typelem, coalesce(r.rngsubtype, 0), ARRAY (

Is this array needed? (I expect the problem to come from the array below?)


pkg/sql/logictest/testdata/logic_test/subquery-opt, line 28 at r1 (raw file):

	)
	FROM pg_type AS t
	LEFT JOIN pg_range AS r ON r.rngtypid = t.oid

do we need this join to repro?


pkg/sql/opt/optbuilder/subquery.go, line 61 at r1 (raw file):

	// number of columns and is used when the normal type checking machinery will
	// verify that the correct number of columns is returned.
	desiredColumns int

[nit] numDesiredColumns or desiredNumColumns


pkg/sql/sem/tree/type_check.go, line 1740 at r1 (raw file):

		desired := types.MakeTuple([]types.T{*typ})
		typedRight, err := foldedRight.TypeCheck(ctx, desired)
		if switched {

How is this possible with IN? (if it was, seems like we should be doing this after the call below)


@rytaft rytaft left a comment

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @justinj, and @RaduBerinde)


pkg/sql/logictest/testdata/logic_test/subquery-opt, line 21 at r1 (raw file):

Previously, RaduBerinde wrote…

Is this array needed? (I expect the problem to come from the array below?)

Nope, I had just copied the query directly from the issue. Removed.


pkg/sql/logictest/testdata/logic_test/subquery-opt, line 28 at r1 (raw file):

Previously, RaduBerinde wrote…

do we need this join to repro?

Nope, removed.


pkg/sql/opt/optbuilder/subquery.go, line 61 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] numDesiredColumns or desiredNumColumns

Done.


pkg/sql/sem/tree/type_check.go, line 1740 at r1 (raw file):

Previously, RaduBerinde wrote…

How is this possible with IN? (if it was, seems like we should be doing this after the call below)

Good point - it's not possible. Removed.


@RaduBerinde RaduBerinde left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @justinj)


rytaft commented May 20, 2019

bors r+

craig bot pushed a commit that referenced this pull request May 20, 2019
37506: storage: reconcile manual splitting with automatic merging r=jeffrey-xiao a=jeffrey-xiao

Follows the steps outlined in #37487 to reconcile manual splitting with automatic merging. This PR includes the changes needed at the storage layer. The sticky bit indicating that a range is manually split is added to the range descriptor.

37558: docs/tla-plus: add timestamp refreshes to ParallelCommits spec r=nvanbenschoten a=nvanbenschoten

This commit adds transaction timestamp refreshes to the PlusCal model
for parallel commits. This allows the committing transaction to bump
its timestamp without a required epoch bump.

This completes the parallel commits formal specification.

37578: opt, sql: fix type inference of TypeCheck for subqueries r=rytaft a=rytaft

Co-authored-by: Jeffrey Xiao
Co-authored-by: Nathan VanBenschoten
Co-authored-by: Rebecca Taft

craig bot commented May 20, 2019

Build succeeded
