sql: add the ability to access a column from the results of an SRF #24832

BramGruneir · 2018-04-16T17:50:01Z

sql: add the ability to access a column from the results of an SRF

To be compatible with postgres, specifically to be able to handle #16971
correctly. There is a need to be able to access the results of a Set Returning
Function (SRF) using a . accessor.

This commit does 3 things.
Firstly, it adds the appropriate grammer.

Secondly, it rewrites the rewriteSRF function to handle named column accessors
correctly.

Before this change, the only way to decompose an SRF into columns was to set it
it as the data source in the FROM clause, like so:

SELECT * FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']);
+---+---+
| x | n |
+---+---+
| c | 1 |
| b | 2 |
| a | 3 |
+---+---+

SELECT x FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']);
+---+
| x |
+---+
| c |
| b |
| a |
+---+

SELECT n FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']);
+---+
| n |
+---+
| 1 |
| 2 |
| 3 |
+---+

With this commit, the following are now valid syntax:

SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).*;
+---+---+
| x | n |
+---+---+
| c | 1 |
| b | 2 |
| a | 3 |
+---+---+

SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).x;
+---+
| x |
+---+
| c |
| b |
| a |
+---+

SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).n;
+---+
| n |
+---+
| 1 |
| 2 |
| 3 |
+---+

Thirdly, this changes the behaviour of the resulting columns from SRFs. If
there is more than one named element in the resulting tuple, the column for the
tuple will now be named for all elements:

SELECT information_schema._pg_expandarray(ARRAY[3, 2, 1])

Will produce a tuple column like named
("information_schema._pg_expandarray".x, "information_schema._pg_expandarray".n).

The bahaviour does deviate from postgres, which simply names the column after
the name of the function, in this case _pg_expandarray.

Here's another example of this difference:

SELECT 'a', pg_get_keywords(), 'c' LIMIT 1;

In Postgres:
 ?column? |   pg_get_keywords    | ?column?
----------+----------------------+----------
 a        | (abort,U,unreserved) | c

In Cockroach:
+-----+--------------------------------+-----+
| 'a' |     (pg_get_keywords.word,     | 'c' |
|     |    pg_get_keywords.catcode,    |     |
|     |    pg_get_keywords.catdesc)    |     |
+-----+--------------------------------+-----+
| a   | ('abort','U','unreserved')     | c   |
+-----+--------------------------------+-----+

Release note (sql change): Set Returning Functions (SRF) can now be accessed
using (SRF).x where x is the name of a column returned form the SRF or a
*. For example,
SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).x and
SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).* are both
now valid syntax.
Also, the naming of the resulting columns from SRFs has been updated to provide
more information about the resulting tuple.

Co-authored-by: Raphael 'kena' Poss [email protected]

cockroach-teamcity · 2018-04-16T17:50:09Z

This change is

BramGruneir · 2018-04-16T17:51:20Z

@knz Here is my WIP I was talking about. Take a look at the final test case the new logic tests. The one that is commented out. This is what I still need to address.

knz · 2018-04-17T15:30:27Z

This is good work and a clear step forward.

Also this PR has helped us (at least me) understand/uncover the additional semantics described in #24866, which suggests there is more work needed before fully solving #16971. I recommend not going there for the time being, and focusing on one step at a time (this one).

Reviewed 10 of 10 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.

pkg/sql/render.go, line 554 at r1 (raw file):

func (v *srfExtractionVisitor) VisitPost(expr tree.Expr) tree.Expr {
	switch t := expr.(type) {
	case *tree.ColumnAccessExpr:

I fear that this logic is incomplete. I have experimented and found an extension of this code that works better. Let's chat about it.

pkg/sql/logictest/testdata/logic_test/generators, line 492 at r1 (raw file):

# This test case should work, but right now it doesn't.
# query I colnames
# SELECT (i.keys).n FROM (SELECT information_schema._pg_expandarray(ARRAY[3,2,1]) AS keys) AS i;

This test is exercising a totally unrelated feature, unfortunately. Please add a TODO and refer to this: #24866

Comments from Reviewable

knz · 2018-04-17T18:01:09Z

Reviewed 6 of 7 files at r2, 2 of 2 files at r3.
Review status: all files reviewed at latest revision, 2 unresolved discussions.

pkg/sql/logictest/testdata/logic_test/generators, line 109 at r3 (raw file):

generator  ·  ·

query error unimplemented: cannot use multiple set-generating functions in a single SELECT clause

maybe add a TODO here with a reference to #20511.

pkg/sql/logictest/testdata/logic_test/generators, line 235 at r3 (raw file):

SELECT 'a', pg_get_keywords(), 'c' LIMIT 1
----
'a'  (pg_get_keywords.word, pg_get_keywords.catcode, pg_get_keywords.catdesc)              'c'

we may or may not want to keep the change in the column label. I haven't yet figured out how to restore the previous behavior.

Comments from Reviewable

BramGruneir · 2018-04-17T18:01:33Z

pkg/sql/logictest/testdata/logic_test/generators, line 235 at r3 (raw file):

Previously, knz (kena) wrote…

we may or may not want to keep the change in the column label. I haven't yet figured out how to restore the previous behavior.

I like this new behaviour.

Comments from Reviewable

knz · 2018-04-17T18:03:09Z

The commit message (and the release note) need to be extended to showcase some examples.

Review status: all files reviewed at latest revision, 4 unresolved discussions.

Comments from Reviewable

BramGruneir · 2018-04-18T15:00:44Z

Ok, this is ready for a proper review. There one leftover failing logic test that I'm looking into.

@jordanlewis you've touched this code in the past, can you take a quick look? I think it would be a good idea to get another set of eyes on it.

@rytaft there is a sql language change here, so I wanted someone on the optimizer team to take note.

BramGruneir · 2018-04-18T15:02:16Z

I also updated the PR message with the updated commit message.

BramGruneir · 2018-04-18T15:43:53Z

Found and fixed that failing logic test. This PR is ready for a final review.

Review status: 6 of 12 files reviewed at latest revision, 4 unresolved discussions.

pkg/sql/logictest/testdata/logic_test/generators, line 492 at r1 (raw file):

Previously, knz (kena) wrote…

This test is exercising a totally unrelated feature, unfortunately. Please add a TODO and refer to this: #24866

Done.

pkg/sql/logictest/testdata/logic_test/generators, line 109 at r3 (raw file):

Previously, knz (kena) wrote…

maybe add a TODO here with a reference to #20511.

Done.

Comments from Reviewable

rytaft · 2018-04-18T19:58:14Z

@rytaft there is a sql language change here, so I wanted someone on the optimizer team to take note.

👍 thanks! I don't feel like I have enough background on this to give you a good quality review, but I've made a note of this on our spreadsheet of TODOs for the optimizer.

Review status: 6 of 12 files reviewed at latest revision, 4 unresolved discussions.

Comments from Reviewable

BramGruneir · 2018-04-23T18:22:18Z

@jordanlewis Soft ping to please take a look when you have a few mins.

jordanlewis · 2018-04-23T21:52:19Z

this is awesome! happy to see this go in

Reviewed 2 of 10 files at r1, 4 of 7 files at r2, 4 of 5 files at r4, 2 of 2 files at r5.
Review status: all files reviewed at latest revision, 4 unresolved discussions.

pkg/sql/sem/tree/expr.go, line 1446 at r5 (raw file):

}

// ColumnAccessExpr *** more here

^^^ needs attention

Comments from Reviewable

To be compatible with postgres, specifically to be able to handle cockroachdb#16971 correctly. There is a need to be able to access the results of a Set Returning Function (SRF) using a `.` accessor. This commit does 3 things. Firstly, it adds the appropriate grammer. Secondly, it rewrites the rewriteSRF function to handle named column accessors correctly. Before this change, the only way to decompose an SRF into columns was to set it it as the data source in the FROM clause, like so: ```sql SELECT * FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']); +---+---+ | x | n | +---+---+ | c | 1 | | b | 2 | | a | 3 | +---+---+ SELECT x FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']); +---+ | x | +---+ | c | | b | | a | +---+ SELECT n FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']); +---+ | n | +---+ | 1 | | 2 | | 3 | +---+ ``` With this commit, the following are now valid syntax: ```sql SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).*; +---+---+ | x | n | +---+---+ | c | 1 | | b | 2 | | a | 3 | +---+---+ SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).x; +---+ | x | +---+ | c | | b | | a | +---+ SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).n; +---+ | n | +---+ | 1 | | 2 | | 3 | +---+ ``` Thirdly, this changes the behaviour of the resulting columns from SRFs. If there is more than one named element in the resulting tuple, the column for the tuple will now be named for all elements: ```sql SELECT information_schema._pg_expandarray(ARRAY[3, 2, 1]) ``` Will produce a tuple column like named `("information_schema._pg_expandarray".x, "information_schema._pg_expandarray".n)`. The bahaviour does deviate from postgres, which simply names the column after the name of the function, in this case `_pg_expandarray`. Here's another example of this difference: ```sql SELECT 'a', pg_get_keywords(), 'c' LIMIT 1; In Postgres: ?column? | pg_get_keywords | ?column? ----------+----------------------+---------- a | (abort,U,unreserved) | c In Cockroach: +-----+--------------------------------+-----+ | 'a' | (pg_get_keywords.word, | 'c' | | | pg_get_keywords.catcode, | | | | pg_get_keywords.catdesc) | | +-----+--------------------------------+-----+ | a | ('abort','U','unreserved') | c | +-----+--------------------------------+-----+ ``` Release note (sql change): Set Returning Functions (SRF) can now be accessed using `(SRF).x` where `x` is the name of a column returned form the SRF or a `*`. For example, `SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).x` and `SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).*` are both now valid syntax. Also, the naming of the resulting columns from SRFs has been updated to provide more information about the resulting tuple. Co-authored-by: Raphael 'kena' Poss <[email protected]>

BramGruneir · 2018-04-24T20:30:32Z

Review status: 4 of 12 files reviewed at latest revision, 5 unresolved discussions.

pkg/sql/sem/tree/expr.go, line 1446 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

^^^ needs attention

Done.

Comments from Reviewable

BramGruneir · 2018-04-24T20:30:57Z

bors +r

craig · 2018-04-24T20:30:58Z

Did you mean "r+"?

BramGruneir · 2018-04-24T20:31:18Z

bors r+

24832: sql: add the ability to access a column from the results of an SRF r=BramGruneir a=BramGruneir sql: add the ability to access a column from the results of an SRF To be compatible with postgres, specifically to be able to handle #16971 correctly. There is a need to be able to access the results of a Set Returning Function (SRF) using a `.` accessor. This commit does 3 things. Firstly, it adds the appropriate grammer. Secondly, it rewrites the rewriteSRF function to handle named column accessors correctly. Before this change, the only way to decompose an SRF into columns was to set it it as the data source in the FROM clause, like so: ```sql SELECT * FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']); +---+---+ | x | n | +---+---+ | c | 1 | | b | 2 | | a | 3 | +---+---+ SELECT x FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']); +---+ | x | +---+ | c | | b | | a | +---+ SELECT n FROM information_schema._pg_expandarray(ARRAY['c', 'b', 'a']); +---+ | n | +---+ | 1 | | 2 | | 3 | +---+ ``` With this commit, the following are now valid syntax: ```sql SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).*; +---+---+ | x | n | +---+---+ | c | 1 | | b | 2 | | a | 3 | +---+---+ SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).x; +---+ | x | +---+ | c | | b | | a | +---+ SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).n; +---+ | n | +---+ | 1 | | 2 | | 3 | +---+ ``` Thirdly, this changes the behaviour of the resulting columns from SRFs. If there is more than one named element in the resulting tuple, the column for the tuple will now be named for all elements: ```sql SELECT information_schema._pg_expandarray(ARRAY[3, 2, 1]) ``` Will produce a tuple column like named `("information_schema._pg_expandarray".x, "information_schema._pg_expandarray".n)`. The bahaviour does deviate from postgres, which simply names the column after the name of the function, in this case `_pg_expandarray`. Here's another example of this difference: ```sql SELECT 'a', pg_get_keywords(), 'c' LIMIT 1; In Postgres: ?column? | pg_get_keywords | ?column? ----------+----------------------+---------- a | (abort,U,unreserved) | c In Cockroach: +-----+--------------------------------+-----+ | 'a' | (pg_get_keywords.word, | 'c' | | | pg_get_keywords.catcode, | | | | pg_get_keywords.catdesc) | | +-----+--------------------------------+-----+ | a | ('abort','U','unreserved') | c | +-----+--------------------------------+-----+ ``` Release note (sql change): Set Returning Functions (SRF) can now be accessed using `(SRF).x` where `x` is the name of a column returned form the SRF or a `*`. For example, `SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).x` and `SELECT (information_schema._pg_expandarray(ARRAY['c', 'b', 'a'])).*` are both now valid syntax. Also, the naming of the resulting columns from SRFs has been updated to provide more information about the resulting tuple. Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: Bram Gruneir <[email protected]>

craig · 2018-04-24T20:51:34Z

Build succeeded

GitHub CI (Cockroach)

BramGruneir requested review from a team April 16, 2018 17:50

BramGruneir added the do-not-merge bors won't merge a PR with this label. label Apr 16, 2018

knz mentioned this pull request Apr 17, 2018

sql: support lightweight composite types for in-flight values #24866

Closed

knz force-pushed the rewriteSRF branch 3 times, most recently from b33daac to 2412bcb Compare April 17, 2018 17:55

justinj mentioned this pull request Apr 17, 2018

sql: support Hibernate's use of _pg_expandarray #16971

Closed

11 tasks

knz force-pushed the rewriteSRF branch from 2412bcb to 148cb36 Compare April 17, 2018 18:38

BramGruneir force-pushed the rewriteSRF branch 2 times, most recently from d34b745 to 2b4493e Compare April 18, 2018 14:50

BramGruneir requested review from jordanlewis and rytaft April 18, 2018 14:59

BramGruneir removed the do-not-merge bors won't merge a PR with this label. label Apr 18, 2018

BramGruneir changed the title ~~[wip] sql: add the ability to access a column from the results of an SRF~~ sql: add the ability to access a column from the results of an SRF Apr 18, 2018

BramGruneir added this to the 2.1 milestone Apr 18, 2018

BramGruneir force-pushed the rewriteSRF branch from 2b4493e to 2d80202 Compare April 18, 2018 15:17

BramGruneir force-pushed the rewriteSRF branch from 2d80202 to fef4d74 Compare April 24, 2018 20:29

craig bot merged commit fef4d74 into cockroachdb:master Apr 24, 2018

BramGruneir deleted the rewriteSRF branch April 25, 2018 15:33

jseldess mentioned this pull request May 3, 2018

sql: add the ability to access a column from the results of an SRF cockroachdb/docs#3070

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: add the ability to access a column from the results of an SRF #24832

sql: add the ability to access a column from the results of an SRF #24832

BramGruneir commented Apr 16, 2018 •

edited

Loading

cockroach-teamcity commented Apr 16, 2018

BramGruneir commented Apr 16, 2018

knz commented Apr 17, 2018

knz commented Apr 17, 2018

BramGruneir commented Apr 17, 2018

knz commented Apr 17, 2018

BramGruneir commented Apr 18, 2018

BramGruneir commented Apr 18, 2018

BramGruneir commented Apr 18, 2018

rytaft commented Apr 18, 2018

BramGruneir commented Apr 23, 2018

jordanlewis commented Apr 23, 2018

BramGruneir commented Apr 24, 2018

BramGruneir commented Apr 24, 2018

craig bot commented Apr 24, 2018

BramGruneir commented Apr 24, 2018

craig bot commented Apr 24, 2018

sql: add the ability to access a column from the results of an SRF #24832

sql: add the ability to access a column from the results of an SRF #24832

Conversation

BramGruneir commented Apr 16, 2018 • edited Loading

cockroach-teamcity commented Apr 16, 2018

BramGruneir commented Apr 16, 2018

knz commented Apr 17, 2018

knz commented Apr 17, 2018

BramGruneir commented Apr 17, 2018

knz commented Apr 17, 2018

BramGruneir commented Apr 18, 2018

BramGruneir commented Apr 18, 2018

BramGruneir commented Apr 18, 2018

rytaft commented Apr 18, 2018

BramGruneir commented Apr 23, 2018

jordanlewis commented Apr 23, 2018

BramGruneir commented Apr 24, 2018

BramGruneir commented Apr 24, 2018

craig bot commented Apr 24, 2018

BramGruneir commented Apr 24, 2018

craig bot commented Apr 24, 2018

Build succeeded

BramGruneir commented Apr 16, 2018 •

edited

Loading