Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support lightweight composite types for in-flight values #24866

Closed
knz opened this issue Apr 17, 2018 · 3 comments · Fixed by #26621
Closed

sql: support lightweight composite types for in-flight values #24866

knz opened this issue Apr 17, 2018 · 3 comments · Fixed by #26621
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Milestone

Comments

@knz
Copy link
Contributor

knz commented Apr 17, 2018

Forked off #24832.
Needed for #16971.

The PostgreSQL type system supports "composite types" which are really just tuples of values where the arity is constant (same for every row) and each column in the tuple has a static label (same for every row).

Note that CockroachDB already supports tuples (and tuple types) in scalar expressions but so far their columns/elements were anonymous.

Composite types show up mainly as follows.

Reification of tables as composite values

If an identifier in a scalar expression refers to one of the relational expressions in the current context (e.g. a table name in the FROM clause), then the result of evaluating that identifier in the scalar context is a composite expression, with the relexp's columns as elements.

For example:

> SELECT kv FROM kv;
   kv
---------
 (37,2)
 (94,3)
 (30,3)

> SELECT (kv, kv) AS woo FROM kv;
          woo
-----------------------
 ("(37,2)","(37,2)")
 ("(94,3)","(94,3)")
 ("(30,3)","(30,3)")

Note: in these examples, the result tables have just 1 column. This is a radically different path/concept than e.g. SELECT kv.* FROM kv (star expansion) which produces as many columns as the FROM table.

Implementation wise, the identifier to the relexpr must be replaced by a suitable tuple constructor, e.g. ROW(...) with some extended syntax to define labels in the type.

Access to tuple columns in scalar contexts

If a scalar expression E has composite type (..., "a", ...) (tuple with at least some column label "a"), then the expression (E).a is valid and refers to the element labeled "a" in the tuple E.

For example:

> SELECT (kv).v FROM kv;
 v
---
 2
 3
 3

Note: This is a mechanism that is distinct from the common column name resolution on data sources, e.g. kv.v. The relational expression is first reified as a tuple in the scalar context, and then one of the tuple columns is accessed.

Implementation wise, the expression of the form (E).label must be handled by a dedicated scalar expression node type, for example ColumnAccessExpr introduced in #24832.

Propagation across relational expression boundaries

Composite types are regular types, and propagate naturally across sub-queries.

For example:

>  SELECT (a).v FROM (SELECT kv AS a FROM kv) AS woo;
 v
---
 2
 3
 3

Implementation wise, CockroachDB must recognize composite types during type checking. We can perhaps use the existing types.Table for this purpose, some more investigation needed to check this.

"Star expansion" in composite expressions

If an expression of the form (E).* is present at the top-level of a render expression (a.k.a. "Select target expression", "projection expression" or SelectExpr in the current source code), then a special rule for star expansion kicks in:

  • the expression E is checked to have composite type (according to the rules above)
  • the render expr is replaced by a sequence of expressions of the form (@n).lbl1, (@n).lbl2, (@n).lbl3, ... where:
    • @n refers to the single IndexedVar of an underlying projection that will compute the composite expression just once
    • lbl1, lbl2, ... refer to the labels of the composite.

Note: this seems to be a separate mechanism than the regular star expansion.

Fancy example

This is the original example that motivates this issue:

> SELECT (i.keys).n FROM (SELECT information_schema._pg_expandarray(ARRAY[3,2,1]) AS keys) AS i;

This query really means the following:

> WITH i AS SELECT (tmp_tbl) AS keys -- generates a composite expr, rename it to `keys`
              FROM information_schema._pg_expandarray(ARRAY[3,2,1])) AS tmp_tbl
    SELECT (tmp_tuple).n  -- uses a tuple element access sub-expression
      FROM (SELECT keys AS tmp_tuple FROM i)

In particular in the scalar expression (i.keys).n:

  • i identifies a relational expression in a FROM clause
  • i.keys is a regular column reference in one of the FROM relational expressions, and designates the (single) column in table i
  • the column i.keys has composite type, with arity 2 and labels n and x
  • the expression (i.keys).n accesses the column named n in the tuple column i.keys.

So in order to support this query CockroachDB must be able to:

  1. lift an SRF in render position into the FROM clause and replace it by a composite constructor
  2. properly type composite expressions and propagate their types
  3. support the tuple element access expression for composite types

We can start the work by introducing the proper typing and evaluation rules and use types.Table for composite expressions. No need to support storing of composite values yet.

@knz knz added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Apr 17, 2018
@knz knz added this to the 2.1 milestone Apr 17, 2018
@knz knz added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Apr 17, 2018
@knz
Copy link
Contributor Author

knz commented Apr 17, 2018

cc @justinj @rytaft

@knz
Copy link
Contributor Author

knz commented Apr 17, 2018

@justinj @rytaft in the optimizer this will need a simplification transform:

SELECT (a).cx FROM (SELECT <tbl> AS a FROM ...) should be rewritten as

SELECT a_cx FROM (SELECT <tbl>.c1 AS a_c1, <tbl>.c2 AS a_c2, ... FROM ...)

So as to avoid run-time tuple construction and deconstruction.

@BramGruneir
Copy link
Member

This is an excellent summary Thanks so much for writing this down.

craig bot pushed a commit that referenced this issue Jun 13, 2018
26621:  sql,opt: propagate composite types (labeled tuples) r=knz a=knz

Needed to resolve #24866.

This patch adds more complete support for composite types (labeled
tuples) by ensuring the following:

- tuple labels are preserved during constant folding of tuple
  expressions.
- tuple labels are preserved during expression transformations
  in optimizations.
- subqueries in scalar contexts receive a composite type with labels.

Note that there is currently a bug in DTuple serialization which break
composite literals in distributed execution, so the composite type
support is not fully ready yet.
This is tracked as separate bug #26624.

Release note: None

26683: opt: don't push limit through project when ordering on synthesized column r=RaduBerinde a=RaduBerinde

It is invalid to push limit/offset through a projection if the
ordering depends on a synthesized column. Fix the rules to check for
this.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
@craig craig bot closed this as completed in #26621 Jun 13, 2018
craig bot pushed a commit that referenced this issue Jun 14, 2018
26628: sql: Support tuple column access and tuple stars r=knz a=knz

First commits from #26621.
Completes the fix to #24866 by re-activating disabled tests.
This work is yet another step towards #16971. It would actually fix #16971 if it were not for #26627, #26624 and #26629.

This work is yet another step towards #16971.

The labeled tuples introduced in #25283 can now be accessed using
their labels or using a star, e.g. `(E).*`.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords()
AS x)` or a star (e.g. `SELECT (x).* FROM (SELECT pg_expand_keywords()
AS x)`).

Fixes #26720.

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Bram Gruneir <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants