Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support aggregate functions with secondary arguments #10495

Closed
knz opened this issue Nov 7, 2016 · 9 comments
Closed

sql: support aggregate functions with secondary arguments #10495

knz opened this issue Nov 7, 2016 · 9 comments
Assignees
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience. help wanted Help is requested / needed by the one who filed the issue to fix it.
Milestone

Comments

@knz
Copy link
Contributor

knz commented Nov 7, 2016

We need aggregates with secondary arguments,
needed to implement the more general function STRING_AGG (
https://www.postgresql.org/docs/9.2/static/functions-aggregate.html ) - #26737

Implementation-wise we could assume as a startin point that aggregation occurs always on the first argument; and use the values after the first as initializer arguments for the aggregate function object.

@knz knz added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) help wanted Help is requested / needed by the one who filed the issue to fix it. E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience. labels Nov 7, 2016
@yznming
Copy link
Contributor

yznming commented Dec 5, 2016

@knz I can try to fix it.

@knz
Copy link
Contributor Author

knz commented Dec 5, 2016

ok!

@yznming
Copy link
Contributor

yznming commented Dec 15, 2016

@knz do you have any advice on how to get the delimiter if the second argument is a expression?

postgres=# select string_agg(name, (case when id>2 then '|' else ']' end)) from t
;
 string_agg
------------
 ss]sss|fww
(1 row)

@knz
Copy link
Contributor Author

knz commented Dec 15, 2016

If you integrate the code like for other aggregate functions, the 2nd argument is already pre-evaluated.

So if you look at the other functions in sql/parser/builtins.go you see they use args[0], args[1], etc. for example overlay. You can convert the argument to *DString directly.

@heyihong
Copy link
Contributor

Hi @knz , is this issue still open and unassigned? I'd like to try to fix it if possible.

@knz
Copy link
Contributor Author

knz commented Dec 30, 2017

Yes the issue is still open an unassigned.

@cristhiank
Copy link

+1 for this issue. The Java JDBC Driver relies on the string_agg function to get the sql keywords as you can see in this link.

@knz
Copy link
Contributor Author

knz commented Apr 13, 2018

cc @awoods187 for prioritization.

@knz knz added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Apr 13, 2018
@knz knz modified the milestones: Later, 2.1 Apr 13, 2018
@knz
Copy link
Contributor Author

knz commented Apr 23, 2018

@jordanlewis I'm placing this issue both in SQL lang and SQL exec. Once the built-in is defined and the type checking / planning issues are resolved, the distSQL processor will need to be modified to handle this.

knz added a commit to knz/cockroach that referenced this issue Apr 29, 2018
CockroachDB does not yet support running aggregation functions with
more than one argument (cockroachdb#10495).

Prior to this patch, an error was properly generated for the situation
where an aggregate function is applied to multiple arguments in
"simple" contexts; however the check was absent when used in a window
frame, and attempting to use multiple arguments in that context would
cause a panic.

Intuitively, this code path should not have been reachable because
issue cockroachdb#10495 would otherwise suggest that no aggregate function
accepting more than one argument should even be defined.

Unfortunately, a previous change meant to experiment with distributed
execution of multi-argument aggregates introduced the definitions of
two such aggregates: `final_variance()` and `final_stddev()`.
This in turns enabled the code path leading to a crash.

To fix this, two avenues were possible: either remove the _definition_
multi-argument aggregates until cockroachdb#10495 is comprehensively addressed,
or fix the predicate that rejects such aggregates during _planning_.

This patch opts for the latter approach.

Release note (bug fix): CockroachDB now properly reports an error when
using the unsupported functions `final_variance()` and
`final_stddev()` instead of causing a crash.
knz added a commit to knz/cockroach that referenced this issue May 1, 2018
CockroachDB does not yet support running aggregation functions with
more than one argument (cockroachdb#10495).

Prior to this patch, an error was properly generated for the situation
where an aggregate function is applied to multiple arguments in
"simple" contexts; however the check was absent when used in a window
frame, and attempting to use multiple arguments in that context would
cause a panic.

Intuitively, this code path should not have been reachable because
issue cockroachdb#10495 would otherwise suggest that no aggregate function
accepting more than one argument should even be defined.

Unfortunately, a previous change meant to experiment with distributed
execution of multi-argument aggregates introduced the definitions of
two such aggregates: `final_variance()` and `final_stddev()`.
This in turns enabled the code path leading to a crash.

To remove access to this code path required removing access to these
particular built-in functions from the SQL dialect. This patch
achieves this by introducing a boolean flag `Private` on built-in
functions, set to true for these two, which prevent both use by SQL
queries and auto-generation of docs.

Release note (bug fix): CockroachDB now properly reports an error when
using the internal-only functions `final_variance()` and
`final_stddev()` instead of causing a crash.
knz added a commit to knz/cockroach that referenced this issue May 1, 2018
CockroachDB does not yet support running aggregation functions with
more than one argument (cockroachdb#10495).

Prior to this patch, an error was properly generated for the situation
where an aggregate function is applied to multiple arguments in
"simple" contexts; however the check was absent when used in a window
frame, and attempting to use multiple arguments in that context would
cause a panic.

Intuitively, this code path should not have been reachable because
issue cockroachdb#10495 would otherwise suggest that no aggregate function
accepting more than one argument should even be defined.

Unfortunately, a previous change meant to experiment with distributed
execution of multi-argument aggregates introduced the definitions of
two such aggregates: `final_variance()` and `final_stddev()`.
This in turns enabled the code path leading to a crash.

To remove access to this code path required removing access to these
particular built-in functions from the SQL dialect. This patch
achieves this by introducing a boolean flag `Private` on built-in
functions, set to true for these two, which prevent both use by SQL
queries and auto-generation of docs.

Release note (bug fix): CockroachDB now properly reports an error when
using the internal-only functions `final_variance()` and
`final_stddev()` instead of causing a crash.
knz added a commit to knz/cockroach that referenced this issue May 1, 2018
CockroachDB does not yet support running aggregation functions with
more than one argument (cockroachdb#10495).

Prior to this patch, an error was properly generated for the situation
where an aggregate function is applied to multiple arguments in
"simple" contexts; however the check was absent when used in a window
frame, and attempting to use multiple arguments in that context would
cause a panic.

Intuitively, this code path should not have been reachable because
issue cockroachdb#10495 would otherwise suggest that no aggregate function
accepting more than one argument should even be defined.

Unfortunately, a previous change meant to experiment with distributed
execution of multi-argument aggregates introduced the definitions of
two such aggregates: `final_variance()` and `final_stddev()`.
This in turns enabled the code path leading to a crash.

To remove access to this code path required removing access to these
particular built-in functions from the SQL dialect. This patch
achieves this by introducing a boolean flag `Private` on built-in
functions, set to true for these two, which prevent both use by SQL
queries and auto-generation of docs.

Release note (bug fix): CockroachDB now properly reports an error when
using the internal-only functions `final_variance()` and
`final_stddev()` instead of causing a crash.
craig bot pushed a commit that referenced this issue May 1, 2018
25100: sql: introduce SessionData.Copy r=andreimatei a=andreimatei

Make the SessionData struct copiable. This will be used to copy state
from a session to internal statements that need to be run in the same
context (particularly, with the same user and database).

This patch removes a mutex protecting sessionData.AppName. It used to be
that the field was accessed concurrently by connEx.Serialize() - a
method used by the "session registry". This patch creates an atomic that
parallels sessionData.AppName for that purpose.

Release note: None

25158: sql: fail gracefully on unsupported multi-arg aggregations r=knz a=knz

Fixes #23798.

CockroachDB does not yet support running aggregation functions with
more than one argument (#10495).

Prior to this patch, an error was properly generated for the situation
where an aggregate function is applied to multiple arguments in
"simple" contexts; however the check was absent when used in a window
frame, and attempting to use multiple arguments in that context would
cause a panic.

Intuitively, this code path should not have been reachable because
issue #10495 would otherwise suggest that no aggregate function
accepting more than one argument should even be defined.

Unfortunately, a previous change meant to experiment with distributed
execution of multi-argument aggregates introduced the definitions of
two such aggregates: `final_variance()` and `final_stddev()`.
This in turns enabled the code path leading to a crash.

To remove access to this code path required removing access to these
particular built-in functions from the SQL dialect. This patch
achieves this by introducing a boolean flag `Private` on built-in
functions, set to true for these two, which prevent both use by SQL
queries and auto-generation of docs.

Release note (bug fix): CockroachDB now properly reports an error when
using the internal-only functions `final_variance()` and
`final_stddev()` instead of causing a crash.

Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Aug 8, 2018
This function is similar to concat_agg but it takes a delimiter as a secondary
argument. Previously, we were not able to handle aggregations with more than
one argument before. To allow for this, without getting into the messy world of
multi-column aggregators, all arguments after the first one in an aggregator
must be constant expressions.

This in turn required updating the aggregator functions to now also take
argument datums as an new argument.

For distsql, the arguments are stored as expressions that have already been
checked to ensure that they are constants.

It looks like concat_agg (and now string_agg) are not run in distsql yet, so
this will be added next.

This work is primarily motivated by the need for greater ORM compatibility.

Closes cockroachdb#10495, cockroachdb#26737

Release note (sql change): Added the new aggregation function string_agg that
concats a collection of strings into a single string and seperates them with the
passed in delimiter.
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Aug 9, 2018
This function is similar to concat_agg but it takes a delimiter as a secondary
argument. Previously, we were not able to handle aggregations with more than
one argument before. To allow for this, without getting into the messy world of
multi-column aggregators, all arguments after the first one in an aggregator
must be constant expressions.

This in turn required updating the aggregator functions to now also take
argument datums as an new argument.

For distsql, the arguments are stored as expressions that have already been
checked to ensure that they are constants.

It looks like concat_agg (and now string_agg) are not run in distsql yet, so
this will be added next.

This work is primarily motivated by the need for greater ORM compatibility.

Closes cockroachdb#10495, cockroachdb#26737

Release note (sql change): Added the new aggregation function string_agg that
concats a collection of strings into a single string and seperates them with the
passed in delimiter.
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Aug 13, 2018
This function is similar to concat_agg but it takes a delimiter as a secondary
argument. Previously, we were not able to handle aggregations with more than
one argument before. To allow for this, without getting into the messy world of
multi-column aggregators, all arguments after the first one in an aggregator
must be constant expressions.

This in turn required updating the aggregator functions to now also take
argument datums as an new argument.

For distsql, the arguments are stored as expressions that have already been
checked to ensure that they are constants.

However, when used as a window function, the limitation of not having more than
one non-const expression is not required.

It looks like concat_agg (and now string_agg) are not run in distsql yet, so
this will be added next.

This work is primarily motivated by the need for greater ORM compatibility.

Closes cockroachdb#10495, cockroachdb#26737

Release note (sql change): Added the new aggregation function string_agg that
concats a collection of strings into a single string and seperates them with the
passed in delimiter.
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Aug 14, 2018
This function is similar to concat_agg but it takes a delimiter as a secondary
argument. Previously, we were not able to handle aggregations with more than
one argument before. To allow for this, without getting into the messy world of
multi-column aggregators, all arguments after the first one in an aggregator
must be constant expressions.

This in turn required updating the aggregator functions to now also take
argument datums as an new argument.

For distsql, the arguments are stored as expressions that have already been
checked to ensure that they are constants.

However, when used as a window function, the limitation of not having more than
one non-const expression is not required.

It looks like concat_agg (and now string_agg) are not run in distsql yet, so
this will be added next.

This work is primarily motivated by the need for greater ORM compatibility.

Closes cockroachdb#10495, cockroachdb#26737

Release note (sql change): Added the new aggregation function string_agg that
concats a collection of strings into a single string and seperates them with the
passed in delimiter.
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Aug 14, 2018
This function is similar to concat_agg but it takes a delimiter as a secondary
argument. Previously, we were not able to handle aggregations with more than
one argument before. To allow for this, without getting into the messy world of
multi-column aggregators, all arguments after the first one in an aggregator
must be constant expressions.

This in turn required updating the aggregator functions to now also take
argument datums as an new argument.

For distsql, the arguments are stored as expressions that have already been
checked to ensure that they are constants.

However, when used as a window function, the limitation of not having more than
one non-const expression is not required.

It looks like concat_agg (and now string_agg) are not run in distsql yet, so
this will be added next.

This work is primarily motivated by the need for greater ORM compatibility.

Closes cockroachdb#10495, cockroachdb#26737

Release note (sql change): Added the new aggregation function string_agg that
concats a collection of strings into a single string and seperates them with the
passed in delimiter.
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Aug 14, 2018
This function is similar to concat_agg but it takes a delimiter as a secondary
argument. Previously, we were not able to handle aggregations with more than
one argument before. To allow for this, without getting into the messy world of
multi-column aggregators, all arguments after the first one in an aggregator
must be constant expressions.

This in turn required updating the aggregator functions to now also take
argument datums as an new argument.

For distsql, the arguments are stored as expressions that have already been
checked to ensure that they are constants.

However, when used as a window function, the limitation of not having more than
one non-const expression is not required.

It looks like concat_agg (and now string_agg) are not run in distsql yet, so
this will be added next.

This work is primarily motivated by the need for greater ORM compatibility.

Closes cockroachdb#10495, cockroachdb#26737

Release note (sql change): Added the new aggregation function string_agg that
concats a collection of strings into a single string and seperates them with the
passed in delimiter.
craig bot pushed a commit that referenced this issue Aug 14, 2018
28392: sql: add the string_agg aggregation function r=BramGruneir a=BramGruneir

This function is similar to concat_agg but it takes a delimiter as a secondary
argument. Previously, we were not able to handle aggregations with more than
one argument before. To allow for this, without getting into the messy world of
multi-column aggregators, all arguments after the first one in an aggregator
must be constant expressions.

This in turn required updating the aggregator functions to now also take
argument datums as an new argument.

For distsql, the arguments are stored as expressions that have already been
checked to ensure that they are constants.

It looks like concat_agg (and now string_agg) are not run in distsql yet, so
this will be added next.

This work is primarily motivated by the need for greater ORM compatibility.

Closes #10495, #26737

Release note (sql change): Added the new aggregation function string_agg that
concats a collection of strings into a single string and seperates them with the
passed in delimiter.

28470: hlc: Prevent false positives in forward clock jump monitoring r=tschottdorf a=bdarnell

A recent change to the ordering of various startup events has led to a
gap between calls to hlc.Clock.Now, which looks like a forward clock
jump to first iteration of the monitoring goroutine.

Fixes #28367

Release note: None

Co-authored-by: Bram Gruneir <[email protected]>
Co-authored-by: Ben Darnell <[email protected]>
@craig craig bot closed this as completed in #28392 Aug 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-intermediate Intermediate complexity, needs a contributor with 3-6 months of past contribution experience. help wanted Help is requested / needed by the one who filed the issue to fix it.
Projects
None yet
Development

No branches or pull requests

6 participants