sql: make GROUP BY recognize projection aliases #28059
Comments
cc @RaduBerinde
Started to investigate. The problem is worse than described above: the resolution may currently also be incorrect for ORDER BY.
See the code in pg's The following is also valid and needs to be checked:
@knz Can I get a quick blurb describing this known limitation w/r/t the impact to user experience? Ideally, we need it by Friday 10/26 for the 2.1 Known Limitations page. Posting it on this issue and/or pinging me would be great.
""" Applications developed for PostgreSQL that use GROUP BY to refer to column aliases produced in the same SELECT clause must be changed to use the full underlying expression instead. For example,
We are running tight on 19.1; I am pushing this out, given that the fix is not trivial and the "surface area" is not very high.
I think aliases and GROUP BY do not play well in most circumstances: This came up when testing with Django: cockroachdb/django-cockroachdb#82
We should prioritize fixing this and backport to 19.2.x. I can take a look this week.
Trying to understand the postgres semantics. Given the above examples, I would have expected that the aliases take precedence over the input columns, but that doesn't seem to be the case:
They do seem to take precedence over columns from higher scopes:
Here |
The comment in the pg source file I pointed to at the top explains the special cases I think.
Thanks @knz, that definitely helped. I will try to harvest some test cases from postgres as well.
I would like to correct something in the original description. The aliases are used only if they don't conflict with the FROM column names, which have precedence. So in the
42447: opt: support grouping by aliases r=RaduBerinde a=RaduBerinde
There is some baggage left over from SQL92 which allowed grouping by select targets by their alias. We implement the same rules used by postgres, as explained in the `buildGroupingColumns` comment.
Fixes #28059.
Release note (sql change): It is now supported to specify selection target aliases as GROUP BY columns. Note that the FROM columns take precedence over the aliases, which are only used if there is no column with that name in the current scope.
Co-authored-by: Radu Berinde <[email protected]>
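The precedence described in the release note, together with the earlier observation that aliases seem to win over columns from higher scopes, can be sketched as a small lookup function. This is only a toy model for simple (unqualified) identifiers: the function name `resolve_group_by_name`, its parameters, and the returned tuples are all hypothetical and are not taken from `buildGroupingColumns` or from the postgres source.

```python
def resolve_group_by_name(name, from_columns, select_aliases, outer_columns=()):
    """Resolve a simple (unqualified) GROUP BY identifier.

    Hypothetical model of the precedence discussed in this thread:
      1. a column of the current FROM scope wins;
      2. otherwise a SELECT target alias is used;
      3. otherwise a column from an enclosing (outer) scope is tried.
    """
    if name in from_columns:
        return ("from_column", name)
    if name in select_aliases:
        return ("alias", select_aliases[name])
    if name in outer_columns:
        return ("outer_column", name)
    raise LookupError(f"column {name!r} does not exist")


# Alias with no conflicting FROM column: grouping uses the aliased expression.
print(resolve_group_by_name("z", {"x"}, {"z": "x + 1"}))
# Alias that shadows a FROM column: the FROM column takes precedence.
print(resolve_group_by_name("x", {"x"}, {"x": "x % 10"}))
# Alias versus a column from a higher scope: the alias takes precedence.
print(resolve_group_by_name("y", {"x"}, {"y": "x + 1"}, outer_columns={"y"}))
```

The ordering of the three checks is the whole point of the sketch; swapping the first two would reproduce the behavior that the original description assumed and that was later corrected in this thread.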
@RaduBerinde, since this issue is closed, does that mean we can remove this limitation from the 20.1 known limitations? https://www.cockroachlabs.com/docs/dev/known-limitations.html#group-by-referring-to-select-aliases
@jseldess yes, thanks.
Reported by @clanstyles on Gitter.
This is a subtle extension to the SQL standard recognized by PostgreSQL which currently causes CockroachDB to silently return different results from pg in common cases. So the severity is high.
To understand the problem, let's start with the "easy" and innocuous form of the bug:
This is recognized by PostgreSQL (groups by the projected expression `x+1`), but not by CockroachDB currently (reports an error: `z must be grouped by`).
The difference between pg and crdb here is that GROUP BY will recognize simple/naked projection aliases before resolving the names using the regular algorithm (like ORDER BY already does). This is a behavior specific to "simple identifiers" (like for ORDER BY).
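As a rough illustration of the "easy" form, the sketch below runs a query of that shape through Python's built-in `sqlite3` module. SQLite is only a stand-in so that the snippet is runnable; it is neither PostgreSQL nor CockroachDB. The table `t`, its rows, and the `count(*)` column are made up for the example.

```python
import sqlite3

# SQLite is used purely as a runnable stand-in; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INT)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (1,), (2,)])

# GROUP BY names the projection alias z. No FROM column is called z,
# so the alias can only mean the projected expression x+1.
by_alias = conn.execute(
    "SELECT x+1 AS z, count(*) FROM t GROUP BY z ORDER BY z"
).fetchall()

# The same query with the underlying expression spelled out in GROUP BY.
by_expr = conn.execute(
    "SELECT x+1 AS z, count(*) FROM t GROUP BY x+1 ORDER BY x+1"
).fetchall()

print(by_alias)
print(by_expr)
```

Both queries produce the same groups here. The second spelling, with the full expression in GROUP BY, does not depend on alias support at all.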
Why this matters is that CockroachDB fails to report an error in pretty common cases, and instead returns, silently, very different results from PostgreSQL:
This currently groups by `t.x` in CockroachDB, whereas it should group by `(x%10)` like in PostgreSQL!
Both the heuristic planner and the opt code must be extended to support this!
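To show how far apart the two readings can be, the sketch below computes each of them explicitly, again using Python's `sqlite3` as a stand-in engine. The table `t` and its rows are hypothetical. The ambiguous `GROUP BY x` is deliberately not executed, since which reading an engine picks is exactly what is at issue; each query instead names its grouping key unambiguously.

```python
import sqlite3

# SQLite is only a runnable stand-in; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INT)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (11,), (21,)])

# Reading 1: the grouping key is the FROM column t.x (three distinct values).
by_column = conn.execute(
    "SELECT x%10 AS x, count(*) FROM t GROUP BY t.x ORDER BY t.x"
).fetchall()

# Reading 2: the grouping key is the projected expression (x%10).
by_expression = conn.execute(
    "SELECT x%10 AS x, count(*) FROM t GROUP BY (t.x%10)"
).fetchall()

print(len(by_column), by_column)
print(len(by_expression), by_expression)
```

With this data the first reading yields three groups of one row each and the second yields a single group of three rows, and neither query raises an error. That is why a silent mismatch between engines is easy to miss.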