Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error message for LISTAGG window function #16982

Closed
lukaseder opened this issue Apr 12, 2023 · 11 comments · Fixed by #24366
Closed

Improve error message for LISTAGG window function #16982

lukaseder opened this issue Apr 12, 2023 · 11 comments · Fixed by #24366

Comments

@lukaseder
Copy link

LISTAGG currently can't be used as a window function, it seems:

select listagg(c) within group (order by c) over ()
from (values ('a'), ('b')) as t (c)

This produces:

SQL Error [1]: Query failed (#20230412_115219_00199_whng5): line 1:50: mismatched input '('. Expecting: ',', 'EXCEPT', 'FETCH', 'FROM', 'GROUP', 'HAVING', 'INTERSECT', 'LIMIT', 'OFFSET', 'ORDER', 'UNION', 'WHERE', 'WINDOW',

I think it would be useful to either:

  • Make an explicit error message documenting that this isn't currently supported
  • Implement the feature (better)
@losipiuk
Copy link
Member

losipiuk commented Oct 8, 2024

@martint, @kasiafi what does SQL standard say? Should the listagg be supported in window context? It seems that some of special listagg syntax (ORDER BY) overlaps with what window function provide on their own.

@lukaseder
Copy link
Author

Many RDBMS don't support ordering both the aggregate function and the window function. Even without this support, it's still useful to be able to use listagg in a window context (e.g. per partition). If both types of ordering are supported, then they don't interfere. The window ORDER BY clause specifies the ordering of the window frame (relevant when using ROWS / RANGE, including the implicit default RANGE UNBOUNDED PRECEDING clause), whereas the aggregate ORDER BY clause specifies in which order the window contents are aggregated into a string.

@martint
Copy link
Member

martint commented Oct 8, 2024

It's not supported in standard SQL, but it makes sense to add support for listagg in a window context at some point. As @lukaseder explained above, the ORDER BY clauses are independent from each other, so there's no overlap.

@lukaseder
Copy link
Author

It's not supported in standard SQL

Why not? I don't see any such limitation in ISO/IEC 9075-2:2023(E) 6.10 <window function>

@martint
Copy link
Member

martint commented Oct 8, 2024

My bad, you're right -- I assumed it was a special case, but it's handled as any of the other aggregation functions.

@losipiuk
Copy link
Member

@martint based on quick playing with the code (I may be missing something) it looks like adding LISTAGG to be supported in the window context should be easy modulo the ordering part. Mixing ordering of aggregate function arguments and window functions is not supported in Trino right now.
Would that make sense to add support of LISTAGG in window function context, but not allow specifying WITHIN GROUP ... ORDER BY) ?

@lukaseder
Copy link
Author

Would that make sense to add support of LISTAGG in window function context, but not allow specifying WITHIN GROUP ... ORDER BY) ?

The window ordering is far less useful in this context than the aggregate ordering. I'd say, without aggregate ordering, the LISTAGG function becomes almost useless (?)

What I've seen in most other RDBMS that have difficulties supporting both orderings is that they don't support the window ordering but do support the aggregate ordering of this function.

@losipiuk
Copy link
Member

building block: #23929

@martint
Copy link
Member

martint commented Nov 7, 2024

Taking a closer look at the SQL spec, I found this in section 9.23:

8) If <window function type> is <aggregate function>, then:
   ...
  d) If the window ordering clause or the window framing clause of the window structure descriptor 
     that describes the <window name or specification> is present, then:
    i) AF shall not be an <ordered set function> or a <listagg set function>.

@lukaseder
Copy link
Author

Interesting finding. I had overlooked this line (though I did look at 8) d) ii)...). It does confirm that the aggregate order by clause is more useful than the window order by clause in this context, though I don't see why this limitation needs to be in the standard. The logical semantics of having both orderings seems clear, in my opinion (framing first, aggregation later)

@martint
Copy link
Member

martint commented Nov 7, 2024

Indeed. That’s why we’re going to keep this feature as an extension to what the standard allows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

3 participants