ESQL: Introduce per agg filter #113735
Changes from 22 commits
839dbbf
586dabe
a9f4358
2e37927
dd13dc3
4d9039a
8b75cea
db9c98e
99fba42
3d1fe24
d02c654
91d91e6
bda7edd
1686ef3
fc426c4
5a1257a
8ccad9e
23c7c54
217335d
9bccde4
c6cadca
c04be1d
0938d91
21696a6
b10b7d4
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
pr: 113735 | ||
summary: "ESQL: Introduce per agg filter" | ||
area: ES|QL | ||
type: feature | ||
issues: [] | ||
highlight: | ||
title: "ESQL: Introduce per agg filter" | ||
body: |- | ||
Add support for aggregation scoped filters that work dynamically on the | ||
data in each group. | ||
|
||
[source,esql] | ||
---- | ||
| STATS success = COUNT(*) WHERE 200 <= code AND code < 300, | ||
redirect = COUNT(*) WHERE 300 <= code AND code < 400, | ||
client_err = COUNT(*) WHERE 400 <= code AND code < 500, | ||
server_err = COUNT(*) WHERE 500 <= code AND code < 600, | ||
total_count = COUNT(*) | ||
---- | ||
|
||
Implementation wise, the base AggregateFunction has been extended to | ||
allow a filter to be passed on. This is required to incorporate the | ||
filter as part of the aggregate equality/identity which would fail with | ||
the filter as an external component. | ||
As part of the process, the serialization for the existing aggregations | ||
had to be fixed so that AggregateFunction implementations | ||
delegate to their parent first. | ||
notable: true |
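To make the semantics in the changelog concrete, here is a minimal Python sketch (not the ES|QL implementation) of counting with a filter attached to each aggregation. The helper name `stats_with_filters`, the sample rows, and the `host` grouping column are all assumptions made for illustration; the changelog example itself has no `BY` clause.

```python
from collections import defaultdict


def stats_with_filters(rows, aggs, by=None):
    """Model of STATS with per-aggregation filters; every agg is COUNT(*).

    aggs maps an output name to a row predicate, or None for an unfiltered count.
    """
    groups = defaultdict(lambda: {name: 0 for name in aggs})
    for row in rows:
        key = row[by] if by is not None else None
        counts = groups[key]
        # Each aggregation applies its own filter to the same input row.
        for name, predicate in aggs.items():
            if predicate is None or predicate(row):
                counts[name] += 1
    return dict(groups)


# Hypothetical sample data.
rows = [
    {"host": "a", "code": 200},
    {"host": "a", "code": 404},
    {"host": "a", "code": 503},
    {"host": "b", "code": 201},
    {"host": "b", "code": 302},
]
aggs = {
    "success": lambda r: 200 <= r["code"] < 300,
    "redirect": lambda r: 300 <= r["code"] < 400,
    "client_err": lambda r: 400 <= r["code"] < 500,
    "server_err": lambda r: 500 <= r["code"] < 600,
    "total_count": None,
}
result = stats_with_filters(rows, aggs, by="host")
```

In this model each group gets one output row, and the filters only decide which input rows each individual count sees; `total_count` is unaffected by the other aggregations' filters.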
Constant aggregations with WHERE are not tested and currently incorrect, at least for those that rely on Reproducer:
The query should return 1 as only 1 row satisfies
There might be more to it, since
That's because there's no true folding of the aggregation but instead we try to rewrite them into MV_functions + case.
As a side note, when the filters are foldable, they evaluate to either true (essentially discarded) or false meaning the agg won't run and can be folded to its initial value, 0 for count and
Alright, let's address the const case later.
Opened #115522 to call this out explicitly as it's actually producing incorrect results.
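The side note about foldable filters can be sketched in Python as follows. This is an illustrative model only: the `Count` class and `fold_constant_filter` function are hypothetical names, not Elasticsearch classes, and only the count case mentioned in the thread is modeled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Count:
    # True/False: a filter already folded to a constant.
    # str: an expression that cannot be folded. None: no filter.
    filter: Union[bool, str, None] = None


def fold_constant_filter(agg):
    """Simplify a COUNT whose filter is a constant boolean."""
    if agg.filter is True:
        # The filter always passes, so it can be discarded.
        return Count(filter=None)
    if agg.filter is False:
        # The aggregation never sees a row: fold to COUNT's initial value.
        return 0
    # Non-constant (or absent) filter: leave the aggregation untouched.
    return agg
```

Because the filter is a field of the frozen dataclass, two counts with different filters compare as different, which loosely mirrors the changelog's point that the filter is part of the aggregate's equality.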
alex-spies marked this conversation as resolved.
More things to test:
Good catch; however, I believe this is a bug. Grouping functions should only be allowed in the BY clause. Here it could work if it's the same as a grouping, but I tried it and we allow a bucket with a different field and argument.
The semantic equality for
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -123,17 +123,15 @@ fields | |
; | ||
|
||
field | ||
- : booleanExpression | ||
- | qualifiedName ASSIGN booleanExpression | ||
+ : (qualifiedName ASSIGN)? booleanExpression | ||
; | ||
|
||
fromCommand | ||
: FROM indexPattern (COMMA indexPattern)* metadata? | ||
; | ||
|
||
indexPattern | ||
- : clusterString COLON indexString | ||
- | indexString | ||
+ : (clusterString COLON)? indexString
Comment on lines +126 to +134: Small improvements to the grammar that don't change the parsing. |
||
; | ||
|
||
clusterString | ||
|
@@ -159,15 +157,23 @@ deprecated_metadata | |
; | ||
|
||
metricsCommand | ||
- : DEV_METRICS indexPattern (COMMA indexPattern)* aggregates=fields? (BY grouping=fields)? | ||
+ : DEV_METRICS indexPattern (COMMA indexPattern)* aggregates=aggFields? (BY grouping=fields)? | ||
; | ||
|
||
evalCommand | ||
: EVAL fields | ||
; | ||
|
||
statsCommand | ||
- : STATS stats=fields? (BY grouping=fields)? | ||
+ : STATS stats=aggFields? (BY grouping=fields)? | ||
; | ||
|
||
+ aggFields | ||
+ : aggField (COMMA aggField)* | ||
+ ; | ||
|
||
+ aggField | ||
+ : field {this.isDevVersion()}? (WHERE booleanExpression)? | ||
+ ; | ||
|
||
qualifiedName | ||
|
@@ -316,5 +322,5 @@ lookupCommand | |
; | ||
|
||
inlinestatsCommand | ||
- : DEV_INLINESTATS stats=fields (BY grouping=fields)? | ||
+ : DEV_INLINESTATS stats=aggFields (BY grouping=fields)? | ||
; |
Thought: I think it's going to be important to teach users with good examples that
is fundamentally different from
I expect users will try stuff like
which is invalid.
This isn't currently verified (which is what the comment in Verifier is about). But wondering if this type of aggregation (with no group) should be invalid at all.
I'm not sure I understand correctly; but
is invalid because the WHERE inside the aggregation refers to the result of the aggregation. To be correct, the second part of the predicate needs to be moved into a separate WHERE command:
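As a rough model of that distinction (in Python rather than ES|QL, with hypothetical rows and names), the sketch below shows why a per-aggregation filter can only reference input columns, while a condition on the aggregated value has to run as a separate step over the aggregation's output.

```python
def count_where(rows, predicate):
    """COUNT(*) restricted to the input rows that satisfy predicate."""
    return sum(1 for row in rows if predicate(row))


# Hypothetical input rows; they only have a "code" column.
rows = [{"code": 200}, {"code": 204}, {"code": 500}]

# Step 1 (STATS-like): the filter inside the aggregation sees input rows.
stats_row = {"success": count_where(rows, lambda r: 200 <= r["code"] < 300)}

# Step 2 (separate WHERE-like step): filter on the aggregated result.
kept = [r for r in [stats_row] if r["success"] > 1]

# Referring to the aggregate's own output inside its filter cannot work:
# the input rows have no "success" column yet.
try:
    count_where(rows, lambda r: r["success"] > 1)
    saw_error = False
except KeyError:
    saw_error = True
```

The per-row predicate runs before any aggregate value exists, so anything that depends on the aggregate has to come after the aggregation step.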