BucketSelector pipeline aggregation extension #374
Conversation
…gator on a MultiBucketAggregation
I left some comments, but the main logic looks good to me. You can merge this and address the comments in follow-up PRs if you prefer, so you're not blocked on sending those out.
val mapSize: Int = sin.readVInt()
bucketsPathsMap = java.util.HashMap(mapSize)
for (i in 0 until mapSize) {
    bucketsPathsMap[sin.readString()] = sin.readString()
}
This can alternatively be replaced with bucketsPathsMap = sin.readMap() as Map<String, String>
out.writeVInt(bucketsPathsMap.size)
for ((key, value) in bucketsPathsMap) {
    out.writeString(key)
    out.writeString(value)
}
Similarly, this can be replaced with out.writeMap(bucketsPathsMap as Map<String, String>)
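For readers unfamiliar with the size-prefixed wire format these two suggestions refer to, here is a rough stand-alone analogy using plain java.io streams rather than Elasticsearch's StreamInput/StreamOutput (the helper names are invented for illustration; readMap/writeMap wrap essentially this loop):

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.DataInputStream
import java.io.DataOutputStream

// Hypothetical stand-ins for StreamOutput.writeMap / StreamInput.readMap:
// a size prefix followed by alternating key/value strings.
fun writeStringMap(out: DataOutputStream, map: Map<String, String>) {
    out.writeInt(map.size)          // size prefix, analogous to writeVInt
    for ((key, value) in map) {
        out.writeUTF(key)
        out.writeUTF(value)
    }
}

fun readStringMap(inp: DataInputStream): Map<String, String> {
    val size = inp.readInt()
    val map = HashMap<String, String>(size)
    repeat(size) { map[inp.readUTF()] = inp.readUTF() }
    return map
}

fun main() {
    val original = mapOf("count" to "_count", "avg" to "avg_price")
    val buf = ByteArrayOutputStream()
    writeStringMap(DataOutputStream(buf), original)
    val restored = readStringMap(
        DataInputStream(ByteArrayInputStream(buf.toByteArray()))
    )
    println(restored == original)   // true: the round trip preserves the map
}
```

The helper-based form (readMap/writeMap) and the explicit loop produce the same wire bytes; the helpers are simply less code to maintain.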
@Throws(IOException::class)
public override fun internalXContent(builder: XContentBuilder, params: Params): XContentBuilder {
    builder.field(PipelineAggregator.Parser.BUCKETS_PATH.preferredName, bucketsPathsMap as Map<String, Any>?)
NP: Builder calls can be chained to reduce text.
Ex.
builder.field()
    .field()
    .field()
private val PARENT_BUCKET_PATH = ParseField("parent_bucket_path")

@Throws(IOException::class)
fun parse(reducerName: String, parser: XContentParser): BucketSelectorExtAggregationBuilder {
To clean this up a bit and be more consistent with our other parse functions, I think this can be simplified to assume that parse is being called on the start_object of bucket_select_ext. This way, we can fetch the field name, and then the next token should be the contents of the field. Then, within a single when, we can cover the different formats of the field being parsed.
Ex.
fun parse(reducerName: String, xcp: XContentParser): BucketSelectorExtAggregationBuilder {
    var bucketsPathsMap: MutableMap<String, String>? = null
    var gapPolicy: GapPolicy? = null
    var script: Script? = null
    var parentBucketPath: String? = null
    var filter: BucketSelectorExtFilter? = null
    ensureExpectedToken(Token.START_OBJECT, xcp.currentToken(), xcp)
    while (xcp.nextToken() != Token.END_OBJECT) {
        val fieldName = xcp.currentName()
        xcp.nextToken()
        // Note: fieldName is a String, so compare against each ParseField's preferredName
        when (fieldName) {
            PipelineAggregator.Parser.BUCKETS_PATH.preferredName -> {
                if (xcp.currentToken() == Token.START_OBJECT) {
                    ...
                } else if (xcp.currentToken() == Token.START_ARRAY) {
                    while (xcp.nextToken() != Token.END_ARRAY) {
                        ...
                    }
                } else {
                    ...
                }
            }
            PipelineAggregator.Parser.GAP_POLICY.preferredName -> { ... }
            Script.SCRIPT_PARSE_FIELD.preferredName -> { ... }
            PARENT_BUCKET_PATH.preferredName -> { ... }
            else -> { ... }
        }
    }
    ...
}
constructor(sin: StreamInput) : super(sin.readString(), null, null) {
    script = Script(sin)
    gapPolicy = GapPolicy.readFrom(sin)
    bucketsPathsMap = sin.readGenericValue() as Map<String, String>
Can make this sin.readMap() to be a little more explicit.
import java.util.function.Consumer
import java.util.function.Function

class BucketSelectorExtAggregatorTestsIT : AggregatorTestCase() {
NP: We can probably just call these BucketSelectorExtAggregatorTests (same for BucketSelectExtAggregationBuilderTestsIT) since they're more unit tests. The IT tests in this package typically extend ODFE/ESRestTestCase() and make the actual API calls on the test cluster.
Merged 3bc057d into opendistro-for-elasticsearch:doc-level-alerting-dev
Description of changes:
BucketSelectorExt is an extension of the BucketSelector pipeline aggregation. BucketSelector has some limitations, and with BucketSelectorExt we are trying to address those limitations.
filter for each source of the composite aggregation. For this, we have introduced a new key, value object where key is the name of the source and value is the key filter for the corresponding source in the composite aggregation. Refer to the examples below.

Parameters:
parent_bucket_path - navigates to the right parent multi-bucket aggregation on which the selector has to be applied. It supports nested aggregations but should comply with the constraint below:
agg1>agg2>agg3 - where agg1 and agg2 are single-bucket aggs, whereas agg3, i.e. the last aggregation in the hierarchy, should be a multi-bucket aggregation on which the bucket selector would be applicable.
buckets_path - same as the existing BucketSelector buckets_path
script - same as the existing BucketSelector script
filter - key filter condition. First, keys are filtered, and then the bucket selector scripts are executed on the filtered keys. It contains an include/exclude filter which works along the lines of the term aggregation filtering supported by Elasticsearch.
composite_agg_filter - key filter condition for composite aggregations. Refer to the example below for usage.
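The parent_bucket_path constraint above can be illustrated with a small sketch. This is not the PR's implementation; it only shows how a '>'-separated path such as agg1>agg2>agg3 could be split so that all but the last element are treated as single-bucket parents (the helper name is invented for illustration):

```kotlin
// Hypothetical helper, not from the PR: split a parent_bucket_path into
// the single-bucket parent aggs and the final multi-bucket target agg.
fun resolvePath(parentBucketPath: String): Pair<List<String>, String> {
    val parts = parentBucketPath.split(">")
    require(parts.isNotEmpty() && parts.all { it.isNotBlank() }) {
        "invalid parent_bucket_path: $parentBucketPath"
    }
    // Everything before the last element must be a single-bucket aggregation;
    // the last element is the multi-bucket aggregation the selector runs on.
    return parts.dropLast(1) to parts.last()
}

fun main() {
    val (singleBucketParents, multiBucketTarget) = resolvePath("agg1>agg2>agg3")
    println(singleBucketParents) // [agg1, agg2]
    println(multiBucketTarget)   // agg3
}
```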
Some usage examples -
For regex, refer to the Lucene regular expression syntax.
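As an illustration of the parameters described above (the original examples from the PR description are not reproduced here; the index fields, aggregation names, and exact filter shape below are assumptions), a request could look roughly like:

```json
{
  "aggregations": {
    "composite_agg": {
      "composite": {
        "sources": [
          { "user": { "terms": { "field": "user" } } }
        ]
      },
      "aggregations": {
        "avg_url_size": { "avg": { "field": "url_size" } }
      }
    },
    "selector": {
      "bucket_selector_ext": {
        "buckets_path": { "avg_url_size": "avg_url_size" },
        "script": { "source": "params.avg_url_size > 25" },
        "parent_bucket_path": "composite_agg",
        "filter": {
          "include": {
            "composite_agg_filter": { "user": ["user_1", "user_2"] }
          }
        }
      }
    }
  }
}
```

Here the include filter keeps only the composite buckets whose user source matches one of the listed keys, and the script then selects, among those, the buckets whose average URL size exceeds the threshold.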
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.