
[ML] adding support for composite aggs in anomaly detection #69970

Merged: 27 commits into elastic:master, Mar 30, 2021

Conversation

@benwtrent (Member) commented Mar 4, 2021

This commit allows for composite aggregations in datafeeds.

Composite aggs provide a much better solution for having influencers, partitions, etc. on high-volume data. Instead of worrying about long scrolls in the datafeed, the calculation is distributed across the cluster via the aggregations.

The restrictions for this support are as follows (a builder-level sketch that satisfies them follows the list):

  • The composite aggregation must have EXACTLY one date_histogram source.
  • The sub-aggs of the composite aggregation must include a max aggregation on the SAME time field as the aforementioned date_histogram source.
  • The composite agg must be the ONLY top-level agg, and it cannot have a composite or date_histogram sub-agg.
  • If using a date_histogram to bucket time, it cannot have a composite sub-agg.
  • The top-level composite agg cannot have a sibling pipeline agg. Pipeline aggregations are supported as a sub-agg (thus a pipeline agg INSIDE the bucket).
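
For illustration, here is a minimal Java sketch (not code from this PR) of a composite aggregation that satisfies these restrictions, built with the standard Elasticsearch aggregation builders; the names buckets, time, and timestamp simply mirror the JSON example further down.

import java.util.List;

import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.composite.CompositeAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.DateHistogramValuesSourceBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.TermsValuesSourceBuilder;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;

class CompositeDatafeedAggSketch {
    static CompositeAggregationBuilder buildAgg() {
        return new CompositeAggregationBuilder(
            "buckets",
            List.of(
                // Exactly one date_histogram source, left in its default (ascending) order.
                new DateHistogramValuesSourceBuilder("time")
                    .field("timestamp")
                    .fixedInterval(DateHistogramInterval.minutes(15)),
                // Terms sources for the partition/influencer fields.
                new TermsValuesSourceBuilder("response.keyword").field("response.keyword"),
                new TermsValuesSourceBuilder("clientip").field("clientip")))
            // size is what controls how much work each search does.
            .size(10_000)
            // The required max sub-aggregation on the same time field as the date_histogram source.
            .subAggregation(AggregationBuilders.max("timestamp").field("timestamp"));
    }
}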

Some key user interaction differences:

  • Speed and the resources used by the cluster should be controlled by the size parameter in the composite aggregation. Previously, we said that if you are using aggs, you should use a specific chunking_config. With composite, that is not necessary.
  • Users really shouldn't use nested terms aggs any longer. While this is still a "valid" configuration and MAY be desirable for some users (only wanting the top 10 of certain terms), typically when users want influencers, partition fields, etc., they want the ENTIRE population. Previously, this really wasn't possible with aggs; with composite it is.
  • I cannot really think of a typical use case that SHOULD ever need a multi-bucket aggregation that is NOT supported by composite.

Example

Here is a job that was traditionally restricted to using a scroll datafeed but can now be handled with a composite agg:
Job:

{
  "custom_settings": {
    "created_by": "ml-module-sample",
    "custom_urls": [
      {
        "url_name": "Raw data",
        "url_value": """discover#/?_g=(time:(from:'$earliest$',mode:absolute,to:'$latest$'))&_a=(index:'90943e30-9a47-11e8-b64d-95841ca0b247',query:(language:kuery,query:'response.keyword:"$response.keyword$"'),sort:!('@timestamp',desc))"""
      },
      {
        "url_name": "Data dashboard",
        "url_value": "dashboards#/view/edf84fe0-e1a0-11e7-b6d5-4dc382ef7f5b?_g=(filters:!(),time:(from:'$earliest$',mode:absolute,to:'$latest$'))&_a=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'90943e30-9a47-11e8-b64d-95841ca0b247',key:response.keyword,negate:!f,params:(query:'$response.keyword$'),type:phrase,value:'$response.keyword$'),query:(match:(response.keyword:(query:'$response.keyword$',type:phrase))))),query:(language:kuery,query:''))"
      }
    ]
  },
  "analysis_config": {
    "bucket_span": "1h",
    "detectors": [
      {
        "detector_description": "Event rate by response code",
        "function": "count",
        "partition_field_name": "response.keyword",
        "detector_index": 0
      }
    ],
    "influencers": [
      "clientip",
      "response.keyword"
    ],
    "summary_count_field_name": "doc_count"
  },
  "analysis_limits": {
    "model_memory_limit": "11mb"
  },
  "model_plot_config": {
    "enabled": false,
    "annotations_enabled": true
  },
  "data_description": {
    "time_field": "timestamp",
    "time_format": "epoch_ms"
  },
 "results_index_name": "custom-simple-response_code_rates"
}

Datafeed:

{
    "job_id": "agged-response_code_rates",
    "query": {
      "bool": {
        "filter": [
          {
            "term": {
              "event.dataset": "sample_web_logs"
            }
          }
        ]
      }
    },
    "indices": [
      "kibana_sample_data_logs"
    ],
    "delayed_data_check_config": {
      "enabled": true
    },
    "aggregations": {
      "buckets": {
        "composite": {
          "size": 10000,
          "sources": [
            {
              "time": {
                "date_histogram": {
                  "field": "timestamp",
                  "fixed_interval": "15m"
                }
              }
            },
            {
              "response.keyword": {
                "terms": {
                  "field": "response.keyword"
                }
              }
            },
            {
              "clientip": {
                "terms": {
                  "field": "clientip"
                }
              }
            }
          ]
        },
        "aggs": {
          "timestamp": {
            "max": {
              "field": "timestamp"
            }
          }
        }
      }
    }
  }

All the links, results, etc. are the same as if this was done via a regular scroll.

Even the UI works, since the internal aggregations are simple enough (just a count). That is not always the case: aggregations are flexible enough that the UI can have trouble recreating the underlying visuals. In this particular case it works fine because there is no internal terms aggregation (only the ones in the composite definition).

[Two screenshots of the resulting job in the ML UI]

The UI should probably validate the composite agg definition to make sure it has only date_histogram and terms sources. Support for filtering by the other composite sources shouldn't be too difficult to add over time. (Kibana issue)

@elasticmachine added the Team:ML (Meta label for the ML team) label Mar 4, 2021

@elasticmachine (Collaborator) commented:

Pinging @elastic/ml-core (Team:ML)

@benwtrent (Member, Author) commented on the docs diff:

    TIP: If you use a terms aggregation and the cardinality of a term is high, the
    aggregation might not be effective and you might want to just use the default
    search and scroll behavior.
    TIP: If you use a terms aggregation and the cardinality of a term is high, you

@szabosteve Mind reviewing all these doc updates? It's pretty encompassing.

@szabosteve (Contributor) left a comment:

Thanks for amending the docs! I left a couple of minor comments.

Review comment (Contributor) on the docs diff:

    TIP: If you use a terms aggregation and the cardinality of a term is high, the
    aggregation might not be effective and you might want to just use the default
    search and scroll behavior.
    TIP: If you use a terms aggregation and the cardinality of a term is high, you

Suggested change:
- TIP: If you use a terms aggregation and the cardinality of a term is high, you
+ TIP: If you use a terms aggregation and the cardinality of a term is high,

Review comment (Contributor) on the docs diff:

    aggregation might not be effective and you might want to just use the default
    search and scroll behavior.
    TIP: If you use a terms aggregation and the cardinality of a term is high, you
    should use composite aggregations.

Suggested change:
- should use composite aggregations.
+ use composite aggregations instead.

Review comment (Contributor) on the docs diff:

    of each bucket is the time of the last record in the bucket.

    For `composite` aggregation support, there must be exactly one `date_histogram` value
    source. Additionally, that value source must NOT be sorted in descending order. Additional

Suggested change:
- source. Additionally, that value source must NOT be sorted in descending order. Additional
+ source. That value source must not be sorted in descending order. Additional

Review comment (Contributor) on the docs diff:

    `responsetime`.

    Your {dfeed} can contain multiple aggregations, but only the ones with names
    TIP: If you are utilizing a `term` aggregation to gather influencer or partition
    detector field information, consider using a `composite` aggregation. It will perform

Suggested change:
- detector field information, consider using a `composite` aggregation. It will perform
+ detector field information, consider using a `composite` aggregation. It performs

Review comment (Contributor) on the docs diff:

    Your {dfeed} can contain multiple aggregations, but only the ones with names
    TIP: If you are utilizing a `term` aggregation to gather influencer or partition
    detector field information, consider using a `composite` aggregation. It will perform
    better than a `date_histogram` with a nested `term` aggregation and will also include

Suggested change:
- better than a `date_histogram` with a nested `term` aggregation and will also include
+ better than a `date_histogram` with a nested `term` aggregation and also includes

Review comment (Contributor) on the docs diff:

    aggregation `term` source with the name `airline` works. Note its name
    is the same as the field.
    <4> The required `max` aggregation whose name is the time field in the
    job analysis config

Suggested change:
- job analysis config
+ job analysis config.


Review comment (Contributor) on the docs diff:

    When using a `date_histogram` aggregation to bucket by time

Suggested change:
- When using a `date_histogram` aggregation to bucket by time
+ When using a `date_histogram` aggregation to bucket by time:

Review comment (Contributor) on the docs diff:

    @@ -282,26 +379,64 @@ When you define an aggregation in a {dfeed}, it must have the following form:
    ----------------------------------
    // NOTCONSOLE

    When using a `composite` aggregation

Suggested change:
- When using a `composite` aggregation
+ When using a `composite` aggregation:

Review comment (Contributor) on the docs diff:

    aggregation. For more information, see
    {ref}/search-aggregations-bucket-datehistogram-aggregation.html[Date histogram aggregation].
    sub-aggregation that is a `date_histogram`, the top level aggregation is the
    required `date_histogram`, or the top leve aggregation is the required `composite`.

Suggested change:
- required `date_histogram`, or the top leve aggregation is the required `composite`.
+ required `date_histogram`, or the top level aggregation is the required `composite`.


Review comment (Contributor) on the docs diff:

    You can optionally specify a terms aggregation, which creates buckets for
    different values of a field.

    TIP: Instead of nesting a `term` aggregation, try using `composite` aggs.

Suggested change:
- TIP: Instead of nesting a `term` aggregation, try using `composite` aggs.
+ TIP: Instead of nesting a `term` aggregation, use `composite` aggs.

@benwtrent benwtrent requested a review from szabosteve March 4, 2021 15:59
@szabosteve (Contributor) left a comment:

Docs LGTM! 👍

@dimitris-athanasiou (Contributor) commented:

The datafeed config does not seem to match the job config in the description.

@benwtrent (Member, Author) replied:

@dimitris-athanasiou updated :). I copied the wrong one from my kibana console!

@dimitris-athanasiou (Contributor) left a comment:

Very very cool! Just a few minor things.

@benwtrent (Member, Author) commented:

OK, doing some additional testing: we are running into an issue with requiring composite aggs to use the max-timestamp method for keeping buckets sorted. When the parent agg was ONLY a date_histogram, this was not really an issue. But composite aggs order buckets like this:

[date_histogram, terms1, terms2....]

It is possible that bucket [date_histogram_bucket_1, terms_a, ...] has an EARLIER max timestamp than [date_histogram_bucket_1, terms_b, ....]. This means the timestamped data is technically sent out of order to the native process.

This can result in many THOUSANDS of out of order records and cause processing to slow to a crawl as the model has to constantly reorder buckets as they come in.

This needs to be thought about further before merging this PR.

How do we handle bucket ordering in composite aggs while not risking the model seeing the data twice?
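
To make the ordering problem concrete, here is a hypothetical Java sketch (illustration only, not necessarily the approach this PR lands on): buffer composite buckets by their date_histogram key and flush a key only once the response has paged past it, so the native process sees records in time order without ever seeing a bucket twice.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class CompositeBucketBuffer {
    // Documents grouped by the date_histogram key of the composite bucket they came from.
    private final TreeMap<Long, List<Map<String, Object>>> docsByBucketTimestamp = new TreeMap<>();

    void add(long dateHistogramKey, Map<String, Object> doc) {
        docsByBucketTimestamp.computeIfAbsent(dateHistogramKey, k -> new ArrayList<>()).add(doc);
    }

    // Return (and forget) every buffered bucket strictly earlier than the key currently being paged,
    // since later composite pages can still add documents to the current key.
    List<Map<String, Object>> flushBefore(long currentDateHistogramKey) {
        List<Map<String, Object>> ready = new ArrayList<>();
        SortedMap<Long, List<Map<String, Object>>> earlier = docsByBucketTimestamp.headMap(currentDateHistogramKey);
        earlier.values().forEach(ready::addAll);
        earlier.clear();
        return ready;
    }
}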

Review comment (Contributor) on:

    Iterator<Map.Entry<Long, List<Map<String, Object>>>> iterator = docsByBucketTimestamp.entrySet().iterator();
    while (iterator.hasNext()) {
        Map.Entry<Long, List<Map<String, Object>>> entry = iterator.next();
        if (shouldCancel.test(entry.getKey())) {

OK, I thought about this and the discrepancy with the other version of writeDocs. I believe we can just keep the new version and use it where we used the old version as well.

Here is a bit of the story here.

The first implementation of allowing datafeeds with aggs to cancel added the idea of looking at how many key-value pairs we wrote. At that point, the implementation was not checking whether we had written the entire bucket, so it was possible to stop the datafeed halfway through a bucket. This was fixed later, when the current implementation was added.

The current implementation collects data in buckets. We then write whole buckets and only check to see if we should cancel after a whole bucket was written. We still look for whether we reached the key-value pair batch size of 1000. But I think this is completely unnecessary now.

The point of checking key-value pairs was to create a good moment to check whether we should cancel without having to check after each record. But checking after each histogram bucket is good enough. I hope this makes sense.
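
A rough sketch of the per-bucket cancellation check being described (hypothetical shape, not the PR's actual writeDocs; writeRecord stands in for streaming one record to the native process):

import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

class BucketWriter {
    // Write whole timestamp buckets and only consult the cancellation predicate on bucket
    // boundaries, so a bucket is never half-written and unwritten buckets stay buffered.
    void writeDocs(Map<Long, List<Map<String, Object>>> docsByBucketTimestamp,
                   Predicate<Long> shouldCancel,
                   Consumer<Map<String, Object>> writeRecord) {
        Iterator<Map.Entry<Long, List<Map<String, Object>>>> iterator =
            docsByBucketTimestamp.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<Long, List<Map<String, Object>>> entry = iterator.next();
            if (shouldCancel.test(entry.getKey())) {
                break;
            }
            entry.getValue().forEach(writeRecord);
            iterator.remove();
        }
    }
}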

Contributor (follow-up):

Having said that, the other benefit of splitting is to contain the size of the output stream. But I guess this might be unnecessary as it is bound by the size of the search response, right? If the search response already fits in memory, then the output stream we create shouldn't be a problem.

@benwtrent (Member, Author) commented:

run elasticsearch-ci/2

@dimitris-athanasiou (Contributor) left a comment:

Looks good! Just a couple of minor comments and we're set to go.

Review comment (Contributor) on:

            return (DateHistogramValuesSourceBuilder) valuesSourceBuilder;
        }
    }
    return null;

It seems that every caller of this assumes it doesn't return null. I think this is because at this point we have checked there is a histogram aggregation specified. I wonder if we should just throw an IllegalStateException here instead of returning null and remove the assertions in callers.
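
A sketch of that suggestion (the method name and surrounding class are assumptions, not the PR's exact code): fail fast instead of returning null, since callers only get here after validation has confirmed a date_histogram source exists.

import java.util.List;

import org.elasticsearch.search.aggregations.bucket.composite.CompositeValuesSourceBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.DateHistogramValuesSourceBuilder;

class CompositeAggHelper {
    static DateHistogramValuesSourceBuilder getDateHistogramValuesSource(List<CompositeValuesSourceBuilder<?>> sources) {
        for (CompositeValuesSourceBuilder<?> valuesSourceBuilder : sources) {
            if (valuesSourceBuilder instanceof DateHistogramValuesSourceBuilder) {
                return (DateHistogramValuesSourceBuilder) valuesSourceBuilder;
            }
        }
        // Earlier validation guarantees a date_histogram source, so reaching this point is a bug.
        throw new IllegalStateException("composite aggregation does not have a date_histogram value source");
    }
}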

Review comment (Contributor) on:

    Collections.singletonList(indexName))
        .setParsedAggregations(aggs)
        .setFrequency(TimeValue.timeValueHours(1))
        .setChunkingConfig(ChunkingConfig.newManual(TimeValue.timeValueHours(1)))

Why are we setting manual chunking of 1 hour and later update it to auto? It might be something worth capturing in a comment in the test.

Review comment (Contributor) on:

    .execute(ActionListener.wrap(
        _unused -> listener.onResponse(finished),
        ex -> {
            logger.info(

Should this be an error?

@benwtrent (Member, Author) replied:

How about a warn? An error seems like some sort of system failure, and failing to refresh the job is not a horrific thing in this scenario.

Contributor replied:

warn is even more appropriate indeed
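
For reference, a tiny sketch of what the agreed log level looks like (the class, message, and jobId parameter are illustrative assumptions, not the PR's code):

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.message.ParameterizedMessage;

class JobRefreshLogging {
    private static final Logger logger = LogManager.getLogger(JobRefreshLogging.class);

    // Failing to refresh the job here is not a system failure, so log at warn rather than error.
    void onRefreshFailure(String jobId, Exception ex) {
        logger.warn(new ParameterizedMessage("[{}] failed to refresh after datafeed change", jobId), ex);
    }
}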

@dimitris-athanasiou (Contributor) left a comment:

LGTM

@benwtrent benwtrent merged commit c8415a7 into elastic:master Mar 30, 2021
@benwtrent benwtrent deleted the feature/ml-dafafeed-composite-aggs branch March 30, 2021 12:25
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Mar 30, 2021
…69970)

benwtrent added a commit that referenced this pull request Mar 30, 2021
…9970) (#71052)

* [ML] adding support for composite aggs in anomaly detection (#69970)

Labels: >enhancement, :ml Machine learning, Team:ML (Meta label for the ML team), v7.13.0, v8.0.0-alpha1

7 participants