[DOCS] Moves anomaly detection concepts (#1750)
lcawl authored Jul 9, 2021
1 parent f448a28 commit e760058
Showing 15 changed files with 125 additions and 116 deletions.
16 changes: 0 additions & 16 deletions docs/en/stack/ml/anomaly-detection/index.asciidoc
@@ -1,21 +1,5 @@
include::ml-ad-overview.asciidoc[]

include::ml-concepts.asciidoc[leveloffset=+1]

include::ml-jobs.asciidoc[leveloffset=+2]

include::ml-datafeeds.asciidoc[leveloffset=+2]

include::ml-buckets.asciidoc[leveloffset=+2]

include::ml-influencers.asciidoc[leveloffset=+2]

include::ml-calendars.asciidoc[leveloffset=+2]

include::ml-rules.asciidoc[leveloffset=+2]

include::ml-model-snapshots.asciidoc[leveloffset=+2]

include::ml-ad-finding-anomalies.asciidoc[leveloffset=+1]

include::ml-ad-concepts.asciidoc[leveloffset=+1]
4 changes: 3 additions & 1 deletion docs/en/stack/ml/anomaly-detection/job-tips.asciidoc
@@ -7,6 +7,8 @@ results.
[[ml-ad-bucket-span]]
== Bucket span

include::ml-buckets.asciidoc[tag=buckets]

The bucket span is the time interval that {ml} analytics use to summarize and
model data for your job. When you create an {anomaly-job} in {kib}, you can
choose to estimate a bucket span value based on your data characteristics.
@@ -53,7 +55,7 @@ duplicates if they have the same `function`, `field_name`, `by_field_name`,
[[ml-ad-influencers]]
== Influencers

See <<ml-influencers>>.
include::ml-influencers.asciidoc[tag=influencers]


[discrete]
@@ -44,10 +44,10 @@ data by observing historical behavior and adapting to new data. The model
represents a baseline of normal behavior and can therefore be used to determine
how anomalous new events are.

{anomaly-detect-cap} results are written for each <<ml-buckets,bucket span>>.
{anomaly-detect-cap} results are written for each <<bucket-span,bucket span>>.
These results include scores that are aggregated in order to reduce noise and
normalized in order to rank the most mathematically significant anomalies. For
more information, see <<ml-bucket-results>> and <<ml-influencer-results>>.
more information, see <<ml-ad-bucket-results>> and <<ml-ad-influencer-results>>.

[discrete]
[[ml-ad-define-problem]]
@@ -65,7 +65,7 @@ the type of anomalous behavior you want to detect.

[discrete]
[[ml-ad-setup]]
== 2. Set up environment
== 2. Set up the environment

If you want to use {ml-features}, there must be at least one {ml} node in
your cluster and all master-eligible nodes must have {ml} enabled. By default,
@@ -87,6 +87,10 @@ a {dfeed} will be required.
[[ml-ad-create-job]]
== 3. Create a job

include::ml-jobs.asciidoc[]

include::ml-datafeeds.asciidoc[]

//TBD: Abbreviate this information and mention Fleet integration packages

include::create-jobs.asciidoc[]
@@ -125,9 +129,46 @@ seconds for the {ml} analysis to generate initial results.
There are two tools for examining the results from {anomaly-jobs} in {kib}: the
**Anomaly Explorer** and the **Single Metric Viewer**.

[discrete]
[[ml-ad-bucket-results]]
=== Bucket results

include::ml-buckets.asciidoc[tag=bucket-results]

[discrete]
[[ml-ad-influencer-results]]
=== Influencer results

include::ml-influencers.asciidoc[tag=influencer-results]

[discrete]
[[ml-ad-model-snapshots]]
=== Model snapshots

include::ml-model-snapshots.asciidoc[]

[discrete]
[[ml-ad-tune]]
== 6. Tune the job

While your {anomaly-job} is open, you might find that you need to alter its
configuration or settings.

[discrete]
[[ml-ad-calendars]]
=== Calendars and scheduled events

include::ml-calendars.asciidoc[]

[discrete]
[[ml-ad-rules]]
=== Custom rules

include::ml-rules.asciidoc[]

[discrete]
[[ml-ad-forecast]]
== 6. Forecast future behavior
== 7. Forecast future behavior

After the {ml-features} create baselines of normal behavior for your data,
you can use that information to extrapolate future behavior.
@@ -167,7 +208,7 @@ different expiration period by using the `expires_in` parameter in the

[discrete]
[[ml-ad-close-job]]
== 7. Close the job
== 8. Close the job

include::stopping-ml.asciidoc[leveloffset=+1]

@@ -180,3 +221,9 @@ For more advanced settings and scenarios, see <<anomaly-examples>>.

Refer to <<anomaly-detection-scale>> to learn more about the particularities of
large {anomaly-jobs}.

[discrete]
[[further-reading]]
== Further reading

https://www.elastic.co/blog/interpretability-in-ml-identifying-anomalies-influencers-root-causes[Interpretability in ML: Identifying anomalies, influencers, and root causes]
2 changes: 1 addition & 1 deletion docs/en/stack/ml/anomaly-detection/ml-ad-overview.asciidoc
@@ -12,7 +12,7 @@ pulled from {es} for analysis and anomaly results are displayed in {kib}
dashboards. Consult <<setup>> to learn more about the licence and the security
privileges that are required to use {anomaly-detect}.

* <<ml-concepts>>
* <<ml-ad-finding-anomalies>>
* <<ml-ad-concepts>>
* <<ml-configuration>>
* <<ml-api-quickref>>
14 changes: 5 additions & 9 deletions docs/en/stack/ml/anomaly-detection/ml-buckets.asciidoc
@@ -1,7 +1,4 @@
[role="xpack"]
[[ml-buckets]]
= Buckets

tag::buckets[]
The {ml-features} use the concept of a _bucket_ to divide the time series into
batches for processing.

@@ -17,11 +14,9 @@ The bucket span has two purposes: it dictates over what time span to look for an

The bucket span has a significant impact on the analysis. When you’re trying to determine what value to use, take into account the granularity at which you want to perform the analysis, the frequency of the input data, the duration of typical anomalies, and the frequency at which alerting is required.
////
end::buckets[]

[discrete]
[[ml-bucket-results]]
== Bucket results

tag::bucket-results[]
When you view your {ml} results, each bucket has an anomaly score. This score is
a statistically aggregated and normalized view of the combined anomalousness of
all the record results in the bucket.
@@ -52,4 +47,5 @@ Bucket results provide the top level, overall view of the {anomaly-job} and are
ideal for alerts. For example, the bucket results might indicate that at 16:05
the system was unusual. This information is a summary of all the anomalies,
pinpointing when they occurred. When you identify an anomalous bucket, you can
investigate further by examining the pertinent records.
investigate further by examining the pertinent records.
end::bucket-results[]
4 changes: 0 additions & 4 deletions docs/en/stack/ml/anomaly-detection/ml-calendars.asciidoc
@@ -1,7 +1,3 @@
[role="xpack"]
[[ml-calendars]]
= Calendars and scheduled events

Sometimes there are periods when you expect unusual activity to take place,
such as bank holidays, "Black Friday", or planned system outages. If you
identify these events in advance, no anomalies are generated during that period.
14 changes: 0 additions & 14 deletions docs/en/stack/ml/anomaly-detection/ml-concepts.asciidoc

This file was deleted.

27 changes: 0 additions & 27 deletions docs/en/stack/ml/anomaly-detection/ml-configuration.asciidoc

This file was deleted.

4 changes: 0 additions & 4 deletions docs/en/stack/ml/anomaly-detection/ml-datafeeds.asciidoc
@@ -1,7 +1,3 @@
[role="xpack"]
[[ml-datafeeds]]
= {dfeeds-cap}

{anomaly-jobs-cap} can analyze data that is stored in {es} or data that is
sent from some other source via an API. _{dfeeds-cap}_ retrieve data from {es}
for analysis, which is the simpler and more common scenario.
16 changes: 5 additions & 11 deletions docs/en/stack/ml/anomaly-detection/ml-influencers.asciidoc
@@ -1,7 +1,4 @@
[role="xpack"]
[[ml-influencers]]
= Influencers

tag::influencers[]
When anomalous events occur, we want to know why. To determine the cause,
however, you often need a broader knowledge of the domain. If you have
suspicions about which entities in your dataset are likely causing
@@ -38,10 +35,10 @@ TIP: As a best practice, do not pick too many influencers. For example, you
generally do not need more than three. If you pick many influencers, the results
can be overwhelming and there is a small overhead to the analysis.

[discrete]
[[ml-influencer-results]]
== Influencer results

end::influencers[]

tag::influencer-results[]
The influencer results show which entities were anomalous and when. One
influencer result is written per bucket for each influencer that affects the
anomalousness of the bucket. The {ml} analytics determine the impact of an
@@ -83,7 +80,4 @@ bucket-level anomaly scores. If you view swim lanes by influencer, it uses the
influencer-level anomaly scores, as does the list of top influencers. The list
of anomalies uses the record-level anomaly scores.

[[further-reading]]
== Further reading

https://www.elastic.co/blog/interpretability-in-ml-identifying-anomalies-influencers-root-causes[Interpretability in ML: Identifying anomalies, influencers, and root causes]
end::influencer-results[]
7 changes: 0 additions & 7 deletions docs/en/stack/ml/anomaly-detection/ml-jobs.asciidoc
@@ -1,10 +1,3 @@
[role="xpack"]
[[ml-jobs]]
= {anomaly-jobs-cap}
++++
<titleabbrev>Jobs</titleabbrev>
++++

{anomaly-jobs-cap} contain the configuration information and metadata
necessary to perform an analytics task.

10 changes: 3 additions & 7 deletions docs/en/stack/ml/anomaly-detection/ml-model-snapshots.asciidoc
@@ -1,10 +1,6 @@
[role="xpack"]
[[ml-model-snapshots]]
= Model snapshots

As described in <<ml-analyzing>>, {stack} {ml-features} can calculate baselines
of normal behavior then extrapolate anomalous events. These baselines are
accomplished by generating models of your data.
{stack} {ml-features} calculate baselines of normal behavior then extrapolate
anomalous events. These baselines are accomplished by generating models of your
data.

To ensure resilience in the event of a system failure, snapshots of the {ml}
model for each {anomaly-job} are saved to an internal index within the {es}
14 changes: 5 additions & 9 deletions docs/en/stack/ml/anomaly-detection/ml-rules.asciidoc
@@ -1,12 +1,8 @@
[role="xpack"]
[[ml-rules]]
= Custom rules

By default, as described in <<ml-analyzing>>, anomaly detection is unsupervised
and the {ml} models have no awareness of the domain of your data. As a result,
{anomaly-jobs} might identify events that are statistically significant but are
uninteresting when you know the larger context. Machine learning custom rules
enable you to customize anomaly detection.
By default, {anomaly-detect} is unsupervised and the {ml} models have no
awareness of the domain of your data. As a result, {anomaly-jobs} might
identify events that are statistically significant but are uninteresting when
you know the larger context. {ml-cap} custom rules enable you to customize
{anomaly-detect}.

_Custom rules_ – or _job rules_ as {kib} refers to them – instruct anomaly
detectors to change their behavior based on domain-specific knowledge that you
2 changes: 1 addition & 1 deletion docs/en/stack/ml/get-started/ml-gs-results.asciidoc
@@ -123,7 +123,7 @@ that influences or contributes to anomalies. There are influencers in both the
As a best practice, do not pick too many influencers. For example, you generally
do not need more than three. If you pick many influencers, the results can be
overwhelming and there is a small overhead to the analysis. For more details,
see <<ml-influencers>>.
see <<ml-ad-influencers>>.
****

50 changes: 50 additions & 0 deletions docs/en/stack/ml/redirects.asciidoc
@@ -63,3 +63,53 @@ This content has moved. See <<ml-ad-restart-failed-jobs>>.
=== Configure {anomaly-detect}

This content has moved. See <<ml-ad-setup>>.

[role="exclude",id="ml-jobs"]
=== {anomaly-jobs-cap}

This content has moved. See <<ml-ad-create-job>>.

[role="exclude",id="ml-datafeeds"]
=== {dfeeds-cap}

This content has moved. See <<ml-ad-create-job>>.

[role="exclude",id="ml-buckets"]
=== Buckets

This content has moved. See <<ml-ad-create-job>>.

[role="exclude",id="ml-bucket-results"]
=== Bucket results

This content has moved. See <<ml-ad-bucket-results>>.

[role="exclude",id="ml-influencers"]
=== Influencers

This content has moved. See <<ml-ad-influencers>>.

[role="exclude",id="ml-influencer-results"]
=== Influencer results

This content has moved. See <<ml-ad-influencer-results>>.

[role="exclude",id="ml-calendars"]
=== Calendars and scheduled events

This content has moved. See <<ml-ad-calendars>>.

[role="exclude",id="ml-rules"]
=== Custom rules

This content has moved. See <<ml-ad-rules>>.

[role="exclude",id="ml-model-snapshots"]
=== Model snapshots

This content has moved. See <<ml-ad-model-snapshots>>.

[role="exclude",id="ml-concepts"]
=== Concepts

This content has moved. See <<ml-ad-finding-anomalies>>.
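
Because this commit renames several anchors (for example `ml-bucket-results` becomes `ml-ad-bucket-results`) and adds redirect stubs for the old IDs, a quick check for cross-references that point at undefined anchors can be useful when reviewing a change like this. The following is only a minimal sketch, not part of the commit or of the docs build tooling: the function names and the `SAMPLE` text are hypothetical, and it assumes anchors are written either as `[[id]]` or as an `id="..."` attribute, and cross-references as `<<id>>` or `<<id,label>>`.

```python
import re

# Anchors: block anchors such as [[ml-ad-rules]] or attribute IDs such as id="ml-rules".
ANCHOR_RE = re.compile(r'\[\[([\w-]+)\]\]|id="([\w-]+)"')
# Cross-references: <<ml-ad-rules>> or <<ml-ad-rules,custom rules>>.
XREF_RE = re.compile(r'<<([\w-]+)(?:,[^>]*)?>>')


def defined_anchors(text):
    """Return the set of anchor IDs defined in the given AsciiDoc text."""
    return {block_id or attr_id for block_id, attr_id in ANCHOR_RE.findall(text)}


def referenced_anchors(text):
    """Return the set of anchor IDs that the text cross-references."""
    return set(XREF_RE.findall(text))


def dangling_xrefs(text):
    """Return, sorted, the referenced IDs that have no matching anchor in the text."""
    return sorted(referenced_anchors(text) - defined_anchors(text))


# Hypothetical sample assembled from fragments that appear in this diff.
SAMPLE = """
[[ml-ad-bucket-results]]
=== Bucket results

[role="exclude",id="ml-bucket-results"]
=== Bucket results

This content has moved. See <<ml-ad-bucket-results>>.
For more information, see <<ml-ad-influencer-results>>.
"""

if __name__ == "__main__":
    for anchor in dangling_xrefs(SAMPLE):
        print(f"no anchor found for <<{anchor}>>")
```

In practice the text passed to `dangling_xrefs` would need to be the concatenation of every file in the book, because an anchor referenced in one file is often defined in another; a reference reported as dangling in a single file is not necessarily broken.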