[DOCS] Moves anomaly detection concepts (#1750)

elastic · Jul 9, 2021 · commit e760058 · 1 parent f448a28

Showing 15 changed files with 125 additions and 116 deletions.
diff --git a/docs/en/stack/ml/anomaly-detection/index.asciidoc b/docs/en/stack/ml/anomaly-detection/index.asciidoc
@@ -1,21 +1,5 @@
 include::ml-ad-overview.asciidoc[]
 
-include::ml-concepts.asciidoc[leveloffset=+1]
-
-include::ml-jobs.asciidoc[leveloffset=+2]
-
-include::ml-datafeeds.asciidoc[leveloffset=+2]
-
-include::ml-buckets.asciidoc[leveloffset=+2]
-
-include::ml-influencers.asciidoc[leveloffset=+2]
-
-include::ml-calendars.asciidoc[leveloffset=+2]
-
-include::ml-rules.asciidoc[leveloffset=+2]
-
-include::ml-model-snapshots.asciidoc[leveloffset=+2]
-
 include::ml-ad-finding-anomalies.asciidoc[leveloffset=+1]
 
 include::ml-ad-concepts.asciidoc[leveloffset=+1]

diff --git a/docs/en/stack/ml/anomaly-detection/job-tips.asciidoc b/docs/en/stack/ml/anomaly-detection/job-tips.asciidoc
@@ -7,6 +7,8 @@ results.
 [[ml-ad-bucket-span]]
 == Bucket span
 
+include::ml-buckets.asciidoc[tag=buckets]
+
 The bucket span is the time interval that {ml} analytics use to summarize and
 model data for your job. When you create an {anomaly-job} in {kib}, you can
 choose to estimate a bucket span value based on your data characteristics. 
@@ -53,7 +55,7 @@ duplicates if they have the same `function`, `field_name`, `by_field_name`,
 [[ml-ad-influencers]]
 == Influencers
 
-See <<ml-influencers>>.
+include::ml-influencers.asciidoc[tag=influencers]
 
 
 [discrete]

diff --git a/docs/en/stack/ml/anomaly-detection/ml-ad-finding-anomalies.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-ad-finding-anomalies.asciidoc
@@ -44,10 +44,10 @@ data by observing historical behavior and adapting to new data. The model
 represents a baseline of normal behavior and can therefore be used to determine
 how anomalous new events are.
 
-{anomaly-detect-cap} results are written for each <<ml-buckets,bucket span>>.
+{anomaly-detect-cap} results are written for each <<bucket-span,bucket span>>.
 These results include scores that are aggregated in order to reduce noise and
 normalized in order to rank the most mathematically significant anomalies. For
-more information, see <<ml-bucket-results>> and <<ml-influencer-results>>.
+more information, see <<ml-ad-bucket-results>> and <<ml-ad-influencer-results>>.
 
 [discrete]
 [[ml-ad-define-problem]]
@@ -65,7 +65,7 @@ the type of anomalous behavior you want to detect.
 
 [discrete]
 [[ml-ad-setup]]
-== 2. Set up environment
+== 2. Set up the environment
 
 If you want to use {ml-features}, there must be at least one {ml} node in
 your cluster and all master-eligible nodes must have {ml} enabled. By default,
@@ -87,6 +87,10 @@ a {dfeed} will be required.
 [[ml-ad-create-job]]
 == 3. Create a job
 
+include::ml-jobs.asciidoc[]
+
+include::ml-datafeeds.asciidoc[]
+
 //TBD: Abbreviate this information and mention Fleet integration packages
 
 include::create-jobs.asciidoc[]
@@ -125,9 +129,46 @@ seconds for the {ml} analysis to generate initial results.
 There are two tools for examining the results from {anomaly-jobs} in {kib}: the
 **Anomaly Explorer** and the **Single Metric Viewer**.
 
+[discrete]
+[[ml-ad-bucket-results]]
+=== Bucket results
+
+include::ml-buckets.asciidoc[tag=bucket-results]
+
+[discrete]
+[[ml-ad-influencer-results]]
+=== Influencer results
+
+include::ml-influencers.asciidoc[tag=influencer-results]
+
+[discrete]
+[[ml-ad-model-snapshots]]
+=== Model snapshots
+
+include::ml-model-snapshots.asciidoc[]
+
+[discrete]
+[[ml-ad-tune]]
+== 6. Tune the job
+
+While your {anomaly-job} is open, you might find that you need to alter its
+configuration or settings.
+
+[discrete]
+[[ml-ad-calendars]]
+=== Calendars and scheduled events
+
+include::ml-calendars.asciidoc[]
+
+[discrete]
+[[ml-ad-rules]]
+=== Custom rules
+
+include::ml-rules.asciidoc[]
+
 [discrete]
 [[ml-ad-forecast]]
-== 6. Forecast future behavior
+== 7. Forecast future behavior
 
 After the {ml-features} create baselines of normal behavior for your data,
 you can use that information to extrapolate future behavior.
@@ -167,7 +208,7 @@ different expiration period by using the `expires_in` parameter in the
 
 [discrete]
 [[ml-ad-close-job]]
-== 7. Close the job
+== 8. Close the job
 
 include::stopping-ml.asciidoc[leveloffset=+1]
 
@@ -180,3 +221,9 @@ For more advanced settings and scenarios, see <<anomaly-examples>>.
 
 Refer to <<anomaly-detection-scale>> to learn more about the particularities of 
 large {anomaly-jobs}.
+
+[discrete]
+[[further-reading]]
+== Further reading
+
+https://www.elastic.co/blog/interpretability-in-ml-identifying-anomalies-influencers-root-causes[Interpretability in ML: Identifying anomalies, influencers, and root causes]
diff --git a/docs/en/stack/ml/anomaly-detection/ml-ad-overview.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-ad-overview.asciidoc
@@ -12,7 +12,7 @@ pulled from {es} for analysis and anomaly results are displayed in {kib}
 dashboards. Consult <<setup>> to learn more about the licence and the security 
 privileges that are required to use {anomaly-detect}.
 
-* <<ml-concepts>>
+* <<ml-ad-finding-anomalies>>
 * <<ml-ad-concepts>>
 * <<ml-configuration>>
 * <<ml-api-quickref>>

diff --git a/docs/en/stack/ml/anomaly-detection/ml-buckets.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-buckets.asciidoc
@@ -1,7 +1,4 @@
-[role="xpack"]
-[[ml-buckets]]
-= Buckets
-
+tag::buckets[]
 The {ml-features} use the concept of a _bucket_ to divide the time series into
 batches for processing.
 
@@ -17,11 +14,9 @@ The bucket span has two purposes: it dictates over what time span to look for an
 
 The bucket span has a significant impact on the analysis. When you’re trying to determine what value to use, take into account the granularity at which you want to perform the analysis, the frequency of the input data, the duration of typical anomalies, and the frequency at which alerting is required.
 ////
+end::buckets[]
 
-[discrete]
-[[ml-bucket-results]]
-== Bucket results
-
+tag::bucket-results[]
 When you view your {ml} results, each bucket has an anomaly score. This score is
 a statistically aggregated and normalized view of the combined anomalousness of
 all the record results in the bucket.
@@ -52,4 +47,5 @@ Bucket results provide the top level, overall view of the {anomaly-job} and are
 ideal for alerts. For example, the bucket results might indicate that at 16:05
 the system was unusual. This information is a summary of all the anomalies,
 pinpointing when they occurred. When you identify an anomalous bucket, you can
-investigate further by examining the pertinent records.
+investigate further by examining the pertinent records.
+end::bucket-results[]
diff --git a/docs/en/stack/ml/anomaly-detection/ml-calendars.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-calendars.asciidoc
@@ -1,7 +1,3 @@
-[role="xpack"]
-[[ml-calendars]]
-= Calendars and scheduled events
-
 Sometimes there are periods when you expect unusual activity to take place,
 such as bank holidays, "Black Friday", or planned system outages. If you
 identify these events in advance, no anomalies are generated during that period.

diff --git a/docs/en/stack/ml/anomaly-detection/ml-concepts.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-concepts.asciidoc
diff --git a/docs/en/stack/ml/anomaly-detection/ml-configuration.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-configuration.asciidoc
diff --git a/docs/en/stack/ml/anomaly-detection/ml-datafeeds.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-datafeeds.asciidoc
@@ -1,7 +1,3 @@
-[role="xpack"]
-[[ml-datafeeds]]
-= {dfeeds-cap}
-
 {anomaly-jobs-cap} can analyze data that is stored in {es} or data that is
 sent from some other source via an API. _{dfeeds-cap}_ retrieve data from {es}
 for analysis, which is the simpler and more common scenario.

diff --git a/docs/en/stack/ml/anomaly-detection/ml-influencers.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-influencers.asciidoc
@@ -1,7 +1,4 @@
-[role="xpack"]
-[[ml-influencers]]
-= Influencers
-
+tag::influencers[]
 When anomalous events occur, we want to know why. To determine the cause,
 however, you often need a broader knowledge of the domain. If you have
 suspicions about which entities in your dataset are likely causing
@@ -38,10 +35,10 @@ TIP: As a best practice, do not pick too many influencers. For example, you
 generally do not need more than three. If you pick many influencers, the results
 can be overwhelming and there is a small overhead to the analysis.
 
-[discrete]
-[[ml-influencer-results]]
-== Influencer results
 
+end::influencers[]
+
+tag::influencer-results[]
 The influencer results show which entities were anomalous and when. One
 influencer result is written per bucket for each influencer that affects the
 anomalousness of the bucket. The {ml} analytics determine the impact of an 
@@ -83,7 +80,4 @@ bucket-level anomaly scores. If you view swim lanes by influencer, it uses the
 influencer-level anomaly scores, as does the list of top influencers. The list
 of anomalies uses the record-level anomaly scores.
 
-[[further-reading]]
-== Further reading
-
-https://www.elastic.co/blog/interpretability-in-ml-identifying-anomalies-influencers-root-causes[Interpretability in ML: Identifying anomalies, influencers, and root causes]
+end::influencer-results[]
diff --git a/docs/en/stack/ml/anomaly-detection/ml-jobs.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-jobs.asciidoc
@@ -1,10 +1,3 @@
-[role="xpack"]
-[[ml-jobs]]
-= {anomaly-jobs-cap}
-++++
-<titleabbrev>Jobs</titleabbrev>
-++++
-
 {anomaly-jobs-cap} contain the configuration information and metadata
 necessary to perform an analytics task.
 

diff --git a/docs/en/stack/ml/anomaly-detection/ml-model-snapshots.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-model-snapshots.asciidoc
@@ -1,10 +1,6 @@
-[role="xpack"]
-[[ml-model-snapshots]]
-= Model snapshots
-
-As described in <<ml-analyzing>>, {stack} {ml-features} can calculate baselines
-of normal behavior then extrapolate anomalous events. These baselines are
-accomplished by generating models of your data. 
+{stack} {ml-features} calculate baselines of normal behavior then extrapolate
+anomalous events. These baselines are accomplished by generating models of your
+data. 
 
 To ensure resilience in the event of a system failure, snapshots of the {ml}
 model for each {anomaly-job} are saved to an internal index within the {es}

diff --git a/docs/en/stack/ml/anomaly-detection/ml-rules.asciidoc b/docs/en/stack/ml/anomaly-detection/ml-rules.asciidoc
@@ -1,12 +1,8 @@
-[role="xpack"]
-[[ml-rules]]
-= Custom rules
-
-By default, as described in <<ml-analyzing>>, anomaly detection is unsupervised 
-and the {ml} models have no awareness of the domain of your data. As a result, 
-{anomaly-jobs} might identify events that are statistically significant but are 
-uninteresting when you know the larger context. Machine learning custom rules
-enable you to customize anomaly detection. 
+By default, {anomaly-detect} is unsupervised and the {ml} models have no
+awareness of the domain of your data. As a result, {anomaly-jobs} might
+identify events that are statistically significant but are uninteresting when
+you know the larger context. {ml-cap} custom rules enable you to customize
+{anomaly-detect}. 
 
 _Custom rules_ – or _job rules_ as {kib} refers to them – instruct anomaly 
 detectors to change their behavior based on domain-specific knowledge that you 

diff --git a/docs/en/stack/ml/get-started/ml-gs-results.asciidoc b/docs/en/stack/ml/get-started/ml-gs-results.asciidoc
@@ -123,7 +123,7 @@ that influences or contributes to anomalies. There are influencers in both the
 As a best practice, do not pick too many influencers. For example, you generally
 do not need more than three. If you pick many influencers, the results can be
 overwhelming and there is a small overhead to the analysis. For more details,
-see <<ml-influencers>>.
+see <<ml-ad-influencers>>.
 
 ****
 

diff --git a/docs/en/stack/ml/redirects.asciidoc b/docs/en/stack/ml/redirects.asciidoc
@@ -63,3 +63,53 @@ This content has moved. See <<ml-ad-restart-failed-jobs>>.
 === Configure {anomaly-detect}
 
 This content has moved. See <<ml-ad-setup>>.
+
+[role="exclude",id="ml-jobs"]
+=== {anomaly-jobs-cap}
+
+This content has moved. See <<ml-ad-create-job>>.
+
+[role="exclude",id="ml-datafeeds"]
+=== {dfeeds-cap}
+
+This content has moved. See <<ml-ad-create-job>>.
+
+[role="exclude",id="ml-buckets"]
+=== Buckets
+
+This content has moved. See <<ml-ad-create-job>>.
+
+[role="exclude",id="ml-bucket-results"]
+=== Bucket results
+
+This content has moved. See <<ml-ad-bucket-results>>.
+
+[role="exclude",id="ml-influencers"]
+=== Influencers
+
+This content has moved. See <<ml-ad-influencers>>.
+
+[role="exclude",id="ml-influencer-results"]
+=== Influencer results
+
+This content has moved. See <<ml-ad-influencer-results>>.
+
+[role="exclude",id="ml-calendars"]
+=== Calendars and scheduled events
+
+This content has moved. See <<ml-ad-calendars>>.
+
+[role="exclude",id="ml-rules"]
+=== Custom rules
+
+This content has moved. See <<ml-ad-rules>>.
+
+[role="exclude",id="ml-model-snapshots"]
+=== Model snapshots
+
+This content has moved. See <<ml-ad-model-snapshots>>.
+
+[role="exclude",id="ml-concepts"]
+=== Concepts
+
+This content has moved. See <<ml-ad-finding-anomalies>>.
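
Note on the tagged includes used in this change: `ml-buckets.asciidoc` and `ml-influencers.asciidoc` now wrap their content in `tag::name[]` / `end::name[]` markers, and the including pages pull in one region at a time with `include::file.asciidoc[tag=name]`. The Python sketch below only illustrates how such a region could be selected from a file's text. It is not the Asciidoctor implementation, and the helper name `extract_tagged_region` and the regular expression are assumptions made for this example.

```python
import re

# Assumed pattern for a tag directive line, e.g. "tag::buckets[]" or
# "end::buckets[]", optionally preceded by comment characters such as "// ".
_TAG_LINE = re.compile(r"^\W*(tag|end)::([\w-]+)\[\]\s*$")


def extract_tagged_region(text: str, tag: str) -> str:
    """Return the lines between tag::<tag>[] and end::<tag>[] (hypothetical helper)."""
    selected = []
    inside = False
    for line in text.splitlines():
        match = _TAG_LINE.match(line)
        if match:
            kind, name = match.groups()
            if name == tag:
                # Enter the region on "tag::", leave it on "end::".
                inside = kind == "tag"
            # Directive lines themselves are never copied to the output.
            continue
        if inside:
            selected.append(line)
    return "\n".join(selected)


sample = (
    "tag::buckets[]\n"
    "Bucket intro.\n"
    "end::buckets[]\n"
    "\n"
    "tag::bucket-results[]\n"
    "Scores.\n"
    "end::bucket-results[]\n"
)

buckets_only = extract_tagged_region(sample, "buckets")
results_only = extract_tagged_region(sample, "bucket-results")
```

With this layout, `job-tips.asciidoc` can reuse only the `buckets` region while `ml-ad-finding-anomalies.asciidoc` reuses only `bucket-results`, which is why the standalone titles and anchors were removed from the included files.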