[ML][DOCS] Add documentation for detector rules and filters #32013

dimitris-athanasiou
Contributor

No description provided.

@elasticmachine
Collaborator

Pinging @elastic/es-docs

@elasticmachine
Collaborator

Pinging @elastic/ml-core

A filter resource has the following properties:

`filter_id`::
(string) A string that uniquely identifies the calendar.
Contributor

I think this should be "filter" instead of "calendar"

==== Path Parameters

`filter_id` (required)::
(string) Identifier for the calendar.
Contributor

Ditto, I think this should be "filter" instead of "calendar".

===== Description

A filter contains a list of strings and can
be referenced by ML anomaly detectors' `custom_rules`.
Contributor

I recommend replacing "ML" with "{ml}" so it's spelled out.

<titleabbrev>Update Filter</titleabbrev>
++++

Posts scheduled events in a calendar.
Contributor

I think this description should be something like "Updates properties of a filter".

partition field value is in a filter sounds equivalent to having a query
that filters out such documents. But it is not. There is a fundamental
difference. When the data is filtered before reaching a job it is as if they
do never existed for the job. With rules, the data still reaches the job and
Contributor

Typo "do never"

that filters out such documents. But it is not. There is a fundamental
difference. When the data is filtered before reaching a job it is as if they
do never existed for the job. With rules, the data still reaches the job and
depending the rule actions they are affecting the job.
Contributor

Unclear. Maybe change to "... data still reaches the job and can affect the results (depending on the rule actions)".

`skip_result`::: the result will not be created. Note this also means this result
will not impact the scoring of results. The model be normally updated with the
corresponding series value. This is the default action and it serves to skip
results that undesired.
Member

Note this also means this result will not impact the scoring of results. - what does this mean?
The model be normally -> The model *will* be normally or updated as normal
results that undesired -> results that are undesired

corresponding series value. This is the default action and it serves to skip
results that undesired.
`skip_model_update`::: the value for that series will not be used to update the model.
If there is an anomalous record for that series it will be normally created. This
Member

If there is an anomalous record for that series it will be normally created. - I think you can delete this line and the paragraph makes more sense

Can you add a note about using `skip_model_update` and `skip_result` together:
'`skip_result` and `skip_model_update` may be used together, which prevents the model from updating and generates no anomaly results. If `skip_model_update` is used without `skip_result`, the anomalies are created but the model does not learn the changing behaviour.'
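
The combinations requested in this comment can be summarised in a small sketch. It only restates the behaviour as described in this thread and is not how the ML code implements it; the function name `apply_actions` and the keys of the returned dictionary are hypothetical.

```python
def apply_actions(actions):
    """Describe the effect of a matching rule for a set of actions.

    Hypothetical helper: it restates the semantics discussed in this
    thread and does not mirror the real implementation.
    """
    actions = set(actions)
    unknown = actions - {"skip_result", "skip_model_update"}
    if unknown:
        raise ValueError(f"unknown rule actions: {sorted(unknown)}")
    return {
        # skip_result: no result is created for the matching value.
        "result_created": "skip_result" not in actions,
        # skip_model_update: the value is not used to update the model.
        "model_updated": "skip_model_update" not in actions,
    }


# Both actions together: no result, and the model does not learn.
both = apply_actions(["skip_result", "skip_model_update"])
```

With only `skip_model_update`, the sketch reports that results are still created while the model is left unchanged, which matches the second sentence of the proposed note.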

To add a scope for a field add the field name as a key in the scope object and
set its value to an object with properties:
`filter_id`::
(string) The id of the filter to be used
Member

Can you link to the filter resource page so the reader knows what a filter is?
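
For readers of this thread, the scope structure quoted above might look like the following when built programmatically. This is a minimal sketch under stated assumptions: the helper `build_scoped_rule`, the field name `highest_registered_domain`, and the filter ID `safe_domains` are made up for illustration, and the `actions` key is assumed from the rule actions discussed elsewhere in this PR.

```python
import json


def build_scoped_rule(field_name, filter_id, actions=("skip_result",)):
    """Build a custom rule whose scope references a filter (illustrative only)."""
    return {
        "actions": list(actions),
        # The field name is the key in the scope object; its value is an
        # object holding the scope properties, here just `filter_id`.
        "scope": {field_name: {"filter_id": filter_id}},
    }


# Hypothetical field and filter names, chosen only for this example.
rule = build_scoped_rule("highest_registered_domain", "safe_domains")
print(json.dumps(rule, indent=2))
```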

--------------------------------------------------
// CONSOLE

When the filter is created, you receive the following results:
Member

We don't need the word "results" here; it's confusing because we often refer to anomaly results.

==== Request Body

`description`::
(string) A description for the filter. See <<ml-event-resource>>.
Member

ml-event-resource -> ml-filter-resource

This is useful when we want the rule to apply to a range. We simply create
a rule with two conditions, one for each end of the desired range.

Here is an example where a cound detector will skip results when the count
Member

cound -> count
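
The range idea in the quoted passage (one condition for each end of the range) can be sketched as follows. This assumes that a rule applies only when all of its conditions hold; the operator names, the thresholds 30 and 50, and the helper `rule_applies` are assumptions for illustration and are not taken from the diff.

```python
import operator

# Assumed operator names; the thread itself does not list them.
OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def rule_applies(conditions, actual):
    """Return True when every condition holds for the actual value (AND)."""
    return all(
        OPERATORS[cond["operator"]](actual, cond["value"])
        for cond in conditions
    )


# One condition per end of the range: 30 < actual < 50 (hypothetical bounds).
range_conditions = [
    {"operator": "gt", "value": 30},
    {"operator": "lt", "value": 50},
]
```

Under these assumptions, a count of 40 falls inside the range and the rule applies, while counts of 30 or 55 fall outside it.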


==== Rules in the life-cycle of a job

Rules only apply for results created after the rules were created.
Member

"created" appears twice in close succession. Suggested wording:

Rules only affect results created after the rules were applied.

@droberts195
Contributor

In the code we have:

    /**
     * Functions that do not support rule conditions:
     * <ul>
     * <li>lat_long - because it is a multivariate feature
     * <li>metric - because having the same conditions on min,max,mean is
     * error-prone
     * <li>rare - because the actual/typical value is not something a user can anticipate
     * <li>freq_rare - because the actual/typical value is not something a user can anticipate
     * </ul>
     */
    static final EnumSet<DetectorFunction> FUNCTIONS_WITHOUT_RULE_CONDITION_SUPPORT = EnumSet.of(
            DetectorFunction.LAT_LONG, DetectorFunction.METRIC, DetectorFunction.RARE, DetectorFunction.FREQ_RARE);

I couldn't see this in the docs in this PR. Does it need to be added or is it somewhere else?
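
To make the restriction concrete for documentation purposes, the check implied by the quoted constant could be expressed like this. It is a rough Python paraphrase, not the Java validation code; the helper `supports_rule_conditions` is hypothetical, and the lowercase function names are assumed to correspond to the enum values above.

```python
# Mirrors the four functions listed in the quoted Java constant.
FUNCTIONS_WITHOUT_RULE_CONDITION_SUPPORT = frozenset(
    {"lat_long", "metric", "rare", "freq_rare"}
)


def supports_rule_conditions(function_name):
    """Return True if a detector function may have rules with conditions."""
    return function_name.lower() not in FUNCTIONS_WITHOUT_RULE_CONDITION_SUPPORT
```

Under this sketch `metric` is rejected, while a function such as `mean` that is absent from the list is accepted.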

@@ -15,6 +15,9 @@ The {xpackml} features include the following metric functions:
* <<ml-metric-metric,`metric`>>
* xref:ml-metric-varp[`varp`, `high_varp`, `low_varp`]

NOTE: You cannot add rules with conditions to detectors that use metric
functions.
Contributor

I think it should be:

NOTE: You cannot add rules with conditions to detectors that use the metric
function.

min, max and mean are metric functions, and conditions should be fine with them. The reason conditions aren't allowed with metric is that metric looks at all three of min, max and mean, and it would be very hard to reason about which of the three statistics the condition had been applied to.


Rules and filters enable you to change the behavior of anomaly detectors based
on domain-specific knowledge.
//TO-DO: Add link to rules overview page
Contributor Author

Can we do this already? If not, what do we refer to by rules overview page?

@dimitris-athanasiou dimitris-athanasiou force-pushed the docs-for-ml-rules-and-filters branch 2 times, most recently from e554890 to 4c11131 Compare July 24, 2018 18:03
Contributor

@lcawl lcawl left a comment

LGTM

@lcawl
Contributor

lcawl commented Jul 24, 2018

retest this please

@dimitris-athanasiou dimitris-athanasiou force-pushed the docs-for-ml-rules-and-filters branch from 4c11131 to fc47818 Compare July 25, 2018 09:59
@dimitris-athanasiou
Contributor Author

retest this please

@dimitris-athanasiou dimitris-athanasiou merged commit 9a7a649 into elastic:master Jul 25, 2018
@dimitris-athanasiou dimitris-athanasiou deleted the docs-for-ml-rules-and-filters branch July 25, 2018 15:10
dnhatn added a commit that referenced this pull request Jul 26, 2018
* master:
  [DOCS] Fix formatting error in Slack action
  Painless: Fix documentation links to use existing refs (#32335)
  Painless: Decouple PainlessLookupBuilder and Whitelists (#32346)
  [DOCS] Adds recommendation for xpack.security.enabled (#32345)
  [TEST] Mute ConvertProcessortTests.testConvertIntHexError
  [TEST] Fix failure due to exception message in java11 (#32321)
  [DOCS] Fixes typo in ML aggregations page
  [DOCS] Adds link from bucket_span property to common time units
  [ML][DOCS] Add documentation for detector rules and filters (#32013)
  Add opaque_id to index audit logging (#32260)
  Add 6.5.0 version to master
  fixes broken build for third-party-tests (#32353)
dnhatn added a commit that referenced this pull request Jul 27, 2018
* 6.x:
  Only enforce password hashing check if FIPS enabled (#32383)
  Introduce fips_mode setting and associated checks (#32326)
  [DOCS] Fix formatting error in Slack action
  Ingest: Support integer and long hex values in convert (#32213)
  Release pipelined request in netty server tests (#32368)
  Add opaque_id to index audit logging (#32260)
  Painless: Fix documentation links to use existing refs (#32335)
  Painless: Decouple PainlessLookupBuilder and Whitelists (#32346)
  [DOCS] Adds recommendation for xpack.security.enabled (#32345)
  [test] package pre-install java check (#32259)
  [DOCS] Adds link from bucket_span property to common time units
  [DOCS] Fixes typo in ML aggregations page
  [ML][DOCS] Add documentation for detector rules and filters (#32013)
  Bump the 6.x branch to 6.5.0 (#32361)
  fixes broken build repository-s3 for third-party-tests
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
6 participants