[ML] Improve support for script and aggregation fields in anomaly detection jobs #81923

Merged
qn895 merged 30 commits into elastic:master from ml-fix-script-n-aggregation-fields
Nov 17, 2020

Conversation

qn895
Member

@qn895 qn895 commented Oct 28, 2020

Summary

This PR improves support for anomaly charts in the Anomaly Explorer and Single Metric Viewer for jobs which use scripted or aggregation fields in their datafeed configuration and where model plot is not enabled.

Before
Screen Shot 2020-10-28 at 12 38 56

After
Screen Shot 2020-10-28 at 11 03 37

Screen Shot 2020-10-28 at 11 03 15

  • Fix anomaly charts not plotting values for datafeeds that contain scripted fields

Screen Shot 2020-10-28 at 11 04 18

  • Fix datafeed preview so it works when something other than buckets is used for the aggregation

  • Fix anomalies chart advanced wizard not validating correctly for datafeed configs with nested aggregations

  • Also contains a fix for anomalies not showing in Anomaly Explorer charts when no influencer is detected:

Before
image

After
Screen Shot 2020-11-04 at 15 55 59

  • Add validation that flags a missing summary count field when the datafeed config has aggregation fields

Screen Shot 2020-11-05 at 12 05 55

Screen Shot 2020-11-05 at 12 32 32

Note that the following items will be addressed in a follow-up:

  • Fix anomalies chart advanced wizard to work when something other than buckets is used
  • Disable links to the Single Metric Viewer for datafeed configs which use aggregations with nested terms for which the plot of metric data is not supported.
  • Provide better messaging in the Anomaly Explorer for datafeed configs which use aggregations with nested terms for which the plot of metric data is not supported.
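
As a rough illustration of the follow-up items, the sketch below shows one way to detect a datafeed aggregation that contains a `terms` aggregation anywhere in its tree. This is only an interpretation of "aggregations with nested terms". The helper names `isRecord` and `hasNestedTerms` and the sample field names are hypothetical; the PR does not show how Kibana makes this decision.

```typescript
// Hypothetical type guard: true for plain (non-array, non-null) objects.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Hypothetical helper: walks the aggregation tree and reports whether any
// node is a `terms` aggregation. Sub-aggregations may live under either
// `aggregations` or `aggs`.
function hasNestedTerms(aggs: Record<string, unknown> | undefined): boolean {
  if (aggs === undefined) {
    return false;
  }
  for (const agg of Object.values(aggs)) {
    if (!isRecord(agg)) {
      continue;
    }
    if ('terms' in agg) {
      return true;
    }
    const subAggs = agg['aggregations'] ?? agg['aggs'];
    if (isRecord(subAggs) && hasNestedTerms(subAggs)) {
      return true;
    }
  }
  return false;
}

// Sample datafeed aggregations (field names are made up for illustration).
const withTerms = {
  buckets: {
    date_histogram: { field: '@timestamp', fixed_interval: '15m' },
    aggregations: {
      '@timestamp': { max: { field: '@timestamp' } },
      airline: {
        terms: { field: 'airline', size: 10 },
        aggregations: { responsetime: { avg: { field: 'responsetime' } } },
      },
    },
  },
};

const withoutTerms = {
  buckets: {
    date_histogram: { field: '@timestamp', fixed_interval: '15m' },
    aggregations: {
      '@timestamp': { max: { field: '@timestamp' } },
      responsetime: { avg: { field: 'responsetime' } },
    },
  },
};

console.log(hasNestedTerms(withTerms)); // true
console.log(hasNestedTerms(withoutTerms)); // false
```

A check along these lines could gate the Single Metric Viewer links or trigger a callout in the Anomaly Explorer, but where it would be called is left open here.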


@qn895 qn895 marked this pull request as ready for review November 4, 2020 15:04
@qn895 qn895 requested a review from a team as a code owner November 4, 2020 15:04
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

Member

@jgowdyelastic jgowdyelastic left a comment

Added a few minor comments, but generally LGTM.
I'll approve once I've tested it.

@@ -32,7 +32,11 @@ import {
FieldHistogramRequestConfig,
FieldRequestConfig,
} from '../../datavisualizer/index_based/common';
import { DataRecognizerConfigResponse, Module } from '../../../../common/types/modules';
import {
DatafeedOverride,

Member

DatafeedOverride is for overriding the datafeed config when calling the module setup, so I don't think it should be used here.
It looks like the standard Datafeed interface would be better suited throughout this PR.

@@ -415,7 +415,7 @@ export function resultsServiceProvider(mlApiServices) {
           influencerFieldValues: {
             terms: {
               field: 'influencer_field_value',
-              size: maxResults !== undefined ? maxResults : 2,
+              size: !!maxResults ? maxResults : 2,

Member

This change is subtle, but using a falsy check like !! rather than an explicit check for undefined changes its behaviour.
If maxResults is 0, it would now default to 2.
Not that 0 is a sensible number; it's just generally dangerous, IMO, to use falsy checks for variables that are numbers.

Member Author

I actually had to update this to use !! because when maxResults is 0, the size would be 0, which makes the query invalid.

image

Member Author

Updated here 7355559
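
The trade-off discussed in this thread can be sketched in isolation. The function names below are hypothetical; only `maxResults` and the default of 2 come from the diff. The third variant is one possible middle ground and is an assumption, not necessarily what commit 7355559 adopted.

```typescript
// Default size taken from the diff above.
const DEFAULT_SIZE = 2;

// Explicit check: only a missing value falls back to the default,
// so 0 is passed through to the query unchanged.
function sizeWithUndefinedCheck(maxResults?: number): number {
  return maxResults !== undefined ? maxResults : DEFAULT_SIZE;
}

// Falsy check: undefined, 0 and NaN all fall back to the default.
function sizeWithFalsyCheck(maxResults?: number): number {
  return !!maxResults ? maxResults : DEFAULT_SIZE;
}

// Possible middle ground (assumption): spell out the valid range so the
// intent to reject 0 is visible to the reader.
function sizeWithRangeCheck(maxResults?: number): number {
  return maxResults !== undefined && maxResults > 0 ? maxResults : DEFAULT_SIZE;
}

console.log(sizeWithUndefinedCheck(0)); // 0
console.log(sizeWithFalsyCheck(0)); // 2
console.log(sizeWithRangeCheck(0)); // 2
```

All three variants return the caller's value for a positive `maxResults` and the default for `undefined`; they differ only in how they treat 0 and other edge values.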

): Promise<string[]> {
const { body } = await asCurrentUser.fieldCaps({
index,
fields: fieldNames,
});
const aggregatableFields: string[] = [];
const datafeedAggConfig = datafeedConfig?.aggregations ?? datafeedConfig?.aggs;

Member

Nit: datafeedAggregations or even just aggregations might be a better name for this, as datafeedAggConfig is a bit too similar to datafeedConfig. I missed the difference when first reading it.
The same applies in other files in this PR.

Member Author

Updated here f54c8e8
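
The lookup in the snippet above reads the aggregations from whichever of the two keys is set. A minimal sketch of that pattern follows, using the naming suggested in the review. `DatafeedLike` is a stand-in type and `getDatafeedAggregations` is a hypothetical helper; neither is taken from the Kibana codebase.

```typescript
// Minimal stand-in for the datafeed shape (assumption); the real
// Datafeed interface has many more fields.
interface DatafeedLike {
  aggregations?: Record<string, unknown>;
  aggs?: Record<string, unknown>;
}

// Hypothetical helper: returns the aggregations from whichever key is set,
// or undefined when the datafeed has neither key or is missing entirely.
function getDatafeedAggregations(
  datafeed?: DatafeedLike
): Record<string, unknown> | undefined {
  // `?.` guards against a missing datafeed; `??` falls through to `aggs`
  // only when `aggregations` is null or undefined.
  return datafeed?.aggregations ?? datafeed?.aggs;
}

const viaAggs = getDatafeedAggregations({ aggs: { buckets: {} } });
const viaAggregations = getDatafeedAggregations({ aggregations: { buckets: {} } });
const none = getDatafeedAggregations(undefined);

console.log(viaAggs !== undefined); // true
console.log(viaAggregations !== undefined); // true
console.log(none); // undefined
```

Because `??` only skips null and undefined, an empty object under `aggregations` is returned as-is rather than falling through to `aggs`.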


// if the datafeed has aggregations, require the job config to include a valid summary_count_field_name
const datafeedAggConfig = job.datafeed_config?.aggregations ?? job.datafeed_config?.aggs;
if (datafeedAggConfig !== undefined && !job.analysis_config?.summary_count_field_name) {

Member

Of the two conditions here, one checks for undefined and the other uses !.
Is there a reason for the difference, or could they both be undefined checks?

Member

I see that this was two ifs combined based on a previous comment.
Being a fussy nitpicker, I think they should be the same so as not to raise questions further down the line.

Member Author

Updated here f54c8e8
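
For illustration, the validation discussed in this thread might look like the sketch below once both conditions use the same style of check. `JobLike`, `validateSummaryCountField`, and the message id are hypothetical, and the PR page does not show what commit f54c8e8 actually changed.

```typescript
// Minimal stand-in for the combined job and datafeed config (assumption).
interface JobLike {
  analysis_config?: { summary_count_field_name?: string };
  datafeed_config?: {
    aggregations?: Record<string, unknown>;
    aggs?: Record<string, unknown>;
  };
}

// Hypothetical validator: returns a list of message ids, empty when valid.
function validateSummaryCountField(job: JobLike): string[] {
  const aggregations = job.datafeed_config?.aggregations ?? job.datafeed_config?.aggs;
  const summaryCountFieldName = job.analysis_config?.summary_count_field_name;

  const messages: string[] = [];
  // Both conditions use explicit undefined checks, as suggested in the review.
  if (aggregations !== undefined && summaryCountFieldName === undefined) {
    messages.push('missing_summary_count_field_name'); // hypothetical message id
  }
  return messages;
}

const missing = validateSummaryCountField({
  analysis_config: {},
  datafeed_config: { aggregations: { buckets: {} } },
});
const present = validateSummaryCountField({
  analysis_config: { summary_count_field_name: 'doc_count' },
  datafeed_config: { aggs: { buckets: {} } },
});

console.log(missing.length); // 1
console.log(present.length); // 0
```

The two styles are not fully equivalent: an explicit undefined check accepts an empty string as a field name, whereas `!` rejects it, so the choice depends on whether an empty name should count as missing.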

Member

@jgowdyelastic jgowdyelastic left a comment

LGTM

@jgowdyelastic jgowdyelastic self-requested a review November 11, 2020 21:24
@qn895
Member Author

qn895 commented Nov 17, 2020

@elasticmachine merge upstream

@peteharverson peteharverson changed the title from "[ML] Add support for script and aggregation fields to Anomaly Detection" to "[ML] Improve support for script and aggregation fields in anomaly detection jobs" Nov 17, 2020
Contributor

@peteharverson peteharverson left a comment

Tested and LGTM.

In the follow-up we should try and disable links to the Single Metric Viewer for the configs which we don't support and provide e.g. a callout in the Anomaly Explorer to explain why we aren't displaying anomaly charts.

@qn895
Member Author

qn895 commented Nov 17, 2020

@elasticmachine merge upstream

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

id    before    after    diff
ml    1564      1565     +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id    before    after    diff
ml    5.2MB     5.2MB    +5.9KB

Distributable file count

id         before    after    diff
default    42848     42849    +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@qn895 qn895 merged commit 55119c2 into elastic:master Nov 17, 2020
@qn895 qn895 deleted the ml-fix-script-n-aggregation-fields branch November 17, 2020 17:34
qn895 added a commit to qn895/kibana that referenced this pull request Nov 17, 2020
qn895 added a commit that referenced this pull request Nov 17, 2020
…ly detection jobs (#81923) (#83569)

Co-authored-by: Kibana Machine <[email protected]>

gmmorris added a commit to gmmorris/kibana that referenced this pull request Nov 17, 2020
* master: (51 commits)
  [ML] Persisted URL state for the Data frame analytics jobs and models pages (elastic#83439)
  adds xpack.security.authc.selector.enabled setting (elastic#83551)
  skip flaky suite (elastic#77279)
  [ML] Improve support for script and aggregation fields in anomaly detection jobs (elastic#81923)
  [Workplace Search] Migrate SourcesLogic from ent-search (elastic#83544)
  [ML] Add UI test for feature importance features (elastic#82677)
  [Maps] Improve icons for all layer types (elastic#83503)
  Replace experimental badge with Beta (elastic#83468)
  [Fleet][EPM] Unified install and archive (elastic#83384)
  Move src/legacy/server/keystore to src/cli (elastic#83483)
  Used SO for saving the API key IDs that should be deleted (elastic#82211)
  [Uptime] Mock implementation to account for math flakiness test (elastic#83535)
  [Workplace Search] Enable check for org context based on URL (elastic#83487)
  [App Search] Added all Document related routes and logic (elastic#83324)
  [Alerting UI] Fix console error when setting connector params (elastic#83333)
  [Discover] Allow custom name for fields via index pattern field management (elastic#70039)
  [Uptime] Fix monitor list down histogram (elastic#83411)
  remove headers timeout hack, rely on nodejs timeouts (elastic#83419)
  [ML] Update console autocomplete for ML data frame evaluate API (elastic#83151)
  [Lens] Color in dimension trigger (elastic#76871)
  ...