[ML] DFA explain API throws error when called with data stream containing counter fields #94691

Open
jgowdyelastic opened this issue Mar 23, 2023 · 1 comment
Labels
>bug :ml Machine learning Team:ML Meta label for the ML team

Comments

@jgowdyelastic
Member

jgowdyelastic commented Mar 23, 2023

The DFA UI calls the endpoint _ml/data_frame/analytics/_explain for all analysis types. For outlier detection, no fields are supplied and the payload looks something like this:

{
  "description": "",
  "source": {
    "index": "metrics-weather_sensors-dev",
    "query": {
      "match_all": {}
    }
  },
  "analysis": {
    "outlier_detection": {
      "compute_feature_influence": "true",
      "standardization_enabled": "true"
    }
  },
  "max_num_threads": 1
}

If the index is a data stream which contains a counter field, an error is returned:

[memory_usage_estimation_137565] Unable to estimate memory usage as no documents in the source indices [metrics-weather_sensors-dev] contained all the fields selected for analysis. If you are relying on automatic field selection then there are currently mapped fields that do not exist in any indexed documents, and you will have to switch to explicit field selection and include only fields that exist in indexed documents.]: [memory_usage_estimation_137565] Unable to estimate memory usage as no documents in the source indices [metrics-weather_sensors-dev] contained all the fields selected for analysis. If you are relying on automatic field selection then there are currently mapped fields that do not exist in any indexed documents, and you will have to switch to explicit field selection and include only fields that exist in indexed documents.


Update: the original description above is wrong. The data I was using when I first noticed this had a counter field in the mappings, but the field was not populated in any documents.
When the counter field is populated, the error above does not occur. I don't know whether that means this is still a problem. I would assume that unpopulated counter fields should not cause the endpoint to throw an error?
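The error message suggests switching to explicit field selection. As a minimal sketch of that idea (not the actual UI or Elasticsearch behaviour), the helper below builds an explain payload that excludes counter fields via `analyzed_fields.excludes`. It assumes the caller already has the `fields` object from a `_field_caps` response and that counters are marked with `time_series_metric: "counter"`. The function name `build_explain_payload` and the sample field names are hypothetical.

```python
def build_explain_payload(index, field_caps_fields):
    """Build an _explain request body that excludes counter fields.

    Assumption: field_caps_fields is the "fields" object of a _field_caps
    response, shaped as {field_name: {mapping_type: {attribute: value}}}.
    """
    # Collect every field that any mapping type reports as a counter metric.
    counters = sorted(
        name
        for name, by_type in field_caps_fields.items()
        if any(caps.get("time_series_metric") == "counter" for caps in by_type.values())
    )
    # The analysis settings are copied from the payload shown in this issue.
    payload = {
        "source": {"index": index, "query": {"match_all": {}}},
        "analysis": {
            "outlier_detection": {
                "compute_feature_influence": "true",
                "standardization_enabled": "true",
            }
        },
        "max_num_threads": 1,
    }
    # Only add explicit field selection when there is something to exclude.
    if counters:
        payload["analyzed_fields"] = {"excludes": counters}
    return payload


# Hypothetical field caps for illustration only.
sample_caps = {
    "temperature": {"double": {"type": "double"}},
    "requests_total": {"long": {"type": "long", "time_series_metric": "counter"}},
}
payload = build_explain_payload("metrics-weather_sensors-dev", sample_caps)
```

Whether excluding an unpopulated counter field is enough to avoid the memory estimation error has not been verified here; it is only what the error text hints at.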

In addition, the explain endpoint currently returns is_included: true for counter fields. Is this correct?
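To make the `is_included` question easier to check, here is a small sketch that scans the `field_selection` array of an explain response for counter fields that are marked as included. The helper name `included_counter_fields` and the sample response are hypothetical; only the `field_selection`, `name`, and `is_included` keys are taken from the explain response format.

```python
def included_counter_fields(explain_response, counter_fields):
    """Return the names of counter fields that the explain response includes."""
    counters = set(counter_fields)
    return [
        entry["name"]
        for entry in explain_response.get("field_selection", [])
        if entry.get("is_included") and entry["name"] in counters
    ]


# Hypothetical response fragment for illustration only.
sample_response = {
    "field_selection": [
        {"name": "temperature", "is_included": True},
        {"name": "requests_total", "is_included": True},
        {"name": "station_id", "is_included": False},
    ]
}
flagged = included_counter_fields(sample_response, ["requests_total"])
```

A non-empty result would indicate that the endpoint selected a counter field for analysis.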

@jgowdyelastic jgowdyelastic added the :ml Machine learning label Mar 23, 2023
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Mar 23, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

jgowdyelastic added a commit to elastic/kibana that referenced this issue Apr 3, 2023
Further improvements to the anomaly detection and data frame analytics
wizards to ensure counter fields cannot be selected when configuring a
new job.
Adds a `counter` flag to the `new_job_caps` response which is used to
remove counter fields from dropdowns in the DFA and advanced AD wizards.

The outlier detection wizard still has an issue caused by the call to
the `_explain` endpoint. This needs to be fixed in elasticsearch
elastic/elasticsearch#94691
@sophiec20 sophiec20 added the >bug label Apr 15, 2024