
[ML] Enhancements to model memory estimation #60386

Closed
7 tasks done
darnautov opened this issue Mar 17, 2020 · 2 comments
Assignees
Labels
Meta needs-team Issues missing a team label v7.7.0

Comments

@darnautov
Contributor

darnautov commented Mar 17, 2020

Meta

  • Update /api/ml/validate/calculate_model_memory_limit Kibana endpoint to call the new Elasticsearch endpoint, removing the calculation in ml/server/models/calculate_model_memory_limit/calculate_model_memory_limit.js. ([ML] Use a new ML endpoint to estimate a model memory #60376)
  • Modify the calculateModelMemoryLimit function so that it is supplied with the analysis_config object ([ML] Use a new ML endpoint to estimate a model memory #60376)
  • The calculateModelMemoryLimit function runs queries to obtain the overall cardinality of the partitioning fields, and the max bucket cardinality of any influencer fields, to pass on to the new Elasticsearch endpoint. ([ML] Use a new ML endpoint to estimate a model memory #60376)
  • Check / fix that the multi-metric wizard calls the model memory limit endpoint when influencers are added or removed ([ML] Use a new ML endpoint to estimate a model memory #60376)
  • For data recognizer modules, the /api/ml/modules/setup/{moduleId} endpoint will take an additional parameter to indicate whether an estimate of the model memory limit should be made by checking the cardinality of fields in the job configurations. When called from the ML data recognizer wizard, this will be true, making use of the start / end times specified for the data feed. If the setup endpoint is not supplied with start/end times, the calculateModelMemoryLimit endpoint will attempt to check over the most recent 3 months of data. Solutions calling the setup endpoint would be expected to pass true in most cases. If they wish to supply their own estimates, this can be done in the jobOverrides parameter. If there is no data in the datafeed index(es), or if false is passed to the setup endpoint, the existing model_memory_limit values supplied in the module job JSON configuration files will be used. ([ML] Module setup with dynamic model memory estimation #60656)
  • The code should be smart when needing to obtain model memory limits for a module which contains multiple jobs if the jobs share common partitioning and/or influencer fields, to minimize the number of requests to obtain cardinality. ([ML] Module setup with dynamic model memory estimation #60656)
  • All the wizards, i.e. single metric, population, advanced, and categorization, should use model memory estimation ([ML] Wizards with dynamic model memory estimation #60888)
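The first three bullets above amount to assembling a request body for the Elasticsearch `_ml/anomaly_detectors/_estimate_model_memory` API (available since 7.7) from an `analysis_config` plus the cardinalities Kibana gathers. A minimal sketch of such a helper; the interface shapes and function name are assumptions for illustration, not the actual Kibana code:

```typescript
// Sketch only: types and helper name are illustrative assumptions, not the
// real implementation from calculate_model_memory_limit.js.
interface AnalysisConfig {
  bucket_span: string;
  detectors: Array<{
    function: string;
    field_name?: string;
    by_field_name?: string;
    partition_field_name?: string;
  }>;
  influencers?: string[];
}

// Body accepted by POST _ml/anomaly_detectors/_estimate_model_memory.
interface EstimateModelMemoryBody {
  analysis_config: AnalysisConfig;
  overall_cardinality?: Record<string, number>;
  max_bucket_cardinality?: Record<string, number>;
}

function buildEstimateBody(
  analysisConfig: AnalysisConfig,
  overallCardinality: Record<string, number>,
  maxBucketCardinality: Record<string, number>
): EstimateModelMemoryBody {
  const body: EstimateModelMemoryBody = { analysis_config: analysisConfig };
  // Both cardinality maps are optional on the ES side, so only attach
  // them when they actually carry data.
  if (Object.keys(overallCardinality).length > 0) {
    body.overall_cardinality = overallCardinality;
  }
  if (Object.keys(maxBucketCardinality).length > 0) {
    body.max_bucket_cardinality = maxBucketCardinality;
  }
  return body;
}
```

The point of centralising this in one helper is that every wizard (single metric, multi metric, population, advanced, categorization) can share the same request construction and only differ in how it collects cardinalities.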
@darnautov darnautov added enhancement New value added to drive a business result :ml Feature:Anomaly Detection ML anomaly detection v7.7.0 labels Mar 17, 2020
@darnautov darnautov self-assigned this Mar 17, 2020
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@ghost

ghost commented Mar 18, 2020

Meta

  • QA tests are added to call the Kibana /api/ml/validate/calculate_model_memory_limit endpoint to verify the estimates returned for a range of job configurations, using a variety of large data sets.
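The estimates being verified depend on two kinds of cardinality queries: a plain `cardinality` aggregation for the overall cardinality of partitioning fields, and a `date_histogram` with a per-bucket `cardinality` sub-aggregation plus a `max_bucket` pipeline aggregation for the maximum bucket cardinality of influencers. A hedged sketch of the aggregation bodies involved (function names are illustrative; this is not the actual Kibana query code):

```typescript
// Illustrative only: builds Elasticsearch aggregation bodies of the kind
// needed for model memory estimation, not the actual Kibana implementation.

// Overall cardinality of each partitioning field, via one cardinality
// aggregation per field.
function overallCardinalityAggs(fields: string[]): Record<string, object> {
  const aggs: Record<string, object> = {};
  for (const field of fields) {
    aggs[`${field}_cardinality`] = { cardinality: { field } };
  }
  return aggs;
}

// Maximum per-bucket cardinality of an influencer field: bucket the data
// by time at the job's bucket span, count distinct values in each bucket,
// then take the maximum with a max_bucket pipeline aggregation.
function maxBucketCardinalityAggs(
  field: string,
  bucketSpan: string,
  timeField: string
): Record<string, object> {
  return {
    over_time: {
      date_histogram: { field: timeField, fixed_interval: bucketSpan },
      aggs: { card: { cardinality: { field } } },
    },
    max_card: {
      max_bucket: { buckets_path: "over_time>card" },
    },
  };
}
```

Batching these aggregations into as few search requests as possible is what the "smart" handling of modules with multiple jobs sharing partitioning or influencer fields refers to.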

@sophiec20 sophiec20 removed enhancement New value added to drive a business result :ml Feature:Anomaly Detection ML anomaly detection labels Nov 8, 2022
@botelastic botelastic bot added the needs-team Issues missing a team label label Nov 8, 2022