[Lens] Optimize redundant formula function aggregations #135265

Open
Tracked by #153629
drewdaemon opened this issue Jun 27, 2022 · 4 comments
Labels
Feature:Lens performance Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@drewdaemon
Contributor

Describe the feature:
When a Lens formula calls the same function more than once, each call is currently translated into its own identical aggregation in the request to Elasticsearch. Elasticsearch does not deduplicate these, so it performs the exact same work once per aggregation.

#131875 added the flexibility to use a single Elasticsearch aggregation to power multiple Lens dimensions. It also introduced an expression optimization hook on the Operation class. We should be able to use this groundwork to merge all redundant formula function calls into a single aggregation request to Elasticsearch. This will improve performance and lessen cluster load.
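To illustrate the idea, here is a minimal sketch of deduplicating identical aggregations by a grouping key. It does not reproduce the actual Lens code: the `AggSpec` and `MergedAgg` shapes and the `groupKey` and `dedupeAggs` helpers are hypothetical names invented for this example, and the real optimization hook on the Operation class may look quite different.

```typescript
// Hypothetical, simplified description of one aggregation requested by a formula.
interface AggSpec {
  columnId: string; // the Lens column that needs this value
  type: string; // e.g. "sum", "average"
  field: string;
  filter?: string; // optional filter applied to the function call
}

// One aggregation that can serve several columns (hypothetical shape).
interface MergedAgg {
  type: string;
  field: string;
  filter: string | null;
  columnIds: string[];
}

// Build a key from every parameter that affects the aggregation's result.
function groupKey(agg: AggSpec): string {
  return JSON.stringify([agg.type, agg.field, agg.filter ?? null]);
}

// Collapse aggregations that share a key into a single entry,
// remembering every column that should read from it.
function dedupeAggs(aggs: AggSpec[]): MergedAgg[] {
  const byKey = new Map<string, MergedAgg>();
  for (const agg of aggs) {
    const key = groupKey(agg);
    const existing = byKey.get(key);
    if (existing) {
      existing.columnIds.push(agg.columnId);
    } else {
      byKey.set(key, {
        type: agg.type,
        field: agg.field,
        filter: agg.filter ?? null,
        columnIds: [agg.columnId],
      });
    }
  }
  return Array.from(byKey.values());
}
```

With this approach, a formula that calls the same function on the same field twice would produce one aggregation whose result feeds both columns, rather than two identical aggregations.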

Describe a specific use case for the feature:

Logistic function
[Screenshot: Screen Shot 2022-06-27 at 4 27 00 PM]

@drewdaemon drewdaemon added Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens labels Jun 27, 2022
@elasticmachine
Contributor

Pinging @elastic/kibana-vis-editors @elastic/kibana-vis-editors-external (Team:VisEditors)

@drewdaemon
Contributor Author

@flash1293 just to verify—this is the comprehensive list of operations that need to be optimized?

[Screenshot: Screen Shot 2022-09-14 at 2 40 08 PM]

@flash1293
Contributor

Yes, that’s right. Technically an unfiltered count is “for free”, but I don’t see how this would simplify things. Also, we already have some special optimization for percentiles we should keep in mind.

@stratoula stratoula added the impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. label Jan 10, 2023
@drewdaemon
Contributor Author

I believe the only formula function left to optimize is percentile_rank. Optimizing it should follow the same pattern currently used for percentile:

  • implement getGroupByKey to deduplicate identical function calls
  • use the optimizeEsAggs method to merge percentile_rank calls with different values into a single aggregation, as long as all other parameters match
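The two steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the real implementation: the thread does not show the signatures of getGroupByKey or optimizeEsAggs, so the `PercentileRankCall` and `MergedPercentileRanks` types and the `keyIgnoringValue` and `mergePercentileRanks` helpers are hypothetical. The one external fact relied on is that the Elasticsearch `percentile_ranks` aggregation accepts an array of `values`.

```typescript
// Hypothetical description of one percentile_rank call in a formula.
interface PercentileRankCall {
  columnId: string;
  field: string;
  value: number; // the value whose percentile rank is requested
  filter?: string;
}

// Hypothetical merged result: one percentile_ranks aggregation with many values.
interface MergedPercentileRanks {
  field: string;
  filter: string | null;
  values: number[]; // maps onto the `values` array of percentile_ranks
  columnToValue: Record<string, number>; // which value each column should read
}

// Key built from every parameter except the rank value itself, so calls that
// differ only by value land in the same group.
function keyIgnoringValue(call: PercentileRankCall): string {
  return JSON.stringify([call.field, call.filter ?? null]);
}

function mergePercentileRanks(calls: PercentileRankCall[]): MergedPercentileRanks[] {
  const groups = new Map<string, MergedPercentileRanks>();
  for (const call of calls) {
    const key = keyIgnoringValue(call);
    let group = groups.get(key);
    if (!group) {
      group = { field: call.field, filter: call.filter ?? null, values: [], columnToValue: {} };
      groups.set(key, group);
    }
    // Identical calls (same value) are deduplicated; new values are appended.
    if (!group.values.includes(call.value)) {
      group.values.push(call.value);
    }
    group.columnToValue[call.columnId] = call.value;
  }
  return Array.from(groups.values());
}
```

Keeping a per-column record of the requested value is one way to let each Lens column pick its own result out of the shared aggregation response after merging.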

@markov00 markov00 removed the impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. label Mar 4, 2025