Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Lens] improve percentile agg optimizations #145303

Merged

Conversation

drewdaemon
Copy link
Contributor

@drewdaemon drewdaemon commented Nov 15, 2022

Summary

Part of #135265

Unfiltered single-percentile agg configs were already optimized.

As of this PR, filtered single-percentile aggs that use the same percentile (and other args) are now collapsed into one agg config.

Unfortunately, esaggs doesn't currently support combining filtered single-percentile aggs with different percentiles into a single agg config, so that special optimization is currently only applied on non-filtered single-percentile aggs.

Testing

Filtered percentiles with same value now deduplicated

Add a dimension with the following formula

percentile(bytes, percentile=2, kql='geo.dest : "AL" ') + 
percentile(bytes, percentile=2, kql='geo.dest : "AL" ') + 
percentile(bytes, percentile=2, kql='geo.dest : "BA" ') + 
percentile(bytes, percentile=2, kql='geo.dest : "BA" ') + 
percentile(bytes, percentile=2, kql='geo.dest : "BA" ')

Check the request. Should only have two aggs, one for each filter.

Unfiltered percentiles with different values still collapsed

Add a dimension with the following formula

percentile(bytes, percentile=2) + 
percentile(bytes, percentile=3) +
percentile(bytes, percentile=4) +
percentile(bytes, percentile=4) +
percentile(bytes, percentile=5)

Check the request. Agg should look like this

"0": {
      "percentiles": {
        "field": "bytes",
        "percents": [
          2,
          3,
          4,
          5,
        ],
        "keyed": false
      }
    }

@drewdaemon
Copy link
Contributor Author

@elasticmachine merge upstream

@elastic elastic deleted a comment from kibana-ci Nov 18, 2022
@drewdaemon drewdaemon changed the title improve percentile agg optimizations [Lens] improve percentile agg optimizations Nov 30, 2022
@drewdaemon drewdaemon added Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens release_note:skip Skip the PR/issue when compiling release notes labels Nov 30, 2022
@drewdaemon drewdaemon marked this pull request as ready for review November 30, 2022 01:21
@drewdaemon drewdaemon requested a review from a team as a code owner November 30, 2022 01:21
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-visualizations @elastic/kibana-visualizations-external (Team:Visualizations)

@drewdaemon
Copy link
Contributor Author

@elasticmachine merge upstream

@@ -142,7 +142,7 @@ function getExpressionForLayer(
if (def.input !== 'fullReference' && def.input !== 'managedReference') {
const aggId = String(index);

const wrapInFilter = Boolean(def.filterable && col.filter);
const wrapInFilter = Boolean(def.filterable && col.filter?.query);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change prevents us wrapping this agg in a filter when the query is just an empty string. No difference on the elasticsearch query, but this does make sure that if someone removes a kql filter from a dimension, that agg can be collapsed into other matching aggs instead of being passed over because it looks like it has a query.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
lens 1.3MB 1.3MB -239.0B
Unknown metric groups

ESLint disabled in files

id before after diff
osquery 1 2 +1

ESLint disabled line counts

id before after diff
enterpriseSearch 19 21 +2
fleet 59 65 +6
osquery 109 115 +6
securitySolution 442 448 +6
total +20

Total ESLint disabled count

id before after diff
enterpriseSearch 20 22 +2
fleet 68 74 +6
osquery 110 117 +7
securitySolution 519 525 +6
total +21

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Contributor

@dej611 dej611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally on Safari 👍

I had a go also with a mixed formula like:

defaults(percentile(bytes, percentile=2, kql='extension.keyword : CSS'), 0) + 
defaults(percentile(bytes, percentile=2, kql='extension.keyword : CSS'), 0) +
percentile(bytes, percentile=4) +
percentile(bytes, percentile=4) +
percentile(bytes, percentile=5)

And both 4th and 2nd percentile were aggregated into each own bucket.
Adding an extra percentile(bytes, percentile=2) will add the 2nd percentile metric into the non-filtered bucket correctly and group the other similar ones into the filtered bucket.

@drewdaemon drewdaemon merged commit 61a2df6 into elastic:main Nov 30, 2022
@kibanamachine kibanamachine added v8.7.0 backport:skip This commit does not require backporting labels Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Lens release_note:skip Skip the PR/issue when compiling release notes Team:Visualizations Visualization editors, elastic-charts and infrastructure v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants