[KQL] Use cache and other performance improvements #93319

lukasolson · 2021-03-02T21:36:01Z

Summary

Resolves #76811.

This PR improves KQL parsing performance in the following ways:

- Uses the `--cache` PEG.js parameter when generating the parser
- Optimizes performance when autocomplete is unnecessary

Benchmarks from the above linked issue prior to this PR:

parse simple KQL x 2,431 ops/sec ±2.19% (86 runs sampled)
parse complex KQL x 9.64 ops/sec ±3.28% (29 runs sampled)

And after this PR:

parse simple KQL x 14,703 ops/sec ±15.90% (79 runs sampled)
parse complex KQL x 163 ops/sec ±6.21% (54 runs sampled)
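
For a rough sense of scale, the speedup can be computed directly from the figures above. This is only arithmetic on the quoted ops/sec values (variable names are illustrative), and the ±15.90% spread on the simple case means the first ratio is approximate.

```typescript
// Throughput figures (ops/sec) copied from the benchmark output above.
const opsBefore = { simple: 2431, complex: 9.64 };
const opsAfter = { simple: 14703, complex: 163 };

// Speedup is throughput after the change divided by throughput before it.
function speedup(beforeOps: number, afterOps: number): number {
  return afterOps / beforeOps;
}

const simpleSpeedup = speedup(opsBefore.simple, opsAfter.simple);
const complexSpeedup = speedup(opsBefore.complex, opsAfter.complex);

// Prints "simple: 6.0x, complex: 16.9x"
console.log(`simple: ${simpleSpeedup.toFixed(1)}x, complex: ${complexSpeedup.toFixed(1)}x`);
```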

elasticmachine · 2021-03-02T21:36:03Z

Pinging @elastic/kibana-app-services (Team:AppServices)

lukasolson · 2021-03-02T21:38:23Z

src/plugins/data/common/es_query/kuery/ast/kuery.peg

@@ -28,15 +28,15 @@ start
 OrQuery
  = &{ return errorOnLuceneSyntax; } LuceneQuery
  / left:AndQuery Or right:OrQuery {
-    const cursor = [left, right].find(node => node.type === 'cursor');
+    const cursor = parseCursor && [left, right].find(node => node.type === 'cursor');


`parseCursor` is an option passed to the parser that specifies whether we're parsing for autocomplete suggestions. If it is false, we can short-circuit any autocomplete logic (anything that checks `node.type === 'cursor'`).
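
As a minimal standalone sketch of that short-circuit (the `KueryNode` shape and the `findCursor` helper are hypothetical, not the grammar's actual action code): when `parseCursor` is false, `&&` returns immediately and the array scan never runs.

```typescript
// Hypothetical node shape for illustration; real KQL AST nodes carry more fields.
type KueryNode = { type: string; [key: string]: unknown };

// Mirrors the guarded lookup in the diff: with parseCursor === false the
// `&&` short-circuits, so `find` is never called.
function findCursor(
  parseCursor: boolean,
  left: KueryNode,
  right: KueryNode
): KueryNode | undefined {
  const cursor = parseCursor && [left, right].find((node) => node.type === 'cursor');
  return cursor || undefined;
}

const literalNode: KueryNode = { type: 'literal', value: 'foo' };
const cursorNode: KueryNode = { type: 'cursor', start: 3, end: 3 };

// Autocomplete disabled: the cursor node is ignored even though it is present.
console.log(findCursor(false, literalNode, cursorNode)); // undefined
// Autocomplete enabled: the cursor node is found.
console.log(findCursor(true, literalNode, cursorNode) === cursorNode); // true
```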

lukasolson · 2021-03-02T21:42:09Z

src/plugins/data/common/es_query/kuery/ast/kuery.peg

@@ -209,7 +209,7 @@ Literal "literal"
  = QuotedString / UnquotedLiteral

 QuotedString
-  = '"' prefix:QuotedCharacter* cursor:Cursor suffix:QuotedCharacter* '"' {
+  = &{ return parseCursor; } '"' prefix:QuotedCharacter* cursor:Cursor suffix:QuotedCharacter* '"' {


See https://pegjs.org/documentation#grammar-syntax-and-semantics. This is a semantic predicate: it performs the same `parseCursor` check as above before the parser attempts this grammar rule, so the rule fails immediately when `parseCursor` is false.
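
A hand-written analogue of that predicate might look like the sketch below. It is not the generated parser: the `CURSOR` marker, `Options` type, and function name are assumptions made for illustration. The point is that the guard consumes no input and rejects the alternative before any character matching happens.

```typescript
// Hypothetical cursor marker; an assumption for this sketch only.
const CURSOR = '@cursor@';

interface Options {
  parseCursor: boolean;
}

type QuotedCursorMatch = { value: string; cursorAt: number } | null;

// Rough equivalent of:
//   &{ return parseCursor; } '"' prefix cursor suffix '"'
function parseQuotedStringWithCursor(input: string, options: Options): QuotedCursorMatch {
  // The predicate: consumes nothing, fails the whole alternative when false.
  if (!options.parseCursor) return null;

  // Only reached when autocomplete parsing is requested.
  if (input.length < 2 || !input.startsWith('"') || !input.endsWith('"')) return null;
  const body = input.slice(1, -1);
  const cursorAt = body.indexOf(CURSOR);
  if (cursorAt === -1) return null;
  return { value: body.replace(CURSOR, ''), cursorAt };
}

const quotedInput = '"fo' + CURSOR + 'o"';
console.log(parseQuotedStringWithCursor(quotedInput, { parseCursor: false })); // null
console.log(parseQuotedStringWithCursor(quotedInput, { parseCursor: true }));
```

With the guard in place, a caller that does not need suggestions falls through to the plain quoted-string alternative without paying for the cursor-aware one.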

kobelb · 2021-03-02T21:52:43Z

These are super impressive improvements for the amount of effort. 🙇‍♂️💝

lukasolson · 2021-03-02T22:39:30Z

test/api_integration/apis/saved_objects/find.ts

-                  'whitespace but "<" found.\ndashboard.attributes.title:foo' +
-                  '<invalid\n------------------------------^: Bad Request',
+                  'KQLSyntaxError: Expected AND, OR, end of input but "<" found.\ndashboard.' +
+                  'attributes.title:foo<invalid\n------------------------------^: Bad Request',


Since we're now short-circuiting the whitespace rule when `parseCursor` is false, whitespace is no longer listed in the error messages as one of the acceptable alternatives (which should have been the case to begin with).

lukasolson · 2021-03-03T18:56:32Z

Note: We should make sure that changing this option doesn't considerably increase heap usage for valid use cases (see pegjs/pegjs#590).

lukasolson · 2021-03-05T23:26:38Z

The cache is re-initialized for each expression. The heap at the end of the benchmarks is equivalent before and after, but the size of the cache itself after running the complicated expression is ~3.1 MB. This seems like an acceptable tradeoff for CPU time.
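
To illustrate what "re-initialized for each expression" means, here is a toy memoizing (packrat-style) parser, not the PEG.js-generated code. The grammar and all names are made up for the sketch. The cache is a local created inside `parse`, so it becomes unreachable once the call returns and does not accumulate across queries.

```typescript
type RuleResult = { end: number } | null;

interface ParseStats {
  ok: boolean;
  calls: number; // rule bodies actually evaluated
  cacheSize: number; // entries held when the parse finished
}

function parse(input: string, useCache: boolean): ParseStats {
  // Created per call: nothing is retained after parse() returns.
  const cache = new Map<string, RuleResult>();
  let calls = 0;

  function memo(rule: string, pos: number, body: () => RuleResult): RuleResult {
    const key = `${rule}:${pos}`;
    if (useCache && cache.has(key)) return cache.get(key) as RuleResult;
    calls++;
    const result = body();
    if (useCache) cache.set(key, result);
    return result;
  }

  // Expr = Term '+' Expr / Term
  function expr(pos: number): RuleResult {
    return memo('Expr', pos, () => {
      const left = term(pos);
      if (left && input[left.end] === '+') {
        const right = expr(left.end + 1);
        if (right) return right;
      }
      // Second alternative re-parses Term at the same position (backtracking).
      return term(pos);
    });
  }

  // Term = '(' Expr ')' / 'a'
  function term(pos: number): RuleResult {
    return memo('Term', pos, () => {
      if (input[pos] === '(') {
        const inner = expr(pos + 1);
        return inner && input[inner.end] === ')' ? { end: inner.end + 1 } : null;
      }
      return input[pos] === 'a' ? { end: pos + 1 } : null;
    });
  }

  const result = expr(0);
  return { ok: result !== null && result.end === input.length, calls, cacheSize: cache.size };
}

const nested = '('.repeat(10) + 'a' + ')'.repeat(10);
console.log(parse(nested, false)); // backtracking repeats work at every nesting level
console.log(parse(nested, true)); // each (rule, position) pair is evaluated once
```

The tradeoff is the one described above: the cached run holds one entry per (rule, position) pair for the duration of a single parse, in exchange for far fewer rule evaluations on deeply nested input.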

lukasolson · 2021-03-08T03:20:46Z

@elasticmachine merge upstream

kibanamachine · 2021-03-08T05:14:05Z

💚 Build Succeeded

Metrics [docs]

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| `data` | 815.5KB | 825.1KB | +9.6KB |

History

💚 Build #111561 succeeded 962e935
💚 Build #110934 succeeded 85fe23e
💚 Build #110561 succeeded 7e39ec5
💔 Build #110536 failed cfe914c

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @lukasolson

ppisljar

code LGTM

spalger · 2021-04-06T00:46:07Z

@lukasolson it's pretty hard to tell how this cache works. Is it storing results for every filter value ever submitted in memory? Is the cache ever rotated or cleared? I'm not seeing anything like that in the generated code, and this issue suggests that the cache will just grow and grow as more queries are parsed.

## Summary

Resolves #143335.

Some history: A similar issue was reported a few years back (#76811). The solution (#93319) was to use the `--cache` PEG.js [parameter](https://pegjs.org/documentation#generating-a-parser) when generating the parser. Back when this was added, we were still manually building the parser on demand when it was changed. Eventually we added support for dynamically building the parser during the build process (#145615). I'm not sure where along the process the `cache` parameter got lost but it didn't appear to be used when we switched.

This PR re-adds this parameter which increases performance considerably (metrics shown in ops/sec):

```
Before using cache:

● kuery AST API › fromKueryExpression › performance › with simple expression
  Received: 7110.68990544415

● kuery AST API › fromKueryExpression › performance › with complex expression
  Received: 40.51361746242248

● kuery AST API › fromKueryExpression › performance › with many subqueries
  Received: 17.071767133068473

After using cache:

● kuery AST API › fromKueryExpression › performance › with simple expression
  Received: 8275.49109867502

● kuery AST API › fromKueryExpression › performance › with complex expression
  Received: 447.0459218892934

● kuery AST API › fromKueryExpression › performance › with many subqueries
  Received: 115852.43643466769
```

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

[KQL] Use cache and other performance improvements

cfe914c

lukasolson added the review, performance, Feature:KQL, v8.0.0, Team:AppServices, release_note:skip, v7.12.0, and v7.13.0 labels Mar 2, 2021

lukasolson self-assigned this Mar 2, 2021

lukasolson requested a review from a team as a code owner March 2, 2021 21:36

Fix test

3870798

Fix jest tests

7e39ec5

lukasolson added 2 commits March 3, 2021 15:47

Merge branch 'master' into fix/76811

85fe23e

Merge branch 'master' into fix/76811

962e935

Merge branch 'master' into fix/76811

dbcaf22

ppisljar approved these changes Mar 8, 2021

lukasolson merged commit 2b3bac9 into elastic:master Mar 8, 2021

lukasolson mentioned this pull request Mar 8, 2021

[7.x] [KQL] Use cache and other performance improvements (#93319) #93972

Merged

lukasolson mentioned this pull request Mar 8, 2021

[7.12] [KQL] Use cache and other performance improvements (#93319) #93973

Merged

lukasolson mentioned this pull request Apr 24, 2024

[KQL] Fix performance issue with nested subqueries #181208

Merged
