-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Remove CircuitBreaker from parser #41835
Conversation
The CircuitBreaker was introduced as means of preventing a `StackOverflowException` during the build of the AST by the parser. The ANTLR4 grammar causes a weird behaviour for a Parser Listener. The `enterEveryRule()` method is often called with a different parsing context than the respective `exitEveryRule()`. This makes it difficult to keep track of the tree's depth, and a custom Map was used as an attempt of matching the contextes as they are encounter during `enter` and during `exit` of the rules. This approach had 2 important drawbacks: 1. It's hard to maintain this custom Map as the grammar changes. 2. The CircuitBreaker could often lead to false positives which caused valid queries to return an Exception and prevent them from executing. So, this removes completely the CircuitBreaker which is replaced be a simple handling of the `StackOverflowException` Fixes: elastic#41471
Pinging @elastic/es-search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I left one comment. It's not a definitive no-no, but I have doubts regarding the inclusion of the parsed tree in there.
as those are parsed into an abstract syntax tree as: | ||
[source, sql] | ||
-------------------------------------------------- | ||
(...((((a + b) + c) + d) + e) + f) + ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's really necessary to add these syntax tree parsing results. I do agree with the examples being here, but not necessarily with the syntax tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it to answer to a question how this simple (to the eye) expression can cause stack overflow
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, you'd have to explain what are the nodes of the tree (adding the brackets doesn't imply a tree imo) and that the parsing goes in depth on each level, then comes back and each step adds one more call to the calling stack. I think it's a matter of audience here and how much explanation is needed. If one is familiar with a parsing tree, then ok, otherwise adding the brackets doesn't fully explain the cause of the stackoverflow. My 2c.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my side, I think the documentation part needs some rework.
[[large-parsing-trees]] | ||
=== Large queries may throw `ParsingExpection` | ||
|
||
{es-sql} uses https://www.antlr.org/[`ANTLR`] to create https://en.wikipedia.org/wiki/Abstract_syntax_tree[abstract syntax trees] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the audience is comprised of users, I would eliminate the technical stuff from the intro - in fact the whole sentence and rework it as "Extremely large queries can consume too much memory during parsing in which case the engine will abort parsing and throw an error - in these cases, consider reducing the query to a smaller size by potentially splitting it apart".
P.S. Technically ANTLR creates a parse tree (not an AST).
`ParsingException` with a descriptive error message. Below you can find some examples of queries that could trigger | ||
this. | ||
|
||
- Large arithmetic operator expressions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the examples of large trees would help - these are suppose to be corner-cases and I would expect it to be obvious (thousands of entries). Offering such a predominant section in the docs, makes it look like a common occurrence which is not the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@elasticmachine run elasticsearch-ci/default-distro |
The CircuitBreaker was introduced as means of preventing a `StackOverflowException` during the build of the AST by the parser. The ANTLR4 grammar causes a weird behaviour for a Parser Listener. The `enterEveryRule()` method is often called with a different parsing context than the respective `exitEveryRule()`. This makes it difficult to keep track of the tree's depth, and a custom Map was used as an attempt of matching the contextes as they are encounter during `enter` and during `exit` of the rules. This approach had 2 important drawbacks: 1. It's hard to maintain this custom Map as the grammar changes. 2. The CircuitBreaker could often lead to false positives which caused valid queries to return an Exception and prevent them from executing. So, this removes completely the CircuitBreaker which is replaced be a simple handling of the `StackOverflowException` Fixes: #41471 (cherry picked from commit 1559a8e)
The CircuitBreaker was introduced as means of preventing a `StackOverflowException` during the build of the AST by the parser. The ANTLR4 grammar causes a weird behaviour for a Parser Listener. The `enterEveryRule()` method is often called with a different parsing context than the respective `exitEveryRule()`. This makes it difficult to keep track of the tree's depth, and a custom Map was used as an attempt of matching the contextes as they are encounter during `enter` and during `exit` of the rules. This approach had 2 important drawbacks: 1. It's hard to maintain this custom Map as the grammar changes. 2. The CircuitBreaker could often lead to false positives which caused valid queries to return an Exception and prevent them from executing. So, this removes completely the CircuitBreaker which is replaced be a simple handling of the `StackOverflowException` Fixes: #41471 (cherry picked from commit 1559a8e)
The CircuitBreaker was introduced as means of preventing a `StackOverflowException` during the build of the AST by the parser. The ANTLR4 grammar causes a weird behaviour for a Parser Listener. The `enterEveryRule()` method is often called with a different parsing context than the respective `exitEveryRule()`. This makes it difficult to keep track of the tree's depth, and a custom Map was used as an attempt of matching the contextes as they are encounter during `enter` and during `exit` of the rules. This approach had 2 important drawbacks: 1. It's hard to maintain this custom Map as the grammar changes. 2. The CircuitBreaker could often lead to false positives which caused valid queries to return an Exception and prevent them from executing. So, this removes completely the CircuitBreaker which is replaced be a simple handling of the `StackOverflowException` Fixes: #41471 (cherry picked from commit 1559a8e)
The CircuitBreaker was introduced as means of preventing a `StackOverflowException` during the build of the AST by the parser. The ANTLR4 grammar causes a weird behaviour for a Parser Listener. The `enterEveryRule()` method is often called with a different parsing context than the respective `exitEveryRule()`. This makes it difficult to keep track of the tree's depth, and a custom Map was used as an attempt of matching the contextes as they are encounter during `enter` and during `exit` of the rules. This approach had 2 important drawbacks: 1. It's hard to maintain this custom Map as the grammar changes. 2. The CircuitBreaker could often lead to false positives which caused valid queries to return an Exception and prevent them from executing. So, this removes completely the CircuitBreaker which is replaced be a simple handling of the `StackOverflowException` Fixes: #41471
* elastic/master: (50 commits) Cleanup versioned deprecations in analysis (elastic#41560) Allow unknown task time in QueueResizingEsTPE (elastic#41810) SQL: Remove CircuitBreaker from parser (elastic#41835) [DOCS] Fix callouts for dataframe APIs (elastic#41904) Handle serialization exceptions during publication (elastic#41781) Correct spelling of MockLogAppender.PatternSeenEventExpectation (elastic#41893) Update TLS ciphers and protocols for JDK 11 (elastic#41808) Remove Harmful Exists Check from BlobStoreFormat (elastic#41898) Fix fractional seconds for strict_date_optional_time (elastic#41871) Remove op.name configuration setting (elastic#41445) Reject port ranges in `discovery.seed_hosts` (elastic#41404) [ML-DataFrame] migrate to PageParams for get and stats, move PageParams into core (elastic#41851) Reenable RareClusterStateIT Mapping Propagation Tests (elastic#41884) [DOCS] Rewrite `exists` query docs (elastic#41868) Revert "Mute MinimumMasterNodesIT.testThreeNodesNoMasterBlock()" [DOCS] Fix typo referring to multi search API Provide names for all artifact repositories (elastic#41857) Move InternalAggregations to Writeable (elastic#41841) Fix compilation after incorrect merge Unmute TestClustersPluginIT.testMultiNode (elastic#41340) ...
The CircuitBreaker was introduced as means of preventing a `StackOverflowException` during the build of the AST by the parser. The ANTLR4 grammar causes a weird behaviour for a Parser Listener. The `enterEveryRule()` method is often called with a different parsing context than the respective `exitEveryRule()`. This makes it difficult to keep track of the tree's depth, and a custom Map was used as an attempt of matching the contextes as they are encounter during `enter` and during `exit` of the rules. This approach had 2 important drawbacks: 1. It's hard to maintain this custom Map as the grammar changes. 2. The CircuitBreaker could often lead to false positives which caused valid queries to return an Exception and prevent them from executing. So, this removes completely the CircuitBreaker which is replaced be a simple handling of the `StackOverflowException` Fixes: elastic#41471
The CircuitBreaker was introduced as means of preventing a
StackOverflowException
during the build of the AST by the parser.The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The
enterEveryRule()
method is often called with a different parsingcontext than the respective
exitEveryRule()
. This makes it difficultto keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during
enter
and during
exit
of the rules.This approach had 2 important drawbacks:
valid queries to return an Exception and prevent them from executing.
So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the
StackOverflowException
Fixes: #41471