Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Remove CircuitBreaker from parser #41835

Merged
merged 5 commits into from
May 7, 2019
Merged

Conversation

matriv
Copy link
Contributor

@matriv matriv commented May 6, 2019

The CircuitBreaker was introduced as means of preventing a
StackOverflowException during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The enterEveryRule() method is often called with a different parsing
context than the respective exitEveryRule(). This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during enter
and during exit of the rules.

This approach had 2 important drawbacks:

  1. It's hard to maintain this custom Map as the grammar changes.
  2. The CircuitBreaker could often lead to false positives which caused
    valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the StackOverflowException

Fixes: #41471

The CircuitBreaker was introduced as means of preventing a
`StackOverflowException` during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The `enterEveryRule()` method is often called with a different parsing
context than the respective `exitEveryRule()`. This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during `enter`
and during `exit` of the rules.

This approach had 2 important drawbacks:
1. It's hard to maintain this custom Map as the grammar changes.
2. The CircuitBreaker could often lead to false positives which caused
valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the `StackOverflowException`

Fixes: elastic#41471
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left one comment. It's not a definitive no-no, but I have doubts regarding the inclusion of the parsed tree in there.

as those are parsed into an abstract syntax tree as:
[source, sql]
--------------------------------------------------
(...((((a + b) + c) + d) + e) + f) + ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's really necessary to add these syntax tree parsing results. I do agree with the examples being here, but not necessarily with the syntax tree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it to answer to a question how this simple (to the eye) expression can cause stack overflow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, you'd have to explain what are the nodes of the tree (adding the brackets doesn't imply a tree imo) and that the parsing goes in depth on each level, then comes back and each step adds one more call to the calling stack. I think it's a matter of audience here and how much explanation is needed. If one is familiar with a parsing tree, then ok, otherwise adding the brackets doesn't fully explain the cause of the stackoverflow. My 2c.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my side, I think the documentation part needs some rework.

[[large-parsing-trees]]
=== Large queries may throw `ParsingExpection`

{es-sql} uses https://www.antlr.org/[`ANTLR`] to create https://en.wikipedia.org/wiki/Abstract_syntax_tree[abstract syntax trees]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the audience is comprised of users, I would eliminate the technical stuff from the intro - in fact the whole sentence and rework it as "Extremely large queries can consume too much memory during parsing in which case the engine will abort parsing and throw an error - in these cases, consider reducing the query to a smaller size by potentially splitting it apart".

P.S. Technically ANTLR creates a parse tree (not an AST).

`ParsingException` with a descriptive error message. Below you can find some examples of queries that could trigger
this.

- Large arithmetic operator expressions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the examples of large trees would help - these are suppose to be corner-cases and I would expect it to be obvious (thousands of entries). Offering such a predominant section in the docs, makes it look like a common occurrence which is not the case.

@matriv matriv requested a review from costin May 6, 2019 14:42
Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@matriv
Copy link
Contributor Author

matriv commented May 7, 2019

@elasticmachine run elasticsearch-ci/default-distro

@matriv matriv merged commit 1559a8e into elastic:master May 7, 2019
@matriv matriv deleted the fix-41471 branch May 7, 2019 20:09
matriv added a commit that referenced this pull request May 7, 2019
The CircuitBreaker was introduced as means of preventing a
`StackOverflowException` during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The `enterEveryRule()` method is often called with a different parsing
context than the respective `exitEveryRule()`. This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during `enter`
and during `exit` of the rules.

This approach had 2 important drawbacks:
1. It's hard to maintain this custom Map as the grammar changes.
2. The CircuitBreaker could often lead to false positives which caused
valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the `StackOverflowException`

Fixes: #41471
(cherry picked from commit 1559a8e)
matriv added a commit that referenced this pull request May 7, 2019
The CircuitBreaker was introduced as means of preventing a
`StackOverflowException` during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The `enterEveryRule()` method is often called with a different parsing
context than the respective `exitEveryRule()`. This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during `enter`
and during `exit` of the rules.

This approach had 2 important drawbacks:
1. It's hard to maintain this custom Map as the grammar changes.
2. The CircuitBreaker could often lead to false positives which caused
valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the `StackOverflowException`

Fixes: #41471
(cherry picked from commit 1559a8e)
matriv added a commit that referenced this pull request May 7, 2019
The CircuitBreaker was introduced as means of preventing a
`StackOverflowException` during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The `enterEveryRule()` method is often called with a different parsing
context than the respective `exitEveryRule()`. This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during `enter`
and during `exit` of the rules.

This approach had 2 important drawbacks:
1. It's hard to maintain this custom Map as the grammar changes.
2. The CircuitBreaker could often lead to false positives which caused
valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the `StackOverflowException`

Fixes: #41471
(cherry picked from commit 1559a8e)
jasontedor pushed a commit that referenced this pull request May 7, 2019
The CircuitBreaker was introduced as means of preventing a
`StackOverflowException` during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The `enterEveryRule()` method is often called with a different parsing
context than the respective `exitEveryRule()`. This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during `enter`
and during `exit` of the rules.

This approach had 2 important drawbacks:
1. It's hard to maintain this custom Map as the grammar changes.
2. The CircuitBreaker could often lead to false positives which caused
valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the `StackOverflowException`

Fixes: #41471
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 7, 2019
* elastic/master: (50 commits)
  Cleanup versioned deprecations in analysis (elastic#41560)
  Allow unknown task time in QueueResizingEsTPE (elastic#41810)
  SQL: Remove CircuitBreaker from parser (elastic#41835)
  [DOCS] Fix callouts for dataframe APIs (elastic#41904)
  Handle serialization exceptions during publication (elastic#41781)
  Correct spelling of MockLogAppender.PatternSeenEventExpectation (elastic#41893)
  Update TLS ciphers and protocols for JDK 11 (elastic#41808)
  Remove Harmful Exists Check from BlobStoreFormat (elastic#41898)
  Fix fractional seconds for strict_date_optional_time (elastic#41871)
  Remove op.name configuration setting (elastic#41445)
  Reject port ranges in `discovery.seed_hosts` (elastic#41404)
  [ML-DataFrame] migrate to PageParams for get and stats, move PageParams into core (elastic#41851)
  Reenable RareClusterStateIT Mapping Propagation Tests (elastic#41884)
  [DOCS] Rewrite `exists` query docs (elastic#41868)
  Revert "Mute MinimumMasterNodesIT.testThreeNodesNoMasterBlock()"
  [DOCS] Fix typo referring to multi search API
  Provide names for all artifact repositories (elastic#41857)
  Move InternalAggregations to Writeable (elastic#41841)
  Fix compilation after incorrect merge
  Unmute TestClustersPluginIT.testMultiNode (elastic#41340)
  ...
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
The CircuitBreaker was introduced as means of preventing a
`StackOverflowException` during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The `enterEveryRule()` method is often called with a different parsing
context than the respective `exitEveryRule()`. This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during `enter`
and during `exit` of the rules.

This approach had 2 important drawbacks:
1. It's hard to maintain this custom Map as the grammar changes.
2. The CircuitBreaker could often lead to false positives which caused
valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the `StackOverflowException`

Fixes: elastic#41471
@matriv matriv mentioned this pull request May 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQL: Replace Parser CircuitBreaker with catching the StackOverflow Exception
5 participants