-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a limit for graph phrase query expansion #34031
Conversation
Today query parsers throw TooManyClauses exception when a query creates too many clauses. However graph phrase queries do not respect this limit. This change adds a protection against crazy expansions that can happen when building a graph phrase query. This is a temporary copy of the fix available in https://issues.apache.org/jira/browse/LUCENE-8479 but not merged yet. This logic will be removed when we integrate the Lucene patch in a future release.
Pinging @elastic/es-search-aggs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Term[] terms = graph.getTerms(field, start); | ||
assert terms.length > 0; | ||
if (terms.length >= BooleanQuery.getMaxClauseCount()) { | ||
throw new BooleanQuery.TooManyClauses(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the former check is triggered by the test, but this code path isn't. I haven't checked all the test setup logic, but would it be difficult to include a case that also triggers this code path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 5bc8a40 to test both paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jimczi LGTM, I left two minor questions mostly for my own education, would be nice to hear your opinion those
TokenStream ts = it.next(); | ||
SpanQuery q = createSpanQuery(ts, field); | ||
if (q != null) { | ||
if (queries.size() >= BooleanQuery.getMaxClauseCount()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my own education, and it is certainly super minor: when reading this part I was wondering if it would make sense to get the maxClauseCount limit once at the beginning of this method since its unlikely to change to avoid method calls in each iteration). Maybe Java does some clever optimizations to avoid this though and the effect most likely negligible.
Today query parsers throw TooManyClauses exception when a query creates too many clauses. However graph phrase queries do not respect this limit. This change adds a protection against crazy expansions that can happen when building a graph phrase query. This is a temporary copy of the fix available in https://issues.apache.org/jira/browse/LUCENE-8479 but not merged yet. This logic will be removed when we integrate the Lucene patch in a future release.
Today query parsers throw TooManyClauses exception when a query creates too many clauses. However graph phrase queries do not respect this limit. This change adds a protection against crazy expansions that can happen when building a graph phrase query. This is a temporary copy of the fix available in https://issues.apache.org/jira/browse/LUCENE-8479 but not merged yet. This logic will be removed when we integrate the Lucene patch in a future release.
Backport of elastic#34031 for 5.6.
Today query parsers throw TooManyClauses exception when a query creates too many clauses. However graph phrase queries do not respect this limit. This change adds a protection against crazy expansions that can happen when building a graph phrase query. This is a temporary copy of the fix available in https://issues.apache.org/jira/browse/LUCENE-8479 but not merged yet. This logic will be removed when we integrate the Lucene patch in a future release.
Today query parsers throw TooManyClauses exception when a query creates
too many clauses. However graph phrase queries do not respect this limit.
This change adds a protection against crazy expansions that can happen when
building a graph phrase query. This is a temporary copy of the fix available
in https://issues.apache.org/jira/browse/LUCENE-8479 but not merged yet.
This logic will be removed when we integrate the Lucene patch in a future
release.