Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a null pointer check to fix index_prefix query #2879

Merged
merged 2 commits into from
Apr 14, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -583,7 +583,9 @@ public Query prefixQuery(String value, MultiTermQuery.RewriteMethod method, bool
}
Automaton automaton = Operations.concatenate(automata);
AutomatonQuery query = new AutomatonQuery(new Term(name(), value + "*"), automaton);
query.setRewriteMethod(method);
if (method != null) {
Copy link
Member

@saratvemulapalli saratvemulapalli Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach solves the problem but the real root cause is the check for min, max chars.

The right fix here is to not accept() the query if the len is less than min_chars.
This makes sure the accept fails at: Line-762 and doesnt really need to call: Line-765

Suggested change
if (method != null) {
boolean accept(int length) {
return length >= minChars && length <= maxChars;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't know we should not be accepting the value less than minChars. I will make that change and add tests as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also take a look at: dd540ef.
There might be other cases which might fail.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the support for minChars - 1 was specifically added. Would the fix in this PR then be the solution since it changes the behavior if we don't accept the minChars - 1 length?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fix the original commit put in lead to this problem. My hunch says this should be fixed.
Could you read through the commit and understand what other problems we will see?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the commit, the operation for the index_prefix query for length = minChars-1 is remapped to a? from a* to make it less expensive. Also, looking at the current code base, I see that the setRewriteMethod is wrapped by a null check at other places: for example https://github.com/opensearch-project/OpenSearch/blame/3c5d997a765e24ffa32d35219fd5026cfb143a9d/server/src/main/java/org/opensearch/index/mapper/StringFieldType.java#L116.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Vacha for this.
I take back my suggestion, as its probably not supporting the optimization in performance.
I agree with you that, verifying the method being null solves the problem while being efficient.

Could you add some tests and we should be good to go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is query.setRewriteMethod(null) a valid thing by itself? Can the rewrite method be null? If not throw in an assert or something like that to prevent hiding other errors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null looks to be valid, also @nknize opened an issue #2896 to remove these usages next since the method is deprecated.

query.setRewriteMethod(method);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query.setRewriteMethod is removed in Lucene 9.2. I've opened an issue to remove all references to the method and use CONSTANT_SCORE_REWRITE in places where null is passed in as the method parameter.

}
return new BooleanQuery.Builder().add(query, BooleanClause.Occur.SHOULD)
.add(new TermQuery(new Term(parentField.name(), value)), BooleanClause.Occur.SHOULD)
.build();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -190,6 +190,17 @@ public void testIndexPrefixes() {
);

assertThat(q, equalTo(expected));

q = ft.prefixQuery("g", null, false, randomMockShardContext());
automaton = Operations.concatenate(Arrays.asList(Automata.makeChar('g'), Automata.makeAnyChar()));

expected = new ConstantScoreQuery(
new BooleanQuery.Builder().add(new AutomatonQuery(new Term("field._index_prefix", "g*"), automaton), BooleanClause.Occur.SHOULD)
.add(new TermQuery(new Term("field", "g")), BooleanClause.Occur.SHOULD)
.build()
);

assertThat(q, equalTo(expected));
}

public void testFetchSourceValue() throws IOException {
Expand Down