-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] fix case sensitivity for wildcard queries #5462
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
19dafad
[Bug] fix case sensitivity for wildcard queries
nknize bb08f52
update changelog
nknize 848ef8d
fix test error message failure
nknize 1ec159a
update SearchQueryIT to properly test wildcard query case sensitivity
nknize 7e73510
update javadocs for new wildcard methods
nknize File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nknize I think we still do have a conflict here: it makes sense to apply the
caseInsensitive
hint when user does not specify the analyzer to be used (in this casegetTextSearchInfo().getSearchAnalyzer()
is the default one), but I think we should not do that if theanalyzer
or/andanalyze_wildcard
are provided[1] https://www.elastic.co/guide/en/elasticsearch/reference/7.10/query-dsl-query-string-query.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's why QueryStringQueryParser calls
normalizedWildcardQuery
method. It ignores thecaseInsensitive
parameter for query string queries.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that was caused the confusion to me: we already have 2 variants of
wildcardQuery
method + now a new one callednormalizedWildcardQuery
, could we dropnormalizedWildcardQuery
since this is just delegate towildcardQuery
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We sure can.. it's just syntactic sugar. I just thought it was a more descriptive call in the
QueryStringQueryParser
than callingwildcardQuery(value, method, false, true, context)
, but I can throw javadocs there to describe the logic.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah wait.. nvm..I did it this way because the delegation for normalizedWildcardQuery is specific to
StringFieldType
only... It's not the same forkeyword
andconstant
fields.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is non issue: since you have this new method
only
StringFieldType
could implement it, others won't. Another option I was thinking, may be we could embed this intoTextSearchInfo
/MappedFieldType
as a property, since as you mentioned it is specific toStringFieldType
only?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is QueryStringQueryParser calls the generic MappedFieldType.wildcardQuery method. I can eliminate the normalized version of this method call by bumping the new method up the class hierarchy but I'd prefer not do that since
normalizeIfAnalyzed
is only relevant to the StringFIeldType. (and I don't like the idea of having to keep track of what thecaseInsensitive
andnormalizeIfAnalyzed
boolean logic means if we add another field type that doesn't care).I think this API is cleaner? But I should add javadocs to describe that
normalizeWildcardQuery
normalizes to lower case whereaswildcardQuery
optionally lowers case.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Javadocs would help for sure, I don't think my suggestion (to have overloaded
wildcardQuery
) is any better than newnormalizedWildcardQuery
- removes confusion in one place but hurts another one. What do you think about enrichingTextSearchInfo
forStringFieldType
? (than we don't need thisnormalizeIfAnalyzed
argument at all) I believe this could be exactly the right place to do so:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is
QueryStringQueryParser
needs to change the normalization behavior at search runtime and TextSearchInfo is defined at index creation time for encapsulating Lucene fieldType parameters (e.g., norms, offsets, positions). This could be hacked around by adding a setter to dynamically change TextSearchInfo based on the query but that changes the purpose ofTextSearchInfo
and I think adds confusion on when to do this and for what type of queries. It might be worth exploring but I think in a follow up enhancement issue since the surface area impacted is larger than the scope of this bug fix? This may also highlight thatTextSearchInfo
isn't the best classname.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see thanks @nknize