Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set analyzer to regex query string search #3967

Merged
merged 11 commits into from
Aug 15, 2022

Conversation

yyyogev
Copy link
Contributor

@yyyogev yyyogev commented Jul 21, 2022

Description

Set default analyzer to query string search with regex

Issues Resolved

Resolves #3578

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@yyyogev yyyogev requested review from a team and reta as code owners July 21, 2022 10:13
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

dblock
dblock previously requested changes Jul 21, 2022
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below.

I am not familiar with this part of the code. Why does this work?

return super.getWildcardQuery(currentFieldType.name(), termStr);
if (currentFieldType != null) {
// return newUnmappedFieldQuery(field);
setAnalyzer(forceAnalyzer == null ? queryBuilder.context.getSearchAnalyzer(currentFieldType) : forceAnalyzer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we do this in multiple places maybe add a private getSearchAnalyzer(field) method?

if/else would probably look cleaner

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: yogev mets <[email protected]>
@yyyogev
Copy link
Contributor Author

yyyogev commented Jul 24, 2022

Refer to the issue associated with this PR, we discussed it all there

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: yogev mets <[email protected]>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

if (forceAnalyzer != null && (analyzeWildcard || currentFieldType.getTextSearchInfo().isTokenized())) {
setAnalyzer(forceAnalyzer);
return super.getWildcardQuery(currentFieldType.name(), termStr);
if (currentFieldType != null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm ... did we lost if (currentFieldType == null) { condition?

Copy link
Contributor Author

@yyyogev yyyogev Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured it will be cleaner to check if (currentFieldType != null) now that we always set the analyzer

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is unmapped field case, we probably should return newUnmappedFieldQuery ?

Signed-off-by: yogev mets <[email protected]>
@yyyogev
Copy link
Contributor Author

yyyogev commented Jul 25, 2022

See below.

I am not familiar with this part of the code. Why does this work?

@dblock currently, when using regexp query, we only set the analyzer if it's forced (explicitly specify an analyzer in the search request). the reason is because of the wildcard queries ES introduced, but since wildcard queries are not being supported in OS, it's redundant now and only harm the regexp queries while searching for uppercase letters.

The change I made removes the condition to check if the force analyzer is specified and set the analyzer regardless - either the forced or by the currentFieldType

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@yyyogev
Copy link
Contributor Author

yyyogev commented Jul 27, 2022

@reta @dblock I'm not sure why the gradle check is failing now.. can you take a look?

@reta
Copy link
Collaborator

reta commented Jul 27, 2022

@reta @dblock I'm not sure why the gradle check is failing now.. can you take a look?

@yyyogev seems to be issue(s) with build pipeline:

/var/jenkins/workspace/gradle-check@tmp/durable-ea4df87d/script.sh: 41: docker-compose: not found

@peterzhuamazon any hints?

@peterzhuamazon
Copy link
Member

@reta @dblock I'm not sure why the gradle check is failing now.. can you take a look?

@yyyogev seems to be issue(s) with build pipeline:

/var/jenkins/workspace/gradle-check@tmp/durable-ea4df87d/script.sh: 41: docker-compose: not found

@peterzhuamazon any hints?

Hi sorry for the late reply guys.
We had a new deployment in Jenkins and missed some lines.

And we have a fix now to be reviewed.

Please be patient on this as it requires a new deployment.

Sorry and thanks.
Peter

@peterzhuamazon
Copy link
Member

@reta @dblock I'm not sure why the gradle check is failing now.. can you take a look?

@yyyogev seems to be issue(s) with build pipeline:

/var/jenkins/workspace/gradle-check@tmp/durable-ea4df87d/script.sh: 41: docker-compose: not found

@peterzhuamazon any hints?

Hi sorry for the late reply guys. We had a new deployment in Jenkins and missed some lines.

And we have a fix now to be reviewed.

Please be patient on this as it requires a new deployment.

Sorry and thanks. Peter

fixed now.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@AmiStrn
Copy link
Contributor

AmiStrn commented Aug 1, 2022

@yyyogev Looks like this passed the tests 🎸
@reta @dblock can you review the changes made?

@yyyogev yyyogev requested a review from dblock August 1, 2022 16:59
@AmiStrn
Copy link
Contributor

AmiStrn commented Aug 3, 2022

@dblock - just pinging as this is 2 days since the first approval. can you please check this as well?

@AmiStrn
Copy link
Contributor

AmiStrn commented Aug 8, 2022

@reta is there anyone else who can review this one?

@reta
Copy link
Collaborator

reta commented Aug 8, 2022

@reta is there anyone else who can review this one?

@nknize @andrross could you please take a look? the change seems to be reasonable

@AmiStrn
Copy link
Contributor

AmiStrn commented Aug 14, 2022

Weekly ping :/

@andrross
Copy link
Member

Weekly ping :/

So sorry for the delay. @nknize any concerns here?

Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Clean fix. No concerns at all. Sorry for the delay here and thanks for fixing this! 🎉

@nknize nknize dismissed dblock’s stale review August 15, 2022 20:07

Changes were addressed...

@nknize nknize merged commit ea4cfcc into opensearch-project:main Aug 15, 2022
@AmiStrn
Copy link
Contributor

AmiStrn commented Aug 15, 2022

Nice! Clean fix. No concerns at all. Sorry for the delay here and thanks for fixing this! 🎉

Great news:) Thanks @nknize and @reta !
We would like to backport this since it was in all versions as a bug since 1.0

How would we get this in version 1.x as a bug fix? (We could create the pr's, just asking for the versions required to backport to)

@yyyogev - FYI

@reta
Copy link
Collaborator

reta commented Aug 15, 2022

@AmiStrn a bit late for backport label, but basically we need to cherry-pick the commit to 2.x / 2.2 / 2.1 / 2.0 / 1.x / 1.3 manually (I think).

@andrross andrross added the backport 2.x Backport to 2.x branch label Aug 15, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 15, 2022
Sets analyzer to regex query string search

Signed-off-by: yyyogev <[email protected]>
(cherry picked from commit ea4cfcc)
@andrross andrross added the backport 1.3 Backport to 1.3 branch label Aug 15, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 15, 2022
Sets analyzer to regex query string search

Signed-off-by: yyyogev <[email protected]>
(cherry picked from commit ea4cfcc)
@andrross
Copy link
Member

@reta The labels work even after the PR has been merged :)

@reta
Copy link
Collaborator

reta commented Aug 16, 2022

@reta The labels work even after the PR has been merged :)

Thanks @andrross !

@andrross
Copy link
Member

I think 1.3 is the end of the line for 1.x releases, so backporting there will ensure the bug fix gets in to any future 1.3 release if one happens. 2.x branch will ensure the next minor release gets the fix. I think that should be sufficient? Please let me know if I'm wrong here.

@nknize
Copy link
Collaborator

nknize commented Aug 16, 2022

FYI: I opened #4226 until the feature is back ported since the test is consistently failing bwc CI.

dblock pushed a commit that referenced this pull request Sep 6, 2022
Sets analyzer to regex query string search

Signed-off-by: yyyogev <[email protected]>
(cherry picked from commit ea4cfcc)

Signed-off-by: Daniel (dB.) Doubrovkine <[email protected]>
dblock pushed a commit that referenced this pull request Sep 6, 2022
Sets analyzer to regex query string search

Signed-off-by: yyyogev <[email protected]>
(cherry picked from commit ea4cfcc)

Signed-off-by: Daniel (dB.) Doubrovkine <[email protected]>

Signed-off-by: Daniel (dB.) Doubrovkine <[email protected]>
Co-authored-by: Yogev Mets <[email protected]>
nknize pushed a commit that referenced this pull request Oct 12, 2022
Sets analyzer to regex query string search

Signed-off-by: yogev mets <[email protected]>
(cherry picked from commit ea4cfcc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.3 Backport to 1.3 branch backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Uppercase Regex in String queries search
8 participants