-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge efficient filtering from feature branch #588
Conversation
* Add initial support for filtering Signed-off-by: Martin Gaievski <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
* Adding serialization/deserialization for filter field in Lucene knn query Signed-off-by: Martin Gaievski <[email protected]>
* Simplify min cluster version lookup Signed-off-by: Martin Gaievski <[email protected]>
* Refactor codec related classes, create KNNCodecVersion abstraction Signed-off-by: Martin Gaievski <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #588 +/- ##
============================================
+ Coverage 83.91% 84.51% +0.60%
- Complexity 1033 1054 +21
============================================
Files 148 149 +1
Lines 4233 4301 +68
Branches 373 382 +9
============================================
+ Hits 3552 3635 +83
+ Misses 507 489 -18
- Partials 174 177 +3
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
return this.clusterService.state().getNodes().getMinNodeVersion(); | ||
} catch (Exception exception) { | ||
log.error( | ||
String.format("Failed to get cluster minimum node version, returning current node version %s instead.", Version.CURRENT), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reasoning here to return Current as opposed to propagating the error up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking on keeping system running, the only reason I can imagine for this.clusterService.state().getNodes().getMinNodeVersion(); call to fail is related to network/transport issues that are stoping one or more nodes from submitting their version info to the cluster state. As we're not changing system state in knn query workflow and we're doing this check for all knn queries even without filter field, I think it's better to assume current version rather than failing.
} | ||
} | ||
|
||
protected String createKnnIndexMappingWithLuceneField(final String fieldName, int dimension) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, let me switch to existing method
validateSearchKNNIndexFailed(testIndex, new KNNQueryBuilder(TEST_FIELD, queryVector, K, TERM_QUERY), K); | ||
break; | ||
case MIXED: | ||
validateSearchKNNIndexFailed(testIndex, new KNNQueryBuilder(TEST_FIELD, queryVector, K, TERM_QUERY), K); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when we start upgrading from 2.4 to 2.5 or 3.x?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll need to disable this test for higher versions similarly to what we're doing for some other IT, this will work for cases when previous version doesn't have filtering and next does have it
request.addParameter("search_type", "query_then_fetch"); | ||
request.setJsonEntity(Strings.toString(builder)); | ||
|
||
expectThrows(ResponseException.class, () -> client().performRequest(request)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if this fails with another unrelated exception? i.e. are we sure this will work in a valid case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible to get same exception type for a different cause. Let me add one more assert for the error message, I think that should be specific to our case
*/ | ||
@NoArgsConstructor(access = AccessLevel.PRIVATE) | ||
@Log4j2 | ||
public class KNNClusterContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to me this is a utility or a helper. Context objects from what I have seen in OpenSearch usually have state associated with them. Is there a better name for this class? Maybe like KNNClusterUtility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, we can change/extend it in case we have more functionality or state in future. I like KNNClusterUtility, once small doubt is that in existing k-nn codebase we do have prefix as "Util", so I'll go with KNNClusterUtil
2608ef5
to
60a27bf
Compare
Signed-off-by: Martin Gaievski <[email protected]>
60a27bf
to
a2b92b1
Compare
@@ -101,8 +112,11 @@ public KNNQueryBuilder(StreamInput in) throws IOException { | |||
fieldName = in.readString(); | |||
vector = in.readFloatArray(); | |||
k = in.readInt(); | |||
if (isClusterOnOrAfterMinRequiredVersion()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a comment somewhere describing what we are doing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, there isn't one, let me add it
Signed-off-by: Martin Gaievski <[email protected]>
* Adding efficient filtering Signed-off-by: Martin Gaievski <[email protected]> (cherry picked from commit f332ccb)
* Merge efficient filtering from feature branch (#588) * Adding efficient filtering Signed-off-by: Martin Gaievski <[email protected]> (cherry picked from commit f332ccb)
Description
Merge from feature branch to main as we got a signoff from PM team.
PRs included in this change:
Issues Resolved
#376
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.