Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filter option for query type #88

Merged
merged 2 commits into from
Dec 21, 2022

Conversation

jmazanec15
Copy link
Member

Description

Adds filter option for query type. Filtering support was introduced in the k-NN plugin in 2.4. This change breaks backwards compatibility with OpenSearch 2.4, however, given that 2.4 is experimental, this is okay. Backwards
compatibility issues will only arise during mixed cluster upgrade.

Query will look like:

     * {
     *  "VECTOR_FIELD": {
     *    "query_text": "string",
     *    "model_id": "string",
     *    "k": int,
     *    "name": "string", (optional)
     *    "boost": float (optional),
     *    "filter": map (optional)
     *  }
     * }

Issues Resolved

#86

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Adds filter option for query type. Filtering support was introduced in
the k-NN plugin in 2.4. Breaks backwards compatibility with OpenSearch
2.4, however, given that 2.4 is experimental, this is okay. Backwards
  compatibility issues will only arise during mixed cluster upgrade.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 added Enhancements Increases software capabilities beyond original client specifications backport 2.x Label will add auto workflow to backport PR to 2.x branch labels Dec 20, 2022
@jmazanec15 jmazanec15 requested a review from a team December 20, 2022 00:29
@jmazanec15 jmazanec15 changed the title Add filter option for query type. Add filter option for query type Dec 20, 2022
@SuppressWarnings("unchecked")
protected int getHitCount(Map<String, Object> searchResponseAsMap) {
Map<String, Object> hits1map = (Map<String, Object>) searchResponseAsMap.get("hits");
List<Object> hits2List = (List<Object>) hits1map.get("hits");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't that var be hits1List?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, will update.

@@ -52,6 +62,8 @@ public class NeuralQueryBuilderTests extends OpenSearchTestCase {
private static final String QUERY_NAME = "queryName";
private static final Supplier<float[]> TEST_VECTOR_SUPPLIER = () -> new float[10];

private static final QueryBuilder TEST_FILTER = new MatchAllQueryBuilder();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice query, I missed it when writing filtering for knn

@navneet1v
Copy link
Collaborator

navneet1v commented Dec 20, 2022

Question:
What is the behaviour of the neural query if we move the filter out of the neural clause? This might not be related to this change, but I want to understand the behavior to see if there is some case that we might be missing.
@jmazanec15

@jmazanec15
Copy link
Member Author

Hi @navneet1v could you give an example of what you mean by out of neural clause?

Signed-off-by: John Mazanec <[email protected]>
@navneet1v
Copy link
Collaborator

Hi @navneet1v could you give an example of what you mean by out of neural clause?

For a case like this, what will happen:

{
    "query": {
        "bool": {
            "neural": {
                "body_knn": {
                    "query_text": "How much does it cost to pour concrete for a deck",
                    "model_id": "someModelId",
                    "filter": [ 
                        { "term":  { "status": "someValue1" }}
                    ]
                }
            },
            "filter": [ 
                { "term":  { "status": "someValue2" }}
            ]
        }
    }
}

@jmazanec15
Copy link
Member Author

jmazanec15 commented Dec 21, 2022

@navneet1v Ah I see for the boolean clause. In this case, no results would be returned. The neural query would only return results with:

{ "term":  { "status": "someValue1" }}

However, after this is returned in bool clause, the bool filter would be run after this, selecting only from:

{ "term":  { "status": "someValue2" }}

@jmazanec15 jmazanec15 merged commit de551e2 into opensearch-project:main Dec 21, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 21, 2022
Adds filter option for query type. Filtering support was introduced in
the k-NN plugin in 2.4. Breaks backwards compatibility with OpenSearch
2.4, however, given that 2.4 is experimental, this is okay. Backwards
  compatibility issues will only arise during mixed cluster upgrade.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit de551e2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Label will add auto workflow to backport PR to 2.x branch Enhancements Increases software capabilities beyond original client specifications
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants