Multi vector support for Faiss HNSW - approximate search only #1371
How are we handling the exact search use case here? I can only see changes for ANN search, but we also run exact search under various conditions.
Update: I missed handling the exact search case. Will update the PR. Hmm, I thought exact search had no issue with multi vector because it searches all the data anyway. Let me verify it again, though.
@heemin32 Can you look into the failing CI?
Not related to my change. Will retry.
@heemin32 Actually, this looks like it might be related: https://github.com/opensearch-project/k-NN/actions/runs/7414868454/job/20176854501?pr=1371
Oops, unused code was checked in. Removed it.
Approved. You may want to rebase the branch onto main to avoid flaky tests.
Codecov Report
Additional details and impacted files
@@                   Coverage Diff                    @@
##        feature/multi-vector    #1371     +/-      ##
==========================================================
+ Coverage                85.12%   85.18%   +0.05%
- Complexity                1258     1261       +3
==========================================================
  Files                      163      163
  Lines                     5110     5115       +5
  Branches                   479      479
==========================================================
+ Hits                      4350     4357       +7
+ Misses                     555      554       -1
+ Partials                   205      204       -1
Try rebasing the feature branch and your fork onto the main branch, since some conflicting changes are being added to main.
Will do.
Apply the parentId filter to the Faiss HNSW search method. This ensures that documents are deduplicated based on their parentId, and the method returns k results for documents with nested fields. Signed-off-by: Heemin Kim <[email protected]>
Merged commit 0abed23 into opensearch-project:feature/multi-vector
* Add patch to support multi vector in faiss (#1358)
  Signed-off-by: Heemin Kim <[email protected]>
* Initialize id_map as null (#1363)
  Signed-off-by: Heemin Kim <[email protected]>
* Add support of multi vector in jni (#1364)
  Signed-off-by: Heemin Kim <[email protected]>
* Multi vector support for Faiss HNSW (#1371)
  Apply the parentId filter to the Faiss HNSW search method. This ensures that documents are deduplicated based on their parentId, and the method returns k results for documents with nested fields.
  Signed-off-by: Heemin Kim <[email protected]>
* Add data generation script for nested field (#1388)
  Signed-off-by: Heemin Kim <[email protected]>
* Add perf test for nested field (#1394)
  Signed-off-by: Heemin Kim <[email protected]>
---------
Signed-off-by: Heemin Kim <[email protected]>
* Add patch to support multi vector in faiss (opensearch-project#1358)
  Signed-off-by: Heemin Kim <[email protected]>
* Initialize id_map as null (opensearch-project#1363)
  Signed-off-by: Heemin Kim <[email protected]>
* Add support of multi vector in jni (opensearch-project#1364)
  Signed-off-by: Heemin Kim <[email protected]>
* Multi vector support for Faiss HNSW (opensearch-project#1371)
  Apply the parentId filter to the Faiss HNSW search method. This ensures that documents are deduplicated based on their parentId, and the method returns k results for documents with nested fields.
  Signed-off-by: Heemin Kim <[email protected]>
* Add data generation script for nested field (opensearch-project#1388)
  Signed-off-by: Heemin Kim <[email protected]>
* Add perf test for nested field (opensearch-project#1394)
  Signed-off-by: Heemin Kim <[email protected]>
---------
Signed-off-by: Heemin Kim <[email protected]>
(cherry picked from commit 709b448)
Description
Apply the parentId filter to the Faiss HNSW search method. This ensures that documents are deduplicated based on their parentId, and the method returns k results for documents with nested fields.
Note: This is for approximate search only. The exact search case will be handled in a follow-up PR.
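To illustrate the intended outcome of deduplicating by parentId, here is a minimal Python sketch. It is not the Faiss patch or the k-NN plugin code: the function name `top_k_by_parent`, the `parent_of` mapping, and the sample data are hypothetical, and the sketch deduplicates a finished candidate list rather than filtering inside the HNSW traversal as the PR does. It only shows why grouping nested vectors by their parent is needed to get k distinct documents.

```python
import heapq
from typing import Dict, Iterable, List, Tuple


def top_k_by_parent(
    candidates: Iterable[Tuple[int, float]],
    parent_of: Dict[int, int],
    k: int,
) -> List[Tuple[int, int, float]]:
    """Return up to k (parent_id, vector_id, distance) tuples, one per parent.

    candidates: (vector_id, distance) pairs, smaller distance is better.
    parent_of:  hypothetical mapping from a nested vector's id to its parent id.
    """
    # Keep only the closest vector seen so far for each parent document.
    best: Dict[int, Tuple[float, int]] = {}
    for vector_id, distance in candidates:
        parent_id = parent_of[vector_id]
        if parent_id not in best or distance < best[parent_id][0]:
            best[parent_id] = (distance, vector_id)

    # Rank the parents by their best distance and keep the k closest.
    ranked = heapq.nsmallest(k, ((d, p, v) for p, (d, v) in best.items()))
    return [(p, v, d) for d, p, v in ranked]


# Vectors 0 and 1 belong to parent 10, vectors 2 and 3 to parent 20.
parent_of = {0: 10, 1: 10, 2: 20, 3: 20, 4: 30}
candidates = [(0, 0.1), (1, 0.2), (2, 0.3), (3, 0.35), (4, 0.9)]
print(top_k_by_parent(candidates, parent_of, k=2))
```

With this sample data, a plain top-2 over vectors would pick vectors 0 and 1, which collapse into a single document (parent 10). Grouping by parent first yields two distinct documents, parents 10 and 20.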
Issues Resolved
#1065
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.