Add vector deduplication for HNSW #3140

Closed
wants to merge 1 commit into from


@heemin32 heemin32 commented Nov 22, 2023

Add vector deduplication for HNSW

@heemin32 heemin32 changed the title Add vector deduplication Add vector deduplication for HNSW Nov 22, 2023
@heemin32 heemin32 force-pushed the dedupe branch 3 times, most recently from c552f96 to e1b3338 Compare November 22, 2023 20:54
@mdouze
Contributor

mdouze commented Nov 23, 2023

Thanks for making the effort to implement this deduplication functionality.
However, as I said in #3087, I don't think it is of sufficient interest.
We are trying to limit the amount of code in Faiss, because more code means more maintenance work.

@heemin32
Author

@mdouze Thanks for the comment. Could you explain how you decide whether there is sufficient interest? OpenSearch uses Faiss for its k-NN search engine, and there is a lot of interest in a document deduplication feature. In other words, what would change your mind about accepting this PR?

@heemin32 heemin32 closed this Jan 25, 2024
@cjnolet
Contributor

cjnolet commented Jan 28, 2024

I don't want to take any attention away from FAISS here, but multi-valued keys have been requested several times in the RAFT / cuVS library. It's on our roadmap to hopefully implement sometime within the year. If the feature ends up performing well, we would be interested in contributing it to FAISS. At the very least, we could write up an example of how to accomplish this functionality using the Boolean filter function.
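For illustration, the filter-based approach mentioned above could look roughly like the sketch below. This is not the code from this PR, nor an API of Faiss, RAFT, or cuVS: a brute-force search stands in for the index, and the helper names (`build_groups`, `filtered_search`, `dedup_search`) are hypothetical. The assumption is that duplicates are exactly identical vectors, that one id per group is kept searchable through a Boolean filter, and that each hit is expanded back to all ids sharing that vector.

```python
from collections import defaultdict
import heapq


def build_groups(vectors):
    """Map one representative id to all ids that hold an identical vector."""
    groups = defaultdict(list)
    for idx, vec in enumerate(vectors):
        groups[tuple(vec)].append(idx)
    # The first id seen for each distinct vector acts as the representative.
    return {ids[0]: ids for ids in groups.values()}


def filtered_search(vectors, query, k, keep):
    """Brute-force k-NN (squared L2) over the ids for which keep(id) is True.

    This stands in for an index search that accepts a Boolean filter.
    """
    scored = (
        (sum((a - b) ** 2 for a, b in zip(vec, query)), idx)
        for idx, vec in enumerate(vectors)
        if keep(idx)
    )
    return heapq.nsmallest(k, scored)


def dedup_search(vectors, query, k):
    """Return up to k distinct vectors, each with every id that shares it."""
    rep_to_members = build_groups(vectors)
    # The filter admits only representatives, so duplicates cannot crowd the top k.
    hits = filtered_search(vectors, query, k, keep=lambda i: i in rep_to_members)
    return [(dist, rep_to_members[idx]) for dist, idx in hits]


vectors = [
    [0.0, 0.0],  # id 0
    [1.0, 1.0],  # id 1
    [0.0, 0.0],  # id 2, duplicate of id 0
    [3.0, 3.0],  # id 3
    [1.0, 1.0],  # id 4, duplicate of id 1
]
results = dedup_search(vectors, query=[0.1, 0.1], k=2)
for dist, ids in results:
    print(ids, dist)
```

With a real index, the grouping would be built once at insertion time rather than on every query, and the filter would be handed to whatever selection mechanism the index exposes.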

4 participants