[FEA] Port CAGRA to RAFT #997
Comments
Can you please share a link to the paper? |
Hi @S-o-T, no paper yet, but I can share a few initial benchmark results with you. Is this helpful? |
Thanks a lot! Looks promising indeed. It would also be nice to understand the cost of index building. I am wondering whether this approach can outperform brute-force matching of SIFT features, in which case the cost of building an index might be a problem. What is the current ETA? |
@S-o-T, the current implementation uses nn-descent in the index building process, so it should be very fast and competitive. I can see about getting benchmarks for index construction as well. The current ETA is around the 23.06 release (June) at the earliest, but more realistically it might be later before it's fully integrated. |
@cjnolet And by that time, is the integration into faiss planned to be finished, so that this approach would be accessible via the faiss API? |
@S-o-T, the timeframe for CAGRA integration into FAISS is tentative at the moment, but yes, it will be accessible eventually. We are more likely to see RAFT's IVF-FLAT/IVF-PQ fully integrated in the 23.06/23.08 timeframe. |
Hi @cjnolet, impressive work! Do we have performance numbers compared to a CPU graph index on a similar dataset? |
@xiaofan-luan unfortunately none that I'm able to share quite yet. I can say we are finding speedup over HNSW for small batch performance. This (free) GTC session will provide more details and benchmarks against HNSW and GGNN. |
@S-o-T @xiaofan-luan I have been given the okay to share some benchmarks that compare CAGRA with HNSW and GGNN: @S-o-T as requested, here are some initial benchmarks of CAGRA index building times against HNSW and GGNN. It builds the indexes using |
The index building time of HNSW seems to be incredibly long compared to the CPU version. But the search performance with larger batches is very impressive. |
@cjnolet Thanks for the slides, looks interesting! Currently, I am most interested in the following scenario: metric = inner product, dim = 128, |database| = 10^4 (also, 2^16), |query| = 10^4, k = 2 (also, k = 1). Both indexing and query processing performance are important, but indexing cost might be amortized if query processing is fast. It would be interesting to compare with the current implementation of faiss GpuFlat and raft's flat alternative (I still have not found time to play with raft's nn primitives). It might also be interesting to assess performance using various scalar types (quantizations), such as f32, f16, u8. |
@S-o-T so it sounds like you are interested in fast ANN at the scale of tens of thousands? Does this computation need to be performed repeatedly or at somewhat interactive/real-time speeds? |
Yes, that's my main interest currently.
Not quite, so giving up a lot of quality to get some "real-time" guarantee is not an option, but trading "some" quality to get substantial gains on latency/throughput might be interesting. |
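For reference, the quality side of this trade-off is usually measured as recall against an exact brute-force search. Below is a minimal sketch in plain Python, assuming the inner-product metric and small k from the scenario above. The helper names `exact_top_k` and `recall_at_k` are hypothetical and are not part of RAFT or faiss, and the demo sizes are scaled down from 10^4 so that it runs quickly on a CPU.

```python
import heapq
import random


def exact_top_k(database, query, k):
    """Indices of the k database vectors with the largest inner product with `query`."""
    scores = (
        (sum(d * q for d, q in zip(vec, query)), idx)
        for idx, vec in enumerate(database)
    )
    # Highest score first; on equal scores the smaller index wins.
    best = heapq.nlargest(k, scores, key=lambda s: (s[0], -s[1]))
    return [idx for _, idx in best]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbours that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_db, n_query, k = 128, 500, 5, 2  # illustrative sizes only
    database = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_db)]
    queries = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_query)]

    # Ground truth from exhaustive search; an ANN index would be scored against this.
    truth = [exact_top_k(database, q, k) for q in queries]
    # Scoring the ground truth against itself gives a recall of 1.0 by construction.
    print(sum(recall_at_k(t, t) for t in truth) / n_query)
```

The neighbour ids returned by any approximate index for the same queries can be passed as `approx_ids` to see how much quality a given latency gain costs.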
CAGRA has officially been removed from experimental status and is ready for use! Please don't hesitate to provide us further feedback! |
Related PR: #1666 |
Cagra is a new state-of-the-art graph-based nearest neighbors method which is currently showing great query performance for both large and small batch sizes. Adding a placeholder here to track its port onto raft primitives.