[REVIEW] Random Ball Cover Algorithm for 2D Haversine/Euclidean #213
Tentative plan for the random ball cover PoC:
I did not review any of the faiss parts because that's outside my understanding. Rest of the code looks great, very minor suggestions. Thanks!
LGTM
version and all supported configurations (all knn vs separate index/query)
rerun tests
Looking good! A few items in-line.
All looks great! I'm pre-approving, but I noticed that cpp/test/spatial/ball_cover.cu is the only file where we still have fixed-width types without a corresponding include. Can we get a quick `#include <cstdint>` and `s/uint32_t/std::uint32_t/g`?
rerun tests
@gpucibot merge
This PR integrates the [random ball cover PoC](rapidsai/raft#213) into cuml's brute-force knn for executing the random ball cover algorithm for haversine distance.

Authors:
- Corey J. Nolet (https://github.com/cjnolet)
- Dante Gama Dessavre (https://github.com/dantegd)
- Paul Taylor (https://github.com/trxcllnt)

Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #4003
This PR is a proof of concept to use the triangle inequality to prune the tree of exhaustive distance computations into something smaller, such as on the order of where c is called an expansion constant, based on the dimensionality.
This should (hopefully) benefit both sparse and dense k-nearest neighbors and all algorithms that use them, and could provide a significant speedup for our sparse semirings primitive when only the k-nearest neighbors are desired.
The goal here is to construct a tree out of the random ball cover algorithm so that it can be used in algorithms that could otherwise make efficient use of a ball tree. However, this algorithm poses additional challenges on the GPU, such as batching the tree lookups.
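To make the triangle-inequality pruning described above concrete, here is a minimal CPU-only sketch. It is not the RAFT implementation: it handles only 2D Euclidean distance and a single nearest neighbor, and it picks roughly `sqrt(n)` random landmarks, which is an assumption made for illustration. All names (`Point2D`, `BallCoverIndex`, `build_index`, `query_1nn`) are hypothetical. The same pruning bound applies to any true metric, haversine included.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

struct Point2D { double x; double y; };

inline double euclidean(const Point2D& a, const Point2D& b) {
  return std::hypot(a.x - b.x, a.y - b.y);
}

// Hypothetical index layout, for illustration only.
struct BallCoverIndex {
  std::vector<Point2D> points;
  std::vector<std::size_t> landmarks;             // ids of sampled landmark points
  std::vector<std::vector<std::size_t>> members;  // point ids owned by each landmark
  std::vector<double> radii;                      // max landmark-to-member distance
};

BallCoverIndex build_index(const std::vector<Point2D>& points, std::uint32_t seed) {
  assert(!points.empty());
  BallCoverIndex index;
  index.points = points;
  const std::size_t n = points.size();
  // Assumption: about sqrt(n) landmarks, sampled without replacement.
  const auto n_landmarks =
    static_cast<std::size_t>(std::ceil(std::sqrt(static_cast<double>(n))));
  std::vector<std::size_t> ids(n);
  std::iota(ids.begin(), ids.end(), std::size_t{0});
  std::mt19937 rng(seed);
  std::shuffle(ids.begin(), ids.end(), rng);
  index.landmarks.assign(ids.begin(), ids.begin() + n_landmarks);
  index.members.resize(n_landmarks);
  index.radii.assign(n_landmarks, 0.0);

  // Assign every point to its closest landmark and track each ball's radius.
  for (std::size_t i = 0; i < n; ++i) {
    std::size_t best = 0;
    double best_d = std::numeric_limits<double>::infinity();
    for (std::size_t l = 0; l < n_landmarks; ++l) {
      const double d = euclidean(points[i], points[index.landmarks[l]]);
      if (d < best_d) { best_d = d; best = l; }
    }
    index.members[best].push_back(i);
    index.radii[best] = std::max(index.radii[best], best_d);
  }
  return index;
}

// Returns the id of a nearest neighbor of q. If n_dists is non-null, it
// receives the number of distance evaluations performed.
std::size_t query_1nn(const BallCoverIndex& index, const Point2D& q,
                      std::size_t* n_dists = nullptr) {
  const std::size_t n_landmarks = index.landmarks.size();
  std::vector<double> d_landmark(n_landmarks);
  std::size_t evals = 0;
  for (std::size_t l = 0; l < n_landmarks; ++l) {
    d_landmark[l] = euclidean(q, index.points[index.landmarks[l]]);
    ++evals;
  }
  // Visit the closest balls first so the bound tightens early.
  std::vector<std::size_t> order(n_landmarks);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(),
            [&](std::size_t a, std::size_t b) { return d_landmark[a] < d_landmark[b]; });

  std::size_t best_id = index.landmarks[order[0]];
  double best_d = d_landmark[order[0]];
  for (std::size_t l : order) {
    // Triangle inequality: every member p of ball l satisfies
    // d(q, p) >= d(q, landmark) - radius, so the ball cannot improve on best_d.
    if (d_landmark[l] - index.radii[l] >= best_d) continue;
    for (std::size_t p : index.members[l]) {
      const double d = euclidean(q, index.points[p]);
      ++evals;
      if (d < best_d) { best_d = d; best_id = p; }
    }
  }
  if (n_dists != nullptr) *n_dists = evals;
  return best_id;
}
```

Sorting the landmarks by distance to the query is a design choice in this sketch: the closest balls are scanned first, which tends to shrink the current best distance quickly and lets more of the remaining balls be skipped. A GPU version would additionally need to batch these lookups across many queries, which this sketch does not attempt.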