Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Random Ball Cover Algorithm for 2D Haversine/Euclidean #213

Merged
merged 66 commits into from
Sep 23, 2021

Conversation

cjnolet
Copy link
Member

@cjnolet cjnolet commented Apr 26, 2021

This PR is a proof of concept to use the triangle inequality to prune the tree of exhaustive distance computations into something smaller, such as on the order of where c is called an expansion constant, based on the dimensionality.

This should (hopefully) be able to benefit both sparse and dense k-nearest neighbors and all algorithms that use them, hopefully providing a significant speedup for our sparse semirings primitive when only the k-nearest neighbors are desired.

The goal here is to construct a tree out of the random ball cover algorithm such that we can utilize it in algorithms which would otherwise be able to make efficient use of a ball tree. However, there are additional challenges to this algorithm on the GPU, such as being able to batch the tree lookups.

@cjnolet cjnolet requested review from divyegala and a team as code owners April 26, 2021 15:11
@github-actions github-actions bot added the cpp label Apr 26, 2021
@cjnolet cjnolet added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 26, 2021
@cjnolet cjnolet changed the title [WIP] Reducing distance computations for brute-force KNN [WIP] PoC for Random Ball Cover Tree Algorithm Jun 15, 2021
@cjnolet cjnolet requested a review from a team as a code owner June 16, 2021 18:03
@cjnolet cjnolet changed the base branch from branch-21.06 to branch-21.08 June 16, 2021 20:45
@github-actions github-actions bot removed the gpuCI label Jun 18, 2021
@cjnolet cjnolet changed the title [WIP] PoC for Random Ball Cover Tree Algorithm [WIP] PoC for Random Ball Cover Algorithm Jun 18, 2021
@cjnolet
Copy link
Member Author

cjnolet commented Jun 18, 2021

Tentative plan for the random ball cover PoC:

  • 1. Build PoC for Haversine w/ n_cols hardcoded to 2 or 3 and verify perf
  • 2. If 1 is successful, abstract n_cols and k to any size

@ajschmidt8
Copy link
Member

Removing ops-codeowners from the required reviews since it doesn't seem there are any file changes that we're responsible for. Feel free to add us back if necessary.

@ajschmidt8 ajschmidt8 removed the request for review from a team June 18, 2021 20:06
@cjnolet cjnolet changed the title [REVIEW] Random Ball Cover Algorithm [REVIEW] Random Ball Cover Algorithm for 2D Haversine/Euclidean Sep 7, 2021
Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not review any of the faiss parts because that's outside my understanding. Rest of the code looks great, very minor suggestions. Thanks!

cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
Copy link
Contributor

@ChuckHastings ChuckHastings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cjnolet
Copy link
Member Author

cjnolet commented Sep 17, 2021

rerun tests

Copy link
Contributor

@wphicks wphicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! A few items in-line.

cpp/include/raft/spatial/knn/ball_cover.hpp Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/ball_cover.hpp Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/ball_cover_common.h Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/test/spatial/spatial_data.h Show resolved Hide resolved
Copy link
Contributor

@wphicks wphicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks great! I'm pre-approving, but I noticed that cpp/test/spatial/ball_cover.cu, is the only file where we still have fixed width types without a corresponding include. Can we get a quick #include <cstdint> and s/uint32_t/std::uint32_t/g?

@cjnolet
Copy link
Member Author

cjnolet commented Sep 23, 2021

rerun tests

@cjnolet
Copy link
Member Author

cjnolet commented Sep 23, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 9b4adb1 into rapidsai:branch-21.10 Sep 23, 2021
rapids-bot bot pushed a commit to rapidsai/cuml that referenced this pull request Sep 29, 2021
This PR integrates the [random ball cover PoC](rapidsai/raft#213) into cuml's brute-force knn for executing the random ball cover algorithm for haversine distance.

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4003
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
This PR integrates the [random ball cover PoC](rapidsai/raft#213) into cuml's brute-force knn for executing the random ball cover algorithm for haversine distance.

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#4003
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants