Improve RBC eps-neighborhood query performance #2211

Merged
merged 13 commits into from
Mar 11, 2024

Conversation

mfoerste4
Collaborator

@mfoerste4 mfoerste4 commented Mar 5, 2024

This PR significantly improves the performance of the epsilon-neighborhood search via RBC (random ball cover).

Scope:

  • the RBC index now stores a reordered dataset, allowing for better memory access patterns
  • all kernels (dense, sparse, and hybrid) have been improved
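
To illustrate the reordering idea, here is a minimal CPU-side sketch, assuming the points are grouped by their assigned landmark so that each landmark's neighborhood becomes one contiguous range. The names `ReorderedIndex` and `reorder_by_landmark` are hypothetical and are not taken from the RAFT code; the actual index layout in this PR may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical layout for illustration only (not the RAFT index type).
struct ReorderedIndex {
  std::vector<float> points;              // row-major, rows grouped by landmark
  std::vector<std::size_t> offsets;       // rows [offsets[l], offsets[l+1]) belong to landmark l
  std::vector<std::size_t> original_ids;  // original row index of each reordered row
};

// Groups the rows of `data` by landmark with a stable counting sort.
// `landmark_of[i]` is the landmark assigned to row i (assumed to be given).
ReorderedIndex reorder_by_landmark(const std::vector<float>& data,
                                   std::size_t dim,
                                   const std::vector<std::size_t>& landmark_of,
                                   std::size_t n_landmarks)
{
  assert(dim > 0 && data.size() == landmark_of.size() * dim);
  const std::size_t n = landmark_of.size();

  ReorderedIndex out;
  out.offsets.assign(n_landmarks + 1, 0);
  // Count rows per landmark, then turn the counts into start offsets.
  for (std::size_t l : landmark_of) {
    assert(l < n_landmarks);
    ++out.offsets[l + 1];
  }
  std::partial_sum(out.offsets.begin(), out.offsets.end(), out.offsets.begin());

  out.points.resize(n * dim);
  out.original_ids.resize(n);
  // cursor[l] is the next free row inside landmark l's range.
  std::vector<std::size_t> cursor(out.offsets.begin(), out.offsets.end() - 1);
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t dst = cursor[landmark_of[i]]++;
    out.original_ids[dst] = i;
    std::copy_n(data.begin() + i * dim, dim, out.points.begin() + dst * dim);
  }
  return out;
}
```

With such a layout a kernel can walk a landmark's points with consecutive loads instead of gathering rows through an index array, and `original_ids` maps results back to the caller's row numbering.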

Optimizations include:

  • improve pruning of complete landmarks by pre-fetching distances and minimizing the overhead of skipping
  • specialized 2D and 3D kernels, allowing for register pre-fetch of query points
  • improve the inner loop that iterates over a landmark's neighborhood
    • pruning of points within the selected landmark neighborhood using the triangle inequality
    • reverse iteration over the landmark neighborhood, so that processing can stop completely once the remaining points cannot be reached
    • removal of shared memory atomics in favor of warp voting
  • reduced branch divergence within a warp (at least to some extent)
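
The pruning steps listed above can be sketched on the CPU as follows. This is a simplified reading of the description, not the CUDA kernels from this PR: it assumes that each landmark stores its points sorted by ascending distance to the landmark, and the names `Neighborhood`, `l2`, and `eps_query` are hypothetical. For a query q, a landmark L, and a point p, the triangle inequality gives d(q, p) >= |d(q, L) - d(p, L)|, so any point with |d(q, L) - d(p, L)| > eps can be rejected without computing d(q, p). Warp voting and the 2D/3D specializations are not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-landmark neighborhood (illustration only).
struct Neighborhood {
  std::vector<float> landmark;          // landmark coordinates
  std::vector<float> points;            // row-major neighborhood points
  std::vector<float> dist_to_landmark;  // per point, sorted ascending (assumption)
};

float l2(const float* a, const float* b, std::size_t dim)
{
  float s = 0.f;
  for (std::size_t k = 0; k < dim; ++k) { s += (a[k] - b[k]) * (a[k] - b[k]); }
  return std::sqrt(s);
}

// Returns the indices of points within eps of q (farthest-from-landmark first).
// If n_dist is non-null it receives the number of full distance computations.
std::vector<std::size_t> eps_query(const Neighborhood& nb,
                                   const std::vector<float>& q,
                                   float eps,
                                   std::size_t dim,
                                   std::size_t* n_dist = nullptr)
{
  assert(dim > 0 && q.size() == dim && nb.landmark.size() == dim);
  assert(nb.points.size() == nb.dist_to_landmark.size() * dim);

  std::vector<std::size_t> hits;
  std::size_t computed = 0;
  const std::size_t n = nb.dist_to_landmark.size();
  const float dq = l2(q.data(), nb.landmark.data(), dim);

  // Whole-landmark pruning: the last entry is the landmark radius. If even
  // the farthest point cannot be within eps of q, skip the landmark entirely.
  if (n > 0 && dq - nb.dist_to_landmark.back() <= eps) {
    // Reverse iteration: farthest point first.
    for (std::size_t i = n; i-- > 0;) {
      const float dp = nb.dist_to_landmark[i];
      if (dq - dp > eps) { break; }     // all remaining points are even closer to L: stop
      if (dp - dq > eps) { continue; }  // too far out on the other side: skip this point
      ++computed;
      if (l2(q.data(), &nb.points[i * dim], dim) <= eps) { hits.push_back(i); }
    }
  }
  if (n_dist != nullptr) { *n_dist = computed; }
  return hits;
}
```

For a 1D neighborhood with points at 1, 2, 3, 4, and 10 around a landmark at 0, a query at 3.5 with eps = 1 computes only two full distances (for the points at 4 and 3): the point at 10 is skipped, and the loop stops at the point at 2.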

CC @cjnolet , @tfeher

@mfoerste4 mfoerste4 requested a review from a team as a code owner March 5, 2024 15:19
@mfoerste4 mfoerste4 self-assigned this Mar 5, 2024
@github-actions github-actions bot added the cpp label Mar 5, 2024
@mfoerste4 mfoerste4 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change and removed cpp labels Mar 5, 2024
@github-actions github-actions bot added the cpp label Mar 5, 2024
@mfoerste4
Collaborator Author

mfoerste4 commented Mar 7, 2024

Thanks @Nyrio for the review. I have added a couple of minor updates that improve performance slightly further. I still could not find a way to remove the latency of the loads in the distance computation, though.

I am holding off on applying the changes to all kernel variants until we settle on the approach.

Contributor

@tfeher tfeher left a comment

Thanks Malte for the PR, it is great to have a large speedup for the eps-neighborhood kernels. Overall the PR looks good to me.

Please add a few more details about the optimizations to the description, e.g.:

  • pruning of points within selected landmark neighborhood using triangle inequality
  • specialized 2D and 3D kernels

Many thanks to @Nyrio for helping with the optimizations and the review!

@tfeher
Contributor

tfeher commented Mar 11, 2024

/merge

@rapids-bot rapids-bot bot merged commit dddb05a into rapidsai:branch-24.04 Mar 11, 2024
71 checks passed
@mfoerste4 mfoerste4 deleted the rbc_eps_nn_performance branch March 12, 2024 16:42
Labels
cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change