Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAGRA #1375

Merged
merged 47 commits into from
Apr 6, 2023
Merged

CAGRA #1375

merged 47 commits into from
Apr 6, 2023

Conversation

tfeher
Copy link
Contributor

@tfeher tfeher commented Mar 27, 2023

This PR adds CAGRA, a graph based method for nearest neighbor search.

@tfeher tfeher force-pushed the cagra_experimental branch from 38ce3f6 to 8793adb Compare March 27, 2023 12:12
@cjnolet cjnolet added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Mar 27, 2023
@tfeher
Copy link
Contributor Author

tfeher commented Mar 27, 2023

Todo:

  • refactor search params
  • memory allocation using RMM in search
  • memory allocations in prune moved to follow-up
  • extend test coverage
  • raft logging is missing from a few places
  • add serialization
  • revise multi GPU build/prune moved to [FEA] CAGRA follow up #1392

@tfeher tfeher marked this pull request as ready for review March 31, 2023 07:51
@tfeher tfeher requested review from a team as code owners March 31, 2023 07:51
@tfeher
Copy link
Contributor Author

tfeher commented Apr 5, 2023

The latest commits limits the binary size increase to 125 MiB. ninja_log.

There is an issue with the specializations, they are not used for build / search, resulting in extra long compile time for these. I am considering to remove the high level specializations, and only leave for the lower level kernels. We can re-introduce these once we ensure that the RAFT_COMPILED flag is visible during compilation of raft_lib.

The ninja log is really very useful for keeping track of these issues!

Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No specializations for this version?

cpp/include/raft/neighbors/cagra.cuh Show resolved Hide resolved
cpp/include/raft/neighbors/cagra.cuh Outdated Show resolved Hide resolved
Comment on lines +184 to +185
raft::copy(dataset_.data_handle(), dataset.data_handle(), dataset.size(), res.get_stream());
raft::copy(graph_.data_handle(), knn_graph.data_handle(), knn_graph.size(), res.get_stream());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these copies necessary if the dataset is already on device?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not strictly necessary. one could keep a reference for the data in that case. The reason for copying the data is to ensure that the index owns all the data needed for search, similarly how the other ann indices (ivf_flat, ivf_pq) owns the data.

We should note, that the index structure is more complex for ivf methods, which makes it necessary for them to copy the data into their specialized structure. Is contrast, the cagra index is a simple data structure (just a wrapper for two matrices), therefore we could even think about constructing an index on the fly from the two matrices.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we can go about this a couple of ways:

  1. Always copy, as is currently happening
  2. Copy just the view, which would be like maintaining a reference to the underlying data
  3. Construct the index on the fly if it's sufficiently cheap
  4. Let the user pick which option to use from above. For example, if the user can guarantee that the data will stay in scope then I think they might go with 2

Could you please add this to your follow-on tasks as well? We can chat more later about how we want to go about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestions! I have added an item to to #1392

@@ -50,7 +50,7 @@ __global__ void get_vecs(
if (tid < n_vec * n) {
size_t out_col = tid / n_vec; // col idx
size_t cache_col = cache_idx[out_col];
if (cache_idx[out_col] >= 0) {
if (!std::is_signed<idx_t>::value || cache_idx[out_col] >= 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!std::is_signed<idx_t>::value || cache_idx[out_col] >= 0) {
if constexpr (!std::is_signed<idx_t>::value || cache_idx[out_col] >= 0) {

Copy link
Contributor Author

@tfeher tfeher Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, that does not compile because cache_idx[out_col] is not compile time constant.

Copy link
Member

@divyegala divyegala Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. You can write it as two if-conditions then, right? if constexpr as the outer if and then the inner condition which is a plain if. Anyway, this isn't blocking by any means. I will leave it up to you to decide what way you want to do it.

Copy link
Member

@benfred benfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm super excited about this change - but there is a ton of new code here, which makes this a bit hard to review.

Given that this is an experimental feature (and in an experimental namespace) - I think we can probably merge this in so other people can start trying this out. @divyegala @cjnolet what do you think?

I left some minor comments on this below,

cpp/include/raft/neighbors/cagra_types.hpp Outdated Show resolved Hide resolved
cpp/test/CMakeLists.txt Show resolved Hide resolved
@tfeher
Copy link
Contributor Author

tfeher commented Apr 5, 2023

No specializations for this version?

In an offline discussion @cjnolet recommended to leave out all the specializaitions. Even with limited float support the specializations would add 125 MiB (+20%) to libraft.so. Let's hope it won't timeout while compiling the test.

Copy link
Contributor Author

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @divyegala and @benfred for the review! I have addressed the issues.

Indeed there is a lot of code in the implementation details. That part will be improved in the follow up, to re-use already existing utilities from raft, improve structure and add developer docs.

cpp/include/raft/neighbors/cagra.cuh Outdated Show resolved Hide resolved
cpp/include/raft/neighbors/cagra_types.hpp Outdated Show resolved Hide resolved
Comment on lines +184 to +185
raft::copy(dataset_.data_handle(), dataset.data_handle(), dataset.size(), res.get_stream());
raft::copy(graph_.data_handle(), knn_graph.data_handle(), knn_graph.size(), res.get_stream());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not strictly necessary. one could keep a reference for the data in that case. The reason for copying the data is to ensure that the index owns all the data needed for search, similarly how the other ann indices (ivf_flat, ivf_pq) owns the data.

We should note, that the index structure is more complex for ivf methods, which makes it necessary for them to copy the data into their specialized structure. Is contrast, the cagra index is a simple data structure (just a wrapper for two matrices), therefore we could even think about constructing an index on the fly from the two matrices.

cpp/test/CMakeLists.txt Show resolved Hide resolved
Copy link
Member

@benfred benfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the changes @tfeher!

Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tfeher for this PR!

I agree with @benfred that we can merge this for now and do more iterations on reviews for implementation details as this feature makes it out of experimental. So for now, I just tried to review the public facing code.

@benfred
Copy link
Member

benfred commented Apr 6, 2023

/merge

@rapids-bot rapids-bot bot merged commit a6e0e6b into rapidsai:branch-23.04 Apr 6, 2023
ahendriksen pushed a commit to ahendriksen/raft that referenced this pull request Apr 14, 2023
This PR adds CAGRA, a graph based method for nearest neighbor search.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Ben Frederickson (https://github.com/benfred)
  - Divye Gala (https://github.com/divyegala)

URL: rapidsai#1375
ahendriksen pushed a commit to ahendriksen/raft that referenced this pull request Apr 18, 2023
This PR adds CAGRA, a graph based method for nearest neighbor search.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Ben Frederickson (https://github.com/benfred)
  - Divye Gala (https://github.com/divyegala)

URL: rapidsai#1375
rapids-bot bot pushed a commit that referenced this pull request Jul 19, 2023
Cagra was introduced header only in #1375. This PR adds a precompiled single- and multi-cta search kernels to libraft.so. 

The single- and multi-cta search kernels were moved to separate header files to make it easier to specify extern template instantiations for these. 

The macros for dispatching the kernels were replaced by functions. We define explicit instantiations for the top level dispatch functions. (This is in contrast to #1428 where the kernels themselves were instantiated, which resulted in a large number of parameter combinations that had to be explicitly spelled out.)

This PR fixes #1443.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1650
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
Development

Successfully merging this pull request may close these issues.

5 participants