CAGRA #1375

tfeher · 2023-03-27T11:51:46Z

This PR adds CAGRA, a graph-based method for nearest neighbor search.

tfeher · 2023-03-27T12:39:45Z

Todo:

refactor search params
memory allocation using RMM in search
~~memory allocations in prune~~ moved to follow-up
extend test coverage
raft logging is missing from a few places
add serialization
~~revise multi GPU build/prune~~ moved to [FEA] CAGRA follow up #1392

cpp/src/neighbors/cagra/make_search_cores.sh

tfeher · 2023-04-05T12:38:56Z

The latest commits limit the binary size increase to 125 MiB (see the ninja_log).

There is an issue with the specializations: they are not used for build / search, which results in extra long compile times for these. I am considering removing the high-level specializations and keeping only those for the lower-level kernels. We can re-introduce them once we ensure that the RAFT_COMPILED flag is visible during compilation of raft_lib.

The ninja log is really very useful for keeping track of these issues!

divyegala

No specializations for this version?

cpp/include/raft/neighbors/cagra.cuh

divyegala · 2023-04-05T21:16:53Z

cpp/include/raft/neighbors/cagra_types.hpp

+    raft::copy(dataset_.data_handle(), dataset.data_handle(), dataset.size(), res.get_stream());
+    raft::copy(graph_.data_handle(), knn_graph.data_handle(), knn_graph.size(), res.get_stream());


Are these copies necessary if the dataset is already on device?

No, it is not strictly necessary. One could keep a reference to the data in that case. The reason for copying the data is to ensure that the index owns all the data needed for search, similar to how the other ANN indices (ivf_flat, ivf_pq) own their data.

We should note that the index structure is more complex for the IVF methods, which makes it necessary for them to copy the data into their specialized structure. In contrast, the CAGRA index is a simple data structure (just a wrapper for two matrices), so we could even consider constructing an index on the fly from the two matrices.

Yes, I think we can go about this a couple of ways:

1. Always copy, as is currently happening.
2. Copy just the view, which would be like maintaining a reference to the underlying data.
3. Construct the index on the fly if it's sufficiently cheap.
4. Let the user pick which option to use from above. For example, if the user can guarantee that the data will stay in scope, then I think they might go with 2.

Could you please add this to your follow-on tasks as well? We can chat more later about how we want to go about this.

Thanks for the suggestions! I have added an item to #1392.
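For readers following this thread, the difference between the "always copy" and "copy just the view" options can be sketched on the host side as below. This is only an illustration under stated assumptions: `matrix_view`, `owning_index`, and `view_index` are hypothetical names that do not exist in RAFT, and `std::vector` stands in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of a row-major matrix
// (a host-side stand-in for a device matrix view).
template <typename T>
struct matrix_view {
  const T* data;
  std::size_t rows;
  std::size_t cols;
};

// "Always copy": the index copies both matrices and owns them,
// so it stays valid even if the caller's buffers change or go away.
template <typename T, typename IdxT>
struct owning_index {
  std::vector<T> dataset;
  std::vector<IdxT> graph;

  owning_index(matrix_view<T> d, matrix_view<IdxT> g)
    : dataset(d.data, d.data + d.rows * d.cols),
      graph(g.data, g.data + g.rows * g.cols)
  {
    // One graph row per dataset row.
    assert(d.rows == g.rows);
  }
};

// "Copy just the view": the index only stores the two views.
// No data is copied, but the caller must keep the buffers alive
// for as long as the index is used.
template <typename T, typename IdxT>
struct view_index {
  matrix_view<T> dataset;
  matrix_view<IdxT> graph;
};
```

The trade-off is the one discussed above: the owning variant pays for two copies but has no lifetime requirements, while the view variant is cheap to construct and relies on the user guaranteeing that the data stays in scope.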

divyegala · 2023-04-05T21:17:41Z

cpp/include/raft/util/cache_util.cuh

@@ -50,7 +50,7 @@ __global__ void get_vecs(
  if (tid < n_vec * n) {
    size_t out_col   = tid / n_vec;  // col idx
    size_t cache_col = cache_idx[out_col];
-    if (cache_idx[out_col] >= 0) {
+    if (!std::is_signed<idx_t>::value || cache_idx[out_col] >= 0) {


Suggested change

-    if (!std::is_signed<idx_t>::value || cache_idx[out_col] >= 0) {
+    if constexpr (!std::is_signed<idx_t>::value || cache_idx[out_col] >= 0) {

Unfortunately, that does not compile because cache_idx[out_col] is not a compile-time constant.

Ah. You can write it as two if-conditions then, right? if constexpr as the outer if and then the inner condition which is a plain if. Anyway, this isn't blocking by any means. I will leave it up to you to decide what way you want to do it.
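A minimal host-side sketch of the two-level form suggested here, assuming C++17. The function `slot_is_valid` is a hypothetical stand-in for the check inside `get_vecs`, not code from the PR: the outer `if constexpr` depends only on the type, and the inner plain `if` handles the run-time value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper: returns true when the cache slot at `col`
// should be read. Equivalent to
//   !std::is_signed<idx_t>::value || cache_idx[col] >= 0
template <typename idx_t>
bool slot_is_valid(const std::vector<idx_t>& cache_idx, std::size_t col)
{
  assert(col < cache_idx.size());
  if constexpr (std::is_signed_v<idx_t>) {
    // Only signed index types can hold a negative entry, so the
    // run-time comparison is compiled only for them.
    if (cache_idx[col] < 0) { return false; }
  }
  // For unsigned idx_t the block above is discarded at compile time,
  // so no always-true comparison is ever generated.
  return true;
}
```

As noted in the thread, this is a stylistic choice rather than a functional one: the single-condition version with `std::is_signed<idx_t>::value` gives the same result.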

benfred

I'm super excited about this change - but there is a ton of new code here, which makes this a bit hard to review.

Given that this is an experimental feature (and in an experimental namespace) - I think we can probably merge this in so other people can start trying this out. @divyegala @cjnolet what do you think?

I left some minor comments on this below.

cpp/include/raft/neighbors/detail/cagra/topk_for_cagra/topk.h

cpp/include/raft/neighbors/cagra_types.hpp

cpp/test/CMakeLists.txt

cpp/include/raft/neighbors/detail/cagra/hashmap.hpp

cpp/include/raft/neighbors/detail/cagra/bitonic.hpp

tfeher · 2023-04-05T21:46:09Z

No specializations for this version?

In an offline discussion, @cjnolet recommended leaving out all the specializations. Even with limited float support, the specializations would add 125 MiB (+20%) to libraft.so. Let's hope it won't time out while compiling the test.

tfeher

Thanks @divyegala and @benfred for the review! I have addressed the issues.

Indeed, there is a lot of code in the implementation details. That part will be improved in the follow-up, to re-use existing utilities from RAFT, improve the structure, and add developer docs.

cpp/include/raft/neighbors/cagra.cuh

cpp/include/raft/neighbors/cagra_types.hpp

tfeher · 2023-04-05T22:31:59Z


cpp/include/raft/neighbors/detail/cagra/bitonic.hpp

cpp/include/raft/neighbors/detail/cagra/hashmap.hpp

cpp/test/CMakeLists.txt

benfred

thanks for the changes @tfeher!

divyegala

Thanks @tfeher for this PR!

I agree with @benfred that we can merge this for now and do more iterations on reviews for implementation details as this feature makes it out of experimental. So for now, I just tried to review the public facing code.

benfred · 2023-04-06T05:25:56Z

/merge

This PR adds CAGRA, a graph-based method for nearest neighbor search.

Authors:
- Tamas Bela Feher (https://github.com/tfeher)
- Corey J. Nolet (https://github.com/cjnolet)
- Ben Frederickson (https://github.com/benfred)

Approvers:
- Ben Frederickson (https://github.com/benfred)
- Divye Gala (https://github.com/divyegala)

URL: rapidsai#1375

CAGRA was introduced as header-only in #1375. This PR adds precompiled single- and multi-CTA search kernels to libraft.so.

The single- and multi-CTA search kernels were moved to separate header files to make it easier to specify extern template instantiations for them. The macros for dispatching the kernels were replaced by functions. We define explicit instantiations for the top-level dispatch functions. (This is in contrast to #1428, where the kernels themselves were instantiated, which resulted in a large number of parameter combinations that had to be explicitly spelled out.)

This PR fixes #1443.

Authors:
- Tamas Bela Feher (https://github.com/tfeher)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1650
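The idea of instantiating only a top-level dispatch function, rather than every kernel parameter combination, can be sketched in plain C++ as follows. This is a simplified illustration, not the RAFT code: `search_kernel_stand_in`, `search_dispatch`, and the `team_size` values are hypothetical, and the "kernel" body is a placeholder computation.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

namespace detail {
// Placeholder for a kernel templated on data type and a compile-time
// parameter. The arithmetic is meaningless; it only makes the selected
// variant observable.
template <typename T, int TeamSize>
T search_kernel_stand_in(const std::vector<T>& queries)
{
  T acc = T{0};
  for (const T& q : queries) { acc += q; }
  return acc + static_cast<T>(TeamSize);
}
}  // namespace detail

// Top-level dispatch: maps a run-time parameter to a compile-time one.
// All kernel variants for a given T are instantiated implicitly
// wherever this function is instantiated.
template <typename T>
T search_dispatch(const std::vector<T>& queries, int team_size)
{
  switch (team_size) {
    case 8: return detail::search_kernel_stand_in<T, 8>(queries);
    case 16: return detail::search_kernel_stand_in<T, 16>(queries);
    default: throw std::invalid_argument("unsupported team_size");
  }
}

// In a header: promise that the float instantiation exists elsewhere,
// so translation units including the header do not compile it again.
extern template float search_dispatch<float>(const std::vector<float>&, int);

// In exactly one source file of the library: the explicit instantiation.
// One line per data type covers every team_size variant.
template float search_dispatch<float>(const std::vector<float>&, int);
```

In a real library the `extern template` declaration and the explicit instantiation would live in different files; they are shown together here only so that the sketch compiles as a single unit.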

github-actions bot added CMake cpp labels Mar 27, 2023

Add CAGRA, initial experimental version

8793adb

tfeher force-pushed the cagra_experimental branch from 38ce3f6 to 8793adb Compare March 27, 2023 12:12

cjnolet added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Mar 27, 2023

cjnolet assigned tfeher Mar 27, 2023

tfeher added 3 commits March 27, 2023 16:44

Restructuring search params in progress

2748758

replacing printf statements with RAFT_LOG_DEBUG

c51cc7a

remove topk.cu

25d35ad

tfeher commented Mar 27, 2023

cpp/src/neighbors/cagra/make_search_cores.sh (outdated)

tfeher added 13 commits March 28, 2023 18:27

Fix logging, revert some of the search_params refactoring

9adb9b0

adding specializations

9dd0d46

corrections

d844e78

Enabled test for distance values, test team size

7c7819c

added int8 and uint8 test and specializations

7991a56

correct copyright year for test files

eb46fcf

temporarily disabling int8 & uint8 tests

60dfb3d

Adding new search_plan

1be9514

single_cta params factored out

7e8ba3f

Single cta plan creation works

e7cd010

all search configs added to plan

72d2dff

refactored compiles

0e30822

Search dispatch refatored

48a5161

tfeher marked this pull request as ready for review March 31, 2023 07:51

tfeher requested review from a team as code owners March 31, 2023 07:51

tfeher added 2 commits March 31, 2023 09:55

Remove old search dispatch

af96473

Remove old search specialization

b9639a3

tfeher added 4 commits April 5, 2023 00:19

Reorder search params

64a898e

Fix style errors

e3639a6

Fix style

3983147

Fix typo

c52dd33

tfeher added 3 commits April 5, 2023 21:11

Add resources arg to prune

7a249b5

Remove all cagra specializations

4afb03e

Remove unused cagra specialization header

93c470a

divyegala reviewed Apr 5, 2023

benfred reviewed Apr 5, 2023

tfeher added 4 commits April 6, 2023 00:05

Make refine_rate arg std::optional

f962f22

Replace hashmap_mode string with enum

ccbe925

Only keep test file for float data type

619666c

Add constxpr

1df4859

tfeher commented Apr 5, 2023

Remove constexpr

c9be192

benfred approved these changes Apr 6, 2023

divyegala approved these changes Apr 6, 2023

Merge branch 'branch-23.04' into cagra_experimental

7f73aa9

rapids-bot bot merged commit a6e0e6b into rapidsai:branch-23.04 Apr 6, 2023

This was referenced Apr 18, 2023

CAGRA template instantiations #1428

Closed

Remove dataset from CAGRA index #1435

Closed

This was referenced May 1, 2023

[FEA] CAGRA avoid unnecessary device memory copies of dataset / knn graph #1479

Closed

Cagra index construction without copying device mdarrays #1494

Merged

tfeher mentioned this pull request Jul 17, 2023

Cagra template instantiations #1650

Merged
