CAGRA pad dataset for 128bit vectorized load #1505
Conversation
Todo:

Marked as breaking change because the PR …

There is still a bug with the new tests, otherwise it is ready for review.

The solution to the compilation error is to add the `dataset_ld` argument:

231:   dataset_size,
   +   dataset_ld,
232:   result_buffer_size,
The changes look good to me.
One comment: it would be better to support a padded dataset in the argument of cagra::sort_knn_graph. What do you think?
I have fixed a bug in the serialization routine, and added more tests. @enp1s0 could you have a quick look again at the changes?
I think it is a good suggestion, and it might also be useful to allow the constructor of index to accept a padded dataset (i.e., a strided mdspan). I would prefer to make these changes in a follow-up PR.
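For context, a strided mdspan over a padded dataset could be built along these lines. A minimal sketch using the `layout_stride` alias that this PR adopts; `d_ptr`, `n_rows`, `n_cols`, and `ld` are illustrative names, not code from this PR:

```cpp
#include <raft/core/device_mdspan.hpp>
#include <array>

// d_ptr points to n_rows * ld floats on the device, where ld >= n_cols is the
// padded row length (leading dimension) in elements.
std::array<int64_t, 2> strides{ld, 1};  // row stride = ld, column stride = 1
auto mapping = raft::layout_stride::mapping<raft::matrix_extent<int64_t>>{
  raft::matrix_extent<int64_t>{n_rows, n_cols}, strides};
auto dataset_view =
  raft::device_matrix_view<const float, int64_t, raft::layout_stride>{d_ptr, mapping};
```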
Thank you, @tfeher, for fixing it and adding tests. The code looks good to me.
LGTM. Just a couple of questions and a suggestion for a future improvement. Since CAGRA is still experimental, we have some leeway for API changes.
    cudaMemcpyDefault,
    resource::get_cuda_stream(res)));
  resource::sync_stream(res);
  serialize_mdspan(res, os, host_dataset.view());
Mostly a side question, but are we still planning to remove the dataset from the serialization in a future change?
      resource::get_cuda_stream(res));
  } else {
    // copy with padding
    RAFT_CUDA_TRY(cudaMemsetAsync(
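For review context, the copy-with-padding step boils down to zero-filling the destination and then doing a pitched 2D copy. A minimal sketch with plain CUDA runtime calls; the helper name and include path are illustrative, not the exact code in this PR:

```cpp
#include <raft/util/cudart_utils.hpp>  // RAFT_CUDA_TRY (include path may vary by RAFT version)

template <typename T>
void copy_with_padding(
  T* dst, const T* src, size_t n_rows, size_t n_cols, size_t ld, cudaStream_t stream)
{
  // Zero-fill so the padding tail of every row is well defined ...
  RAFT_CUDA_TRY(cudaMemsetAsync(dst, 0, n_rows * ld * sizeof(T), stream));
  // ... then copy each packed source row to an ld-element pitched destination row.
  RAFT_CUDA_TRY(cudaMemcpy2DAsync(dst,
                                  ld * sizeof(T),      // destination pitch in bytes
                                  src,
                                  n_cols * sizeof(T),  // source pitch in bytes
                                  n_cols * sizeof(T),  // width of each row in bytes
                                  n_rows,              // number of rows
                                  cudaMemcpyDefault,
                                  stream));
}
```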
If this is going to be a more common practice, I wonder if we should consider centralizing this somewhere eventually. Probably doesn't need to be done yet, or even in this PR, though.
@@ -569,7 +546,7 @@ struct search : public search_plan_impl<DATA_T, INDEX_T, DISTANCE_T> {
   ~search() {}

   void operator()(raft::resources const& res,
-                  raft::device_matrix_view<const DATA_T, INDEX_T, row_major> dataset,
+                  raft::device_matrix_view<const DATA_T, INDEX_T, layout_stride> dataset,
Would there be any benefit to using a padded layout here or having an overload for it in the public API just to simplify the conversion?
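As an illustration of that suggestion, a hypothetical convenience overload (not part of this PR) could accept a dense row-major dataset and forward it as the stride == extent(1) special case:

```cpp
// Hypothetical wrapper: a packed row-major matrix is a layout_stride matrix
// whose row stride equals its number of columns.
void operator()(raft::resources const& res,
                raft::device_matrix_view<const DATA_T, INDEX_T, row_major> dataset
                /* , ...remaining arguments... */)
{
  std::array<INDEX_T, 2> strides{dataset.extent(1), INDEX_T{1}};
  auto mapping = raft::layout_stride::mapping<raft::matrix_extent<INDEX_T>>{
    raft::matrix_extent<INDEX_T>{dataset.extent(0), dataset.extent(1)}, strides};
  (*this)(res,
          raft::device_matrix_view<const DATA_T, INDEX_T, raft::layout_stride>{
            dataset.data_handle(), mapping}
          /* , ... */);
}
```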
/merge
This PR adds padding to the dataset (if necessary) so that any of its rows can be read with 128-bit vectorized loads. This change also enables handling an arbitrary number of input features (before this PR, each row had to be at least 64-bit aligned, which constrained the acceptable number of input features).
Fixes #1458.
With this change, it is sufficient to keep a single "load type" specialization for the search kernels, which should roughly halve the binary size (#1459).
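For reference, the padding in question amounts to rounding each row up to the next 16-byte (128-bit) boundary. A minimal sketch of how the padded leading dimension can be computed; this is an illustrative helper, not the exact code in this PR:

```cpp
#include <cstddef>

// Smallest row length (in elements of T) >= n_cols whose byte size is a
// multiple of 16 bytes, so that every row starts 128-bit aligned
// (assumes sizeof(T) divides 16, which holds for the usual 1/2/4/8-byte types).
template <typename T>
constexpr std::size_t padded_ld(std::size_t n_cols)
{
  constexpr std::size_t vec_bytes = 16;  // 128 bits
  const std::size_t row_bytes     = n_cols * sizeof(T);
  return (row_bytes + vec_bytes - 1) / vec_bytes * vec_bytes / sizeof(T);
}

// Example: padded_ld<float>(33) == 36  (132 bytes rounds up to 144 bytes = 36 floats).
```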