ivf-pq integration: hotfixes (#891)
A second set of hotfixes for ivf-pq:

  1. `max_samples` is calculated from `cluster_offsets` inside the kernel, which reduces the number of kernel arguments and the register pressure when the value is not needed. The kernel-side calculation was wrong: it did not match the host-side calculation (it omitted the round-up to a multiple of 128), which led to incorrect offsets inside the filled distance array.
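
To illustrate why the missing round-up shifts the offsets, here is a minimal host-side sketch. It assumes the host sizes each query's row of `out_scores` as the total sample count rounded up to a multiple of 128, as the message above implies. The helpers `round_up_128`, `row_offset` and `check_stride` and the sample count of 300 are hypothetical and are not part of RAFT; `round_up_128` only stands in for `Pow2<128>::roundUp`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Pow2<128>::roundUp:
// round x up to the next multiple of 128 (128 is a power of two).
constexpr std::uint32_t round_up_128(std::uint32_t x) { return (x + 127u) & ~127u; }

// Offset of a query's row inside a flat score buffer with the given row stride.
constexpr std::uint64_t row_offset(std::uint32_t stride, std::uint32_t query_ix)
{
  return std::uint64_t(stride) * query_ix;
}

// Illustrative value for cluster_offsets[n_probes] (assumption, not from the commit).
constexpr std::uint32_t total_samples = 300;

// Stride assumed on the host side: rounded up to a multiple of 128.
static_assert(round_up_128(total_samples) == 384, "300 rounds up to 384");

// Without the round-up, the kernel would place query 2 at element 600 ...
static_assert(row_offset(total_samples, 2) == 600, "unrounded stride");
// ... while the host-side layout expects it at element 768.
static_assert(row_offset(round_up_128(total_samples), 2) == 768, "rounded stride");

// Runtime check that a kernel-side stride agrees with the assumed host-side one.
inline void check_stride(std::uint32_t kernel_stride, std::uint32_t samples)
{
  assert(kernel_stride == round_up_128(samples));
}
```

In this sketch the two strides agree only when the sample count is already a multiple of 128, so with the unrounded stride every query after the first would read and write at shifted positions.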

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #891
achirkin authored Oct 5, 2022
1 parent 55953f3 commit 05de844
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions cpp/include/raft/spatial/knn/detail/ivf_pq_search.cuh
@@ -646,7 +646,7 @@ __launch_bounds__(1024) __global__
out_indices = _out_indices + topk * (probe_ix + (n_probes * query_ix));
} else {
// Store all calculated distances to out_scores
- auto max_samples = cluster_offsets[n_probes];
+ auto max_samples = Pow2<128>::roundUp(cluster_offsets[n_probes]);
out_scores = _out_scores + max_samples * query_ix;
}
uint32_t label = cluster_labels[n_probes * query_ix + probe_ix];
@@ -741,7 +741,7 @@ __launch_bounds__(1024) __global__
__syncthreads();
} else {
// fill in the rest of the out_scores with dummy values
- uint32_t max_samples = uint32_t(cluster_offsets[n_probes]);
+ uint32_t max_samples = uint32_t(Pow2<128>::roundUp(cluster_offsets[n_probes]));
if (probe_ix + 1 == n_probes) {
for (uint32_t i = threadIdx.x + sample_offset + n_samples; i < max_samples;
i += blockDim.x) {
