Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ivf-pq performance tweaks #926

Merged
merged 34 commits into from
Nov 17, 2022

Conversation

achirkin
Copy link
Contributor

@achirkin achirkin commented Oct 19, 2022

A few optimizations to the ivfpq_compute_similarity_kernel:

  • Overhauled the way shmem/L1 carveout is selected
  • Introduced the block size selection logic based on the shmem/L1 split, occupancy, and the estimated cluster probes co-residency
  • Ported a new warp-sort module (warp_sort_distributed)
  • Transposed pq_centers to make loads coalesced
  • Changed layout of pq_dataset to make loads coalesced and vectorized
  • Optimized the loops to minimize ALU load

@github-actions github-actions bot added the cpp label Oct 19, 2022
@achirkin achirkin added 2 - In Progress Currenty a work in progress improvement Improvement / enhancement to an existing function non-breaking Non-breaking change cpp and removed cpp labels Oct 19, 2022
@achirkin achirkin added 3 - Ready for Review and removed 2 - In Progress Currenty a work in progress labels Oct 27, 2022
@achirkin achirkin marked this pull request as ready for review October 27, 2022 06:20
@achirkin achirkin requested a review from a team as a code owner October 27, 2022 06:20
tfeher added a commit to tfeher/raft that referenced this pull request Nov 7, 2022
@achirkin achirkin requested a review from tfeher November 10, 2022 17:08
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Artem, here is my second batch of comments, focusing on the search part. Overall it looks good, I really appreciate the developer notes you have added to explain the changes.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my last batch of comments, focusing on the build part. Please improve the description about the code that maps the data into the new layout of pq_dataset, otherwise it looks good.

tfeher added a commit to tfeher/raft that referenced this pull request Nov 11, 2022
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thansk Artem for addressing the issues! The PR looks good to me!

@achirkin
Copy link
Contributor Author

Note, I've updated the ivf_pq::build code to allow host-side input and training data. This is needed for training on datasets that do not fit into device memory.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look at the latest changes that add support for using dataset on the host. It would have been better to dedicate a separate PR for that, but otherwise it looks good to me.

This PR introduces alignment into the cluster sizes and thus the total
index size exceeds `n_rows`.
@achirkin achirkin requested a review from a team as a code owner November 16, 2022 06:36
@achirkin achirkin force-pushed the enh-ivf-pq-perf-tweaks branch from e80a345 to 56d6d5b Compare November 17, 2022 10:37
@achirkin
Copy link
Contributor Author

Note, there was an actual bug in ivf_pq_search.cuh even before this PR, which was only exposed now due to the changed pq_data layout, and only in @tfeher's python tests. In the two commits since the last review, I've added extra test cases and checks for this and similar bugs (incorrect indices returned in the search post-processing step).

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cjnolet
Copy link
Member

cjnolet commented Nov 17, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit e06b156 into rapidsai:branch-22.12 Nov 17, 2022
@achirkin achirkin deleted the enh-ivf-pq-perf-tweaks branch November 17, 2022 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review breaking Breaking change cpp improvement Improvement / enhancement to an existing function python
Projects
Development

Successfully merging this pull request may close these issues.

3 participants