ivf-pq performance tweaks #926
1. Change the layout of `pq_centers` to facilitate coalesced access during search.
2. Optimize the arithmetic in `ivfpq_compute_score` and `ivfpq_compute_similarity_kernel`.
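To illustrate why a layout change can make accesses coalesced, here is a minimal host-side sketch. It is not the layout used in this PR: the function names, the two-dimensional `[center][component]` shape, and the plain transposition are all assumptions made for illustration. The point is only that when neighbouring threads read neighbouring centers at the same component, the transposed layout places those elements next to each other in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major layout: centers[center][component].
// Neighbouring centers are `len` elements apart.
inline std::size_t row_major_index(std::size_t center, std::size_t component, std::size_t len)
{
  return center * len + component;
}

// Hypothetical transposed layout: centers[component][center].
// Neighbouring centers are exactly one element apart.
inline std::size_t transposed_index(std::size_t center,
                                    std::size_t component,
                                    std::size_t n_centers)
{
  return component * n_centers + center;
}

// Copy a row-major table of centers into the transposed layout.
inline std::vector<float> transpose_centers(const std::vector<float>& src,
                                            std::size_t n_centers,
                                            std::size_t len)
{
  assert(src.size() == n_centers * len);
  std::vector<float> dst(src.size());
  for (std::size_t c = 0; c < n_centers; ++c) {
    for (std::size_t j = 0; j < len; ++j) {
      dst[transposed_index(c, j, n_centers)] = src[row_major_index(c, j, len)];
    }
  }
  return dst;
}
```

If thread `t` of a warp reads center `t` at a fixed component, the row-major index advances by `len` per thread while the transposed index advances by one, so the whole warp touches one contiguous segment.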
…imize ALU usage in the main kernel
… extra conversions when possible
Hi Artem, here is my second batch of comments, focusing on the search part. Overall it looks good; I really appreciate the developer notes you have added to explain the changes.
This is my last batch of comments, focusing on the build part. Please improve the description of the code that maps the data into the new layout of `pq_dataset`; otherwise it looks good.
Thanks, Artem, for addressing the issues! The PR looks good to me!
Note, I've updated the |
I had a look at the latest changes that add support for using a dataset on the host. It would have been better to dedicate a separate PR to that, but otherwise it looks good to me.
This PR introduces alignment into the cluster sizes and thus the total index size exceeds `n_rows`.
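The effect of padding on the index size can be sketched with a few lines of arithmetic. The helper names and the alignment value used in the example are assumptions for illustration only; the actual alignment used by the index is not stated in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round x up to the nearest multiple of align (align must be positive).
inline std::uint64_t round_up(std::uint64_t x, std::uint64_t align)
{
  assert(align > 0);
  return (x + align - 1) / align * align;
}

// Total number of record slots when every cluster is padded
// to a multiple of `align` (a hypothetical parameter).
inline std::uint64_t padded_index_size(const std::vector<std::uint32_t>& cluster_sizes,
                                       std::uint64_t align)
{
  std::uint64_t total = 0;
  for (std::uint32_t s : cluster_sizes) {
    total += round_up(s, align);
  }
  return total;
}
```

For example, clusters of 5, 32, and 33 rows hold 70 rows in total, but with an assumed alignment of 32 they occupy 32 + 32 + 64 = 128 slots, so any code that assumes the index size equals `n_rows` has to be adjusted.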
…exactly the size of a cluster
e80a345 to 56d6d5b
Note, there was an actual bug in |
LGTM
@gpucibot merge
A few optimizations to the `ivfpq_compute_similarity_kernel`:
- … (`warp_sort_distributed`)
- Change the layout of `pq_centers` to make loads coalesced
- Change the layout of `pq_dataset` to make loads coalesced and vectorized
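One way a layout can be both coalesced and vectorized is to interleave fixed-size chunks of neighbouring rows. The sketch below is an assumption-laden illustration, not a description of the code in this PR: the constants `kGroupSize` and `kVecLen`, the function name, and the exact ordering are hypothetical. Each row's encoded bytes are split into 16-byte chunks, and the same chunk of 32 consecutive rows is stored back to back.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kGroupSize = 32;  // hypothetical: rows interleaved together (one per lane)
constexpr std::size_t kVecLen    = 16;  // hypothetical: bytes fetched by one vectorized load

// Flat byte offset of byte `byte` of row `row` in the interleaved layout
// [group][chunk][lane][kVecLen], where bytes_per_row is a multiple of kVecLen.
inline std::size_t interleaved_offset(std::size_t row,
                                      std::size_t byte,
                                      std::size_t bytes_per_row)
{
  assert(bytes_per_row % kVecLen == 0);
  assert(byte < bytes_per_row);
  const std::size_t chunks_per_row = bytes_per_row / kVecLen;
  const std::size_t group          = row / kGroupSize;  // which block of 32 rows
  const std::size_t lane           = row % kGroupSize;  // position inside the block
  const std::size_t chunk          = byte / kVecLen;    // which 16-byte chunk of the row
  const std::size_t within         = byte % kVecLen;    // byte inside that chunk
  return ((group * chunks_per_row + chunk) * kGroupSize + lane) * kVecLen + within;
}
```

Under these assumptions, lane `l` reading chunk `c` issues one 16-byte load at an offset 16 bytes after lane `l - 1`, so 32 lanes together cover 512 contiguous bytes. This layout also only works when the number of rows per cluster is padded to a multiple of the group size.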