Is your feature request related to a problem? Please describe.
During IVF-Flat search a query vector is compared to all the vectors from n_probes clusters, and we have n_queries * n_probes query-probe pairs. For large batch search, when n_queries * n_probes > n_clusters, there will be clusters that are compared to more than one query vector.

The execution time of IVF-Flat is determined by the time to load the clusters from memory. Currently the query-probe pairs are sorted according to query index. To improve memory load time, we can sort the query-probe pairs according to the probe id (cluster label).
Describe the solution you'd like
Sort the query-probe pairs during fine search for better cache reuse. This is already implemented for IVF-PQ, and the same can be applied for IVF-Flat as well:
raft/cpp/include/raft/neighbors/detail/ivf_pq_search.cuh
Lines 529 to 569 in 7342980
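To make the idea concrete, here is a minimal host-side sketch of the reordering. It is not the RAFT code referenced above: the struct `QueryProbePair` and the functions `make_pairs`, `sort_by_cluster` and `count_cluster_runs` are hypothetical names used only for illustration, and a real GPU implementation would sort on the device. The run count is a rough proxy for cluster loads, assuming only the most recently loaded cluster stays in cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One (query, probe) pair: query `query_id` must scan cluster `cluster_id`.
struct QueryProbePair {
  std::uint32_t query_id;
  std::uint32_t cluster_id;
};

// Flatten the coarse-search output (row-major [n_queries, n_probes] cluster
// labels) into pairs. The result is ordered by query index.
std::vector<QueryProbePair> make_pairs(const std::vector<std::uint32_t>& probe_labels,
                                       std::uint32_t n_queries,
                                       std::uint32_t n_probes)
{
  assert(probe_labels.size() == static_cast<std::size_t>(n_queries) * n_probes);
  std::vector<QueryProbePair> pairs;
  pairs.reserve(probe_labels.size());
  for (std::uint32_t q = 0; q < n_queries; ++q) {
    for (std::uint32_t p = 0; p < n_probes; ++p) {
      pairs.push_back({q, probe_labels[static_cast<std::size_t>(q) * n_probes + p]});
    }
  }
  return pairs;
}

// Reorder the pairs by cluster id. The stable sort keeps the query order
// within each cluster.
void sort_by_cluster(std::vector<QueryProbePair>& pairs)
{
  std::stable_sort(pairs.begin(), pairs.end(),
                   [](const QueryProbePair& a, const QueryProbePair& b) {
                     return a.cluster_id < b.cluster_id;
                   });
}

// Count runs of consecutive pairs that share a cluster id. Each run stands
// for one cluster load under the caching assumption stated above.
std::size_t count_cluster_runs(const std::vector<QueryProbePair>& pairs)
{
  std::size_t runs = pairs.empty() ? 0 : 1;
  for (std::size_t i = 1; i < pairs.size(); ++i) {
    if (pairs[i].cluster_id != pairs[i - 1].cluster_id) { ++runs; }
  }
  return runs;
}
```

For 3 queries with 2 probes each and labels `{0, 1, 1, 2, 0, 1}`, the query-ordered list has 5 runs, while the cluster-ordered list has 3, one per distinct cluster.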
Additional context
In IVF-Flat search we typically search 0.1-1% of the clusters, so this optimization is expected to help when the batch size is correspondingly large (hundreds or thousands of query vectors). We have a helper utility to calculate the expected number of times a cluster is loaded. This can be used to decide whether to sort the input data or not.
raft/cpp/include/raft/neighbors/detail/ivf_pq_search.cuh
Line 396 in 7342980
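As a rough illustration of such a decision rule, the sketch below estimates how many queries probe a given cluster. It is not the RAFT helper referenced above: `expected_queries_per_cluster` and `should_sort_by_cluster` are hypothetical names, the estimate assumes that every query picks its n_probes distinct clusters uniformly at random (real data is usually skewed), and the default threshold of 1.0 simply mirrors the n_queries * n_probes > n_clusters condition from the problem description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the RAFT utility. Under the uniform-probing
// assumption, a given cluster is selected by one query with probability
// n_probes / n_clusters, so the expected number of queries that probe it
// is n_queries * n_probes / n_clusters.
double expected_queries_per_cluster(std::uint32_t n_queries,
                                    std::uint32_t n_probes,
                                    std::uint32_t n_clusters)
{
  assert(n_clusters > 0 && n_probes <= n_clusters);
  return static_cast<double>(n_queries) * static_cast<double>(n_probes) /
         static_cast<double>(n_clusters);
}

// Sort by cluster only when clusters are expected to be reused. The
// threshold is an assumed tuning knob, not a measured value.
bool should_sort_by_cluster(std::uint32_t n_queries,
                            std::uint32_t n_probes,
                            std::uint32_t n_clusters,
                            double threshold = 1.0)
{
  return expected_queries_per_cluster(n_queries, n_probes, n_clusters) > threshold;
}
```

With 1000 clusters and 10 probes (1% of the clusters), a batch of 1000 queries gives an expected reuse of 10 per cluster, so sorting looks worthwhile. A batch of 10 queries gives 0.1, and the sort overhead is unlikely to pay off.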