forked from rapidsai/raft
Add random subsampling for IVF methods (rapidsai#2077)
While building IVF-Flat or IVF-PQ indices, we usually subsample the dataset to create a smaller training set for k-means clustering. Until now this subsampling was done with a fixed stride; this PR changes it to random subsampling. The input is always randomized, even if all the vectors of the dataset are used.

Random sampling adds an overhead that is proportional to the training set size and small compared to k-means training. If the dataset is on the host, this overhead can be partially or completely masked by the H2D transfer. To completely overlap random sampling of the data with H2D copies, we utilize OpenMP parallelization to increase the effective bandwidth for gathering the data.

Authors:
- Tamas Bela Feher (https://github.com/tfeher)

Approvers:
- Artem M. Chirkin (https://github.com/achirkin)
- Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#2077
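The idea described in this message can be illustrated with a small host-side sketch. It is not the code from this PR: the function names `sample_row_indices` and `gather_rows` are hypothetical, the partial Fisher-Yates shuffle is just one possible way to draw distinct random rows, and the H2D copy itself is left out. It only shows how a random selection of rows can be gathered into a contiguous training buffer, with an OpenMP loop spreading the gather over several threads.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not a RAFT API): pick n_train distinct row indices
// uniformly at random using a partial Fisher-Yates shuffle. If n_train equals
// n_rows, the result is a full random permutation, so the input order is
// randomized even when every vector is used.
std::vector<std::size_t> sample_row_indices(std::size_t n_rows,
                                            std::size_t n_train,
                                            std::uint64_t seed)
{
  n_train = std::min(n_train, n_rows);
  std::vector<std::size_t> idx(n_rows);
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::mt19937_64 rng(seed);
  for (std::size_t i = 0; i < n_train; ++i) {
    // Swap position i with a random position in [i, n_rows - 1].
    std::uniform_int_distribution<std::size_t> pick(i, n_rows - 1);
    std::swap(idx[i], idx[pick(rng)]);
  }
  idx.resize(n_train);
  return idx;
}

// Hypothetical helper (not a RAFT API): gather the selected rows of a
// row-major host dataset into a contiguous training buffer. The pragma is
// ignored when the code is compiled without OpenMP support; with it, the
// rows are copied by several threads in parallel.
std::vector<float> gather_rows(const std::vector<float>& dataset,
                               std::size_t dim,
                               const std::vector<std::size_t>& idx)
{
  std::vector<float> out(idx.size() * dim);
  const float* src       = dataset.data();
  float* dst             = out.data();
  const std::int64_t n   = static_cast<std::int64_t>(idx.size());
#pragma omp parallel for
  for (std::int64_t i = 0; i < n; ++i) {
    const std::size_t row = idx[static_cast<std::size_t>(i)];
    std::copy_n(src + row * dim, dim, dst + static_cast<std::size_t>(i) * dim);
  }
  return out;
}
```

In a real pipeline the gathered buffer would then be copied to the device in chunks, so that gathering the next chunk on the host can proceed while the previous one is in flight; that overlap is what lets the sampling overhead be masked by the transfer.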
Showing 5 changed files with 171 additions and 122 deletions.