forked from rapidsai/raft
Improve analysis experience for ANN benchmarks (rapidsai#2139)
A few improvements to the benchmark executables aimed at improving the profiling/analysis experience with tools like Nsight Systems and possibly reducing benchmark overheads.

- Reduce the number of spawned streams:
  1. Invert the stream-passing API: the GPU wrappers implement the `AnnGPU` interface to provide the synchronization stream; `search` and `build` no longer take the stream as an argument, which simplifies CPU-only implementations.
  2. Create a global pool of streams, which are not deleted between benchmark cases. The algo wrapper may opt in to use these. As a result, the `nsys` timeline is much less cluttered.
  3. `cuda_timer` no longer creates a new stream; it pushes the synchronization events directly to the streams provided by `AnnGPU`. This slightly reduces the number of CUDA driver calls and thus reduces mutex contention in the throughput mode.
- ~~Make raft algorithms use local memory workspace resources per-thread. This removes the mutex congestion that happened when many threads in the throughput mode tried to use rmm's pool allocator at the same time.~~ This one is removed/postponed because it could provoke OOM errors due to memory pools competing for the device memory.
- Add fp16 (half) support to the benchmark executables (currently, only IVF-PQ and CAGRA support dataset/queries in fp16).
- Add a `progress_barrier` RAII struct, so that the search throughput benchmark is more robust: it doesn't deadlock on exceptions.
- Add a `rmm::mr::failure_callback_resource_adaptor` to wrap OOM errors with `raft::exception`. This essentially adds the backtrace to the reported errors to make debugging easier.

Authors:
- Artem M. Chirkin (https://github.com/achirkin)

Approvers:
- Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#2139
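To illustrate why an RAII struct helps with the deadlock mentioned in the `progress_barrier` item, here is a minimal sketch of the general pattern. It is not the code from this commit: `completion_counter`, `arrival_guard`, and `run_workers` are hypothetical names invented for the example, and the real `progress_barrier` may be structured quite differently. The idea shown is only that a destructor runs during stack unwinding, so a worker that throws still signals its peers instead of leaving them waiting forever.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <exception>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical shared state: tracks how many workers still have to finish.
class completion_counter {
 public:
  explicit completion_counter(std::size_t expected) : remaining_(expected) {}

  // Called once per worker, whether it succeeded or failed.
  void arrive()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (remaining_ > 0) { --remaining_; }
    if (remaining_ == 0) { cv_.notify_all(); }
  }

  // Blocks until every worker has arrived.
  void wait()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return remaining_ == 0; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::size_t remaining_;
};

// Hypothetical RAII guard: the destructor also runs while an exception
// unwinds the stack, so a throwing worker still reports its arrival.
class arrival_guard {
 public:
  explicit arrival_guard(completion_counter& counter) : counter_(counter) {}
  ~arrival_guard() noexcept { counter_.arrive(); }
  arrival_guard(const arrival_guard&)            = delete;
  arrival_guard& operator=(const arrival_guard&) = delete;

 private:
  completion_counter& counter_;
};

// Runs `n_threads` workers; the worker whose index equals `failing` throws.
// Returns the number of failed workers. It can only return if no worker
// is left waiting for a peer that never arrives.
std::size_t run_workers(std::size_t n_threads, std::size_t failing)
{
  completion_counter counter(n_threads);
  std::atomic<std::size_t> failures{0};
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < n_threads; ++i) {
    threads.emplace_back([&counter, &failures, i, failing] {
      try {
        arrival_guard guard(counter);
        if (i == failing) { throw std::runtime_error("simulated search failure"); }
        // A real benchmark would run its search iterations here.
      } catch (const std::exception&) {
        failures.fetch_add(1);
      }
      // Every worker waits for its peers before leaving.
      counter.wait();
    });
  }
  for (auto& t : threads) { t.join(); }
  return failures.load();
}
```

Calling `run_workers(4, 1)` simulates four throughput threads where one fails; the call returns because the failing thread's guard still decrements the counter. If `arrive()` were instead called manually at the end of the `try` block, the failing thread would skip it and the others would block in `wait()`.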
Showing 21 changed files with 583 additions and 406 deletions.