Re-introduce CAGRA template instantiations to reduce compile time #1443

Closed
Tracked by #1392
tfeher opened this issue Apr 20, 2023 · 0 comments · Fixed by #1650

tfeher commented Apr 20, 2023

No description provided.

@tfeher tfeher mentioned this issue Apr 20, 2023
16 tasks
@tfeher tfeher self-assigned this Apr 20, 2023
@tfeher tfeher changed the title from "Re-introduce template instantiations to reduce compile time (were removed here: 358c09c and here 4afb03e)" to "Re-introduce CAGRA template instantiations to reduce compile time" Apr 20, 2023
rapids-bot bot pushed a commit that referenced this issue Jul 19, 2023
CAGRA was introduced as a header-only implementation in #1375. This PR adds precompiled single- and multi-CTA search kernels to libraft.so.

The single- and multi-CTA search kernels were moved to separate header files to make it easier to specify extern template instantiations for them.
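
For readers unfamiliar with the mechanism, here is a minimal sketch of how an extern template declaration pairs with a single explicit instantiation. It is plain C++ rather than CUDA, and `sum_distances` is a hypothetical stand-in, not a function from RAFT. The comments mark which parts would normally live in a header and which in one source file of the library.

```cpp
#include <vector>

// ---- would live in a header (hypothetical example, not RAFT code) ----
template <typename T>
T sum_distances(const std::vector<T>& d)
{
  T acc{};
  for (const T& x : d) { acc += x; }
  return acc;
}

// Explicit instantiation declaration: translation units that include the
// header are told not to instantiate sum_distances<float> implicitly.
extern template float sum_distances<float>(const std::vector<float>&);

// ---- would live in exactly one source file compiled into the library ----
// Explicit instantiation definition: the one place where code is generated.
template float sum_distances<float>(const std::vector<float>&);
```

Callers use `sum_distances<float>(values)` as usual. The compile-time saving comes from each including translation unit skipping the instantiation and linking against the precompiled one.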

The macros for dispatching the kernels were replaced by functions. We define explicit instantiations for the top-level dispatch functions. (This is in contrast to #1428, where the kernels themselves were instantiated, which resulted in a large number of parameter combinations that had to be explicitly spelled out.)
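
The difference between instantiating kernels and instantiating the dispatch function can be illustrated with a small sketch. It is plain C++, and all names and parameter values (`kernel_stub`, `dispatch_search`, the team sizes and dimension limits) are hypothetical, chosen only to show the pattern. It is not the code from the PR.

```cpp
#include <cstdint>
#include <stdexcept>

// Stand-in for a kernel templated on compile-time tuning parameters.
// It returns a tag encoding which variant ran, so the dispatch is observable.
template <unsigned TeamSize, unsigned MaxDim, typename T>
std::uint32_t kernel_stub(const T* data, std::uint32_t n)
{
  (void)data;
  (void)n;
  return TeamSize * 1000u + MaxDim;
}

// Top-level dispatch function (takes the place of a dispatch macro):
// maps a runtime dimension to one of the compile-time kernel variants.
template <typename T>
std::uint32_t dispatch_search(const T* data, std::uint32_t n, std::uint32_t dim)
{
  if (dim <= 128) { return kernel_stub<8, 128, T>(data, n); }
  if (dim <= 256) { return kernel_stub<16, 256, T>(data, n); }
  if (dim <= 512) { return kernel_stub<32, 512, T>(data, n); }
  throw std::invalid_argument("unsupported dimension");
}

// Header side: suppress implicit instantiation in including translation units.
extern template std::uint32_t dispatch_search<float>(const float*, std::uint32_t, std::uint32_t);

// Library side: one explicit instantiation per data type. The three
// kernel_stub variants it calls are instantiated implicitly along with it.
template std::uint32_t dispatch_search<float>(const float*, std::uint32_t, std::uint32_t);
```

With this layout, adding a kernel variant only touches the dispatch function. Instantiating the kernels directly would instead require one explicit instantiation line per parameter combination.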

This PR fixes #1443.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1650