Remove specializations and split expensive headers #1415

Closed
wants to merge 91 commits into from
Changes from 81 commits
65ec2f1
Move headers to -inl path
ahendriksen Apr 14, 2023
51e8673
Add back empty headers
ahendriksen Apr 14, 2023
9e03fca
Disable warnings for clang compilation
ahendriksen Apr 3, 2023
3bace7b
Comment out omp to enable clang compilation
ahendriksen Apr 3, 2023
f0e74a2
cmake: Define RAFT_EXPLICIT_INCLUDE
ahendriksen Apr 3, 2023
a8f62fb
Split raft/core/logger
ahendriksen Apr 3, 2023
db53bde
Split raft/spatial/knn/detail/ball_cover/registers.cuh
ahendriksen Apr 4, 2023
4d26ca9
Split memory_pool, fused_l2_knn, coalesced_reduction, selection_faiss
ahendriksen Apr 14, 2023
2aebe4b
Split ivf_flat_search and interleaved_scan
ahendriksen Apr 6, 2023
16a828f
Pin `dask` and `distributed` for release (#1399)
galipremsagar Apr 6, 2023
66b8493
CAGRA (#1375)
tfeher Apr 6, 2023
5ff134a
Add select_k source files
ahendriksen Apr 6, 2023
698ef53
Split distance.cuh
ahendriksen Apr 11, 2023
50a8fe7
Clean up cmake
ahendriksen Apr 11, 2023
97fa949
Revert "Comment out omp to enable clang compilation"
ahendriksen Apr 11, 2023
ca1ed72
Fix tests
ahendriksen Apr 12, 2023
96894b2
Ensure uniformity of dispatch headers
ahendriksen Apr 12, 2023
2d9dc21
WIP: update docs
ahendriksen Apr 12, 2023
31def15
DOC
raydouglass Mar 23, 2023
e477dbe
Update rapids version
vyasr Mar 30, 2023
dd5cdd3
Update pylibraft version
vyasr Mar 30, 2023
d8c85e7
Run dfg
vyasr Mar 30, 2023
bdcbfcf
Fix dask versions in wheel build preinstallation
vyasr Mar 30, 2023
30b777c
Fix ucx-py pin in raft-dask recipe (#1396)
vyasr Mar 31, 2023
6ad9d95
Have consistent compile lines between BUILD_TESTS enabled or not (#1401)
robertmaynard Apr 7, 2023
c09993b
Remove uses-setup-env-vars (#1406)
vyasr Apr 10, 2023
8bd64a0
Split ivf-pq, fused l2 nn
ahendriksen Apr 12, 2023
1a5c588
Fix tests
ahendriksen Apr 13, 2023
56a4c4a
Move raft_runtime src files
ahendriksen Apr 13, 2023
cb80db6
Split ivf_flat
ahendriksen Apr 13, 2023
d528f4b
Split refine
ahendriksen Apr 13, 2023
06e1b6f
Remove unused fused l2 knn files
ahendriksen Apr 13, 2023
4a869e6
Split ball cover
ahendriksen Apr 13, 2023
89ac806
Remove unused files
ahendriksen Apr 13, 2023
bf6f203
Split brute force knn
ahendriksen Apr 13, 2023
aae0dfa
Remove deprecated specialization files
ahendriksen Apr 13, 2023
6196a2b
Remove remaining specializations
ahendriksen Apr 13, 2023
c72e70a
Sort src files in CMakeLists.txt
ahendriksen Apr 14, 2023
c37b73e
Document RAFT_EXPLICIT
ahendriksen Apr 14, 2023
75c076b
Undo custom RMM
ahendriksen Apr 14, 2023
e97e0bd
DOC
raydouglass Mar 23, 2023
bcadeae
Update pylibraft version
vyasr Mar 30, 2023
c767b87
Run dfg
vyasr Mar 30, 2023
30d09b1
Pin `dask` and `distributed` for release (#1399)
galipremsagar Apr 6, 2023
facb5d2
Have consistent compile lines between BUILD_TESTS enabled or not (#1401)
robertmaynard Apr 7, 2023
d3626f9
Generate build metrics report for test and benchmarks (#1414)
divyegala Apr 13, 2023
7097720
Fix IVF-PQ API to use `device_vector_view` (#1384)
lowener Apr 13, 2023
a7a46ca
Adding base header-only conda package without cuda math libs (#1386)
cjnolet Apr 13, 2023
38d276e
Fix style
ahendriksen Apr 14, 2023
b9ba602
Fix style
ahendriksen Apr 14, 2023
7ca2242
Remove greppable-id comments
ahendriksen Apr 14, 2023
00db48f
Add note to modify generating python script
ahendriksen Apr 14, 2023
ff1b955
Split selection_faiss source file
ahendriksen Apr 14, 2023
dfe860e
Update docs
ahendriksen Apr 14, 2023
1ee301e
Replace specialization with instantiation
ahendriksen Apr 14, 2023
4879514
Fix ivf benchmarks
ahendriksen Apr 14, 2023
7461085
Remove preprocessor logic from distance test
ahendriksen Apr 14, 2023
32fb40b
Fix benchmarks
ahendriksen Apr 14, 2023
7ff4fad
Add ALL_BENCH CMake target
ahendriksen Apr 14, 2023
5c03804
Fix non-standard instantiation
ahendriksen Apr 14, 2023
018910d
Fix docs build
ahendriksen Apr 17, 2023
8158fa6
Add back specialization headers
ahendriksen Apr 17, 2023
bfde580
Undo change to conda environment
ahendriksen Apr 17, 2023
858b46d
Revert ConfigureCUDA.cmake
ahendriksen Apr 17, 2023
651bdd7
Rename *-types.cuh to *_types.cuh
ahendriksen Apr 17, 2023
a90400d
Fix doxygen errors
ahendriksen Apr 17, 2023
076a145
Revert "Undo change to conda environment"
ahendriksen Apr 17, 2023
dffb552
Undo changes to python dependencies
ahendriksen Apr 17, 2023
070a385
Update cpp/include/raft/neighbors/ivf_flat-inl.cuh
ahendriksen Apr 18, 2023
35ed4df
Update cpp/include/raft/neighbors/ivf_flat-inl.cuh
ahendriksen Apr 18, 2023
65c1cba
Update cpp/include/raft/neighbors/ivf_flat-ext.cuh
ahendriksen Apr 18, 2023
3ea52b8
Address review re: distance API
ahendriksen Apr 18, 2023
f8788ef
Use ARC V2 self-hosted runners for GPU jobs (#1410)
jjacobelli Apr 17, 2023
5c179e6
Fix is_min_close (#1419)
benfred Apr 17, 2023
edb1c5c
IVF-PQ: manipulating individual lists (#1298)
achirkin Apr 17, 2023
5e84daa
Add python bindings for matrix::select_k (#1422)
benfred Apr 17, 2023
9906aba
Fixup yaml and toml files
ahendriksen Apr 18, 2023
1299b23
Fixup ivpq and ivf_flat files
ahendriksen Apr 18, 2023
868ada3
Merge remote-tracking branch 'rapids/branch-23.06' into enh-inl-headers
ahendriksen Apr 18, 2023
53f23fb
Fix style
ahendriksen Apr 18, 2023
c684a15
distance: Add back rbf_fin_op instances
ahendriksen Apr 18, 2023
6a6b1e5
Remove unused code
ahendriksen Apr 19, 2023
486c2e9
Remove spurious macros from brute_force instances
ahendriksen Apr 19, 2023
fe6d335
Add documentation for rbf_fin_op
ahendriksen Apr 19, 2023
be030b8
Add documentation for split header structure
ahendriksen Apr 19, 2023
eb0d3b2
Rename RAFT_EXPLICIT_INSTANTIATE => RAFT_EXPLICIT_INSTANTIATE_ONLY
ahendriksen Apr 19, 2023
0b9b796
Remove docstrings from -ext headers
ahendriksen Apr 19, 2023
e8fc2da
Fix extraneous and missing includes in headers
ahendriksen Apr 19, 2023
488a273
Add tests to ensure headers compile in isolation
ahendriksen Apr 19, 2023
f37c0e7
Add macro tables to developer guide
ahendriksen Apr 19, 2023
1b899f9
Merge remote-tracking branch 'rapids/branch-23.06' into enh-inl-headers
ahendriksen Apr 19, 2023
14 changes: 7 additions & 7 deletions README.md
@@ -198,7 +198,7 @@ RAFT itself can be installed through conda, [CMake Package Manager (CPM)](https:

The easiest way to install RAFT is through conda and several packages are provided.
- `libraft-headers` RAFT headers
- `libraft` (optional) shared library of pre-compiled template specializations and runtime APIs.
- `libraft` (optional) shared library of pre-compiled template instantiations and runtime APIs.
- `pylibraft` (optional) Python wrappers around RAFT algorithms and primitives.
- `raft-dask` (optional) enables deployment of multi-node multi-GPU algorithms that use RAFT `raft::comms` in Dask clusters.

@@ -231,11 +231,11 @@ You can find an [example RAFT](cpp/template/README.md) project template in the `

Additional CMake targets can be made available by adding components in the table below to the `RAFT_COMPONENTS` list above, separated by spaces. The `raft::raft` target will always be available. RAFT headers require, at a minimum, the CUDA toolkit libraries and RMM dependencies.

| Component | Target | Description | Base Dependencies |
|-------------|---------------------|-----------------------------------------------------------|---------------------------------------|
| n/a | `raft::raft` | Full RAFT header library | CUDA toolkit, RMM, NVTX, CCCL, CUTLASS |
| compiled | `raft::compiled` | Pre-compiled template specializations and runtime library | raft::raft |
| distributed | `raft::distributed` | Dependencies for `raft::comms` APIs | raft::raft, UCX, NCCL |
| Component | Target | Description | Base Dependencies |
|-------------|---------------------|----------------------------------------------------------|----------------------------------------|
| n/a | `raft::raft` | Full RAFT header library | CUDA toolkit, RMM, NVTX, CCCL, CUTLASS |
| compiled | `raft::compiled` | Pre-compiled template instantiations and runtime library | raft::raft |
| distributed | `raft::distributed` | Dependencies for `raft::comms` APIs | raft::raft, UCX, NCCL |

### Source

@@ -282,7 +282,7 @@ The folder structure mirrors other RAPIDS repos, with the following folders:
- `util`: Various reusable tools and utilities for accelerated algorithm development
- `internal`: A private header-only component that hosts the code shared between benchmarks and tests.
- `scripts`: Helpful scripts for development
- `src`: Compiled APIs and template specializations for the shared libraries
- `src`: Compiled APIs and template instantiations for the shared libraries
- `template`: A skeleton template containing the bare-bones file structure and cmake configuration for writing applications with RAFT.
- `test`: Googletests source code
- `docs`: Source code and scripts for building library documentation (Uses breath, doxygen, & pydocs)
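
The README hunks above replace "specializations" with "instantiations", and the two terms name different C++ features. As a minimal sketch (the `describe` template is hypothetical and not part of RAFT): an explicit specialization supplies a different definition for one type, while an explicit instantiation only forces the compiler to emit the primary template's code for a given type in the current translation unit.

```cpp
#include <cassert>
#include <string>

// Hypothetical primary template: one definition that works for any T.
template <typename T>
std::string describe(T)
{
  return "generic";
}

// Explicit specialization: replaces the definition for T = int.
template <>
std::string describe<int>(int)
{
  return "specialized for int";
}

// Explicit instantiation definition: keeps the primary template's behavior
// and only forces describe<double> to be compiled in this translation unit.
template std::string describe<double>(double);

// Small usage check covering the three cases.
inline void describe_self_check()
{
  assert(describe(1) == "specialized for int");  // the specialization is selected
  assert(describe(2.0) == "generic");            // explicitly instantiated
  assert(describe('c') == "generic");            // implicitly instantiated
}
```

A shared library that ships precompiled template code for a fixed set of types is doing the second thing, which is presumably why the wording was changed.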
301 changes: 128 additions & 173 deletions cpp/CMakeLists.txt

Large diffs are not rendered by default.

6 changes: 1 addition & 5 deletions cpp/bench/ann/src/raft/raft_benchmark.cu
@@ -22,10 +22,6 @@
#include <type_traits>
#include <utility>

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

#include "../common/ann_types.hpp"
#include "../common/benchmark_util.hpp"
#undef WARP_SIZE
@@ -220,4 +216,4 @@ std::unique_ptr<typename raft::bench::ann::ANN<T>::AnnSearchParam> create_search

#include "../common/benchmark.hpp"

int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); }
int main(int argc, char** argv) { return raft::bench::ann::run_main(argc, argv); }
6 changes: 1 addition & 5 deletions cpp/bench/ann/src/raft/raft_ivf_flat.cu
@@ -15,12 +15,8 @@
*/
#include "raft_ivf_flat_wrapper.h"

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

namespace raft::bench::ann {
template class RaftIvfFlatGpu<float, int64_t>;
template class RaftIvfFlatGpu<uint8_t, int64_t>;
template class RaftIvfFlatGpu<int8_t, int64_t>;
} // namespace raft::bench::ann
} // namespace raft::bench::ann
1 change: 1 addition & 0 deletions cpp/bench/ann/src/raft/raft_ivf_flat_wrapper.h
@@ -29,6 +29,7 @@
#include <raft/neighbors/ivf_flat_types.hpp>
#include <raft/util/cudart_utils.hpp>
#include <rmm/device_uvector.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>
#include <stdexcept>
#include <string>
#include <type_traits>
4 changes: 0 additions & 4 deletions cpp/bench/ann/src/raft/raft_ivf_pq.cu
@@ -15,10 +15,6 @@
*/
#include "raft_ivf_pq_wrapper.h"

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

namespace raft::bench::ann {
template class RaftIvfPQ<float, int64_t>;
template class RaftIvfPQ<uint8_t, int64_t>;
8 changes: 8 additions & 0 deletions cpp/bench/prims/CMakeLists.txt
@@ -54,6 +54,7 @@ function(ConfigureBench)
${BENCH_NAME} PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${RAFT_CXX_FLAGS}>"
"$<$<COMPILE_LANGUAGE:CUDA>:${RAFT_CUDA_FLAGS}>"
)
target_compile_definitions(${BENCH_NAME} PRIVATE "RAFT_EXPLICIT_INSTANTIATE")

target_include_directories(
${BENCH_NAME} PUBLIC "$<BUILD_INTERFACE:${RAFT_SOURCE_DIR}/bench/prims>"
@@ -140,4 +141,11 @@ if(BUILD_PRIMS_BENCH)
OPTIONAL
LIB
)

add_custom_target(ALL_BENCH)
add_dependencies(
ALL_BENCH CLUSTER_BENCH DISTANCE_BENCH LINALG_BENCH MATRIX_BENCH NEIGHBORS_BENCH RANDOM_BENCH
SPARSE_BENCH TUNE_DISTANCE
)

endif()
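
The `RAFT_EXPLICIT_INSTANTIATE` definition added to the benchmark targets above (renamed to `RAFT_EXPLICIT_INSTANTIATE_ONLY` in commit eb0d3b2) appears to make the benchmarks rely on precompiled template instances instead of instantiating templates from the headers. The following is a minimal sketch of the underlying C++ mechanism only, not RAFT's actual headers: `demo::scale` is a hypothetical function, and the three parts that would normally live in an "-inl" header, an "-ext" header, and one source file are shown in a single translation unit so the sketch compiles on its own.

```cpp
#include <cassert>

namespace demo {

// (1) What an "-inl" header might hold: the full template definition.
template <typename T>
T scale(T x, T factor)
{
  return x * factor;
}

// (2) What an "-ext" header might hold: an explicit instantiation
// declaration. It tells each including translation unit not to instantiate
// scale<float> itself, because exactly one source file provides it.
extern template float scale<float>(float, float);

// (3) What that single compiled source file might hold: the explicit
// instantiation definition that actually emits the code for scale<float>.
template float scale<float>(float, float);

}  // namespace demo

// Small usage check: the float call uses the explicitly instantiated
// function, while scale<int> is still instantiated implicitly on demand.
inline void scale_self_check()
{
  assert(demo::scale(2.0f, 4.0f) == 8.0f);
  assert(demo::scale(3, 5) == 15);
}
```

With this split, translation units that only see parts (1) and (2) skip compiling `scale<float>` and link against the one object file that contains part (3), which is where the compile-time savings would come from.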
4 changes: 0 additions & 4 deletions cpp/bench/prims/cluster/kmeans.cu
@@ -18,10 +18,6 @@
#include <raft/cluster/kmeans.cuh>
#include <raft/cluster/kmeans_types.hpp>

#if defined RAFT_COMPILED
#include <raft/cluster/specializations.cuh>
#endif

namespace raft::bench::cluster {

struct KMeansBenchParams {
4 changes: 0 additions & 4 deletions cpp/bench/prims/cluster/kmeans_balanced.cu
@@ -18,10 +18,6 @@
#include <raft/cluster/kmeans_balanced.cuh>
#include <raft/random/rng.cuh>

#if defined RAFT_COMPILED
#include <raft/cluster/specializations.cuh>
#endif

namespace raft::bench::cluster {

struct KMeansBalancedBenchParams {
3 changes: 0 additions & 3 deletions cpp/bench/prims/distance/distance_common.cuh
@@ -17,9 +17,6 @@
#include <common/benchmark.hpp>
#include <raft/distance/distance.cuh>
#include <raft/util/cudart_utils.hpp>
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif
#include <rmm/device_uvector.hpp>

namespace raft::bench::distance {
4 changes: 1 addition & 3 deletions cpp/bench/prims/distance/fused_l2_nn.cu
@@ -16,10 +16,8 @@

#include <common/benchmark.hpp>
#include <raft/distance/fused_l2_nn.cuh>
#include <raft/linalg/norm.cuh>
#include <raft/util/cudart_utils.hpp>
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif
#include <rmm/device_uvector.hpp>

namespace raft::bench::distance {
4 changes: 0 additions & 4 deletions cpp/bench/prims/distance/kernels.cu
@@ -13,10 +13,6 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif

#include <common/benchmark.hpp>
#include <memory>
#include <raft/core/device_resources.hpp>
14 changes: 6 additions & 8 deletions cpp/bench/prims/distance/masked_nn.cu
@@ -25,15 +25,12 @@
#include <raft/core/device_mdarray.hpp>
#include <raft/core/device_mdspan.hpp>
#include <raft/core/handle.hpp>
#include <raft/distance/detail/fused_l2_nn.cuh> // MinAndDistanceReduceOpImpl
#include <raft/distance/masked_nn.cuh>
#include <raft/linalg/norm.cuh>
#include <raft/random/rng.cuh>
#include <raft/util/cudart_utils.hpp>

#ifdef RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif

namespace raft::bench::distance::masked_nn {

// Introduce various sparsity patterns
@@ -95,8 +92,8 @@ struct masked_l2_nn : public fixture {
using DataT = T;
using IdxT = int;
using OutT = raft::KeyValuePair<IdxT, DataT>;
using RedOpT = raft::distance::MinAndDistanceReduceOp<int, DataT>;
using PairRedOpT = raft::distance::KVPMinReduce<int, DataT>;
using RedOpT = raft::distance::detail::MinAndDistanceReduceOpImpl<int, DataT>;
using PairRedOpT = raft::distance::detail::KVPMinReduceImpl<int, DataT>;
using ParamT = raft::distance::masked_l2_nn_params<RedOpT, PairRedOpT>;

// Parameters
@@ -126,8 +123,9 @@ struct masked_l2_nn : public fixture {
xn.data_handle(), x.data_handle(), p.k, p.m, raft::linalg::L2Norm, true, stream);
raft::linalg::rowNorm(
yn.data_handle(), y.data_handle(), p.k, p.n, raft::linalg::L2Norm, true, stream);
raft::distance::initialize<T, raft::KeyValuePair<int, T>, int>(
handle, out.data_handle(), p.m, std::numeric_limits<T>::max(), RedOpT{});
// Avoid instantiating raft::distance::initialize..
raft::distance::detail::initialize<T, raft::KeyValuePair<int, T>, int>(
out.data_handle(), p.m, std::numeric_limits<T>::max(), RedOpT{}, handle.get_stream());

dim3 block(32, 32);
dim3 grid(10, 10);
4 changes: 0 additions & 4 deletions cpp/bench/prims/matrix/select_k.cu
@@ -23,10 +23,6 @@
#include <raft/sparse/detail/utils.h>
#include <raft/util/cudart_utils.hpp>

#if defined RAFT_COMPILED
#include <raft/matrix/specializations.cuh>
#endif

#include <raft/matrix/detail/select_radix.cuh>
#include <raft/matrix/detail/select_warpsort.cuh>
#include <raft/matrix/select_k.cuh>
4 changes: 0 additions & 4 deletions cpp/bench/prims/neighbors/knn.cuh
@@ -24,10 +24,6 @@
#include <raft/neighbors/ivf_pq.cuh>
#include <raft/spatial/knn/knn.cuh>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

#include <rmm/mr/device/managed_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

5 changes: 0 additions & 5 deletions cpp/bench/prims/neighbors/refine_float_int64_t.cu
@@ -17,11 +17,6 @@
#include "refine.cuh"
#include <common/benchmark.hpp>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations/refine.cuh>
#include <raft/spatial/knn/specializations.cuh>
#endif

using namespace raft::neighbors;

namespace raft::bench::neighbors {
4 changes: 0 additions & 4 deletions cpp/bench/prims/neighbors/refine_uint8_t_int64_t.cu
@@ -17,10 +17,6 @@
#include "refine.cuh"
#include <common/benchmark.hpp>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

using namespace raft::neighbors;

namespace raft::bench::neighbors {
1 change: 1 addition & 0 deletions cpp/doxygen/Doxyfile
@@ -918,6 +918,7 @@ EXCLUDE_SYMLINKS = NO
# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories for example use the pattern */test/*

# TODO: remove specializations from exclude patterns when headers have been removed.
EXCLUDE_PATTERNS = */detail/* \
*/specializations/* \
*/thirdparty/*
1 change: 1 addition & 0 deletions cpp/include/raft/cluster/detail/kmeans_common.cuh
@@ -38,6 +38,7 @@
#include <raft/distance/distance.cuh>
#include <raft/distance/distance_types.hpp>
#include <raft/distance/fused_l2_nn.cuh>
#include <raft/linalg/norm.cuh>
#include <raft/linalg/reduce_rows_by_key.cuh>
#include <raft/linalg/unary_op.cuh>
#include <raft/matrix/gather.cuh>
14 changes: 6 additions & 8 deletions cpp/include/raft/cluster/specializations.cuh
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
* Copyright (c) 2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -13,12 +13,10 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef __CLUSTER_SPECIALIZATIONS_H
#define __CLUSTER_SPECIALIZATIONS_H

#pragma once

#include <raft/distance/specializations.cuh>
#include <raft/neighbors/specializations.cuh>

#endif
#pragma message( \
__FILE__ \
" is deprecated and will be removed." \
" Including specializations is not necessary any more." \
" For more information, see: https://docs.rapids.ai/api/raft/nightly/using_libraft.html")