[ENH] [FINAL] Header structure: combine all PRs into one (#1469)
This is a rebase of all the commits in PRs:
- #1437 
- #1438 
- #1439 
- #1440 
- #1441 

The original PRs have not been rebased to preserve review comments. This PR is up to date with branch 23.06.

Closes #1416

Authors:
  - Allard Hendriksen (https://github.com/ahendriksen)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1469
ahendriksen authored Apr 28, 2023
1 parent 082be6e commit fbce1a4
Showing 431 changed files with 18,575 additions and 12,748 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -203,7 +203,7 @@ RAFT itself can be installed through conda, [CMake Package Manager (CPM)](https:

The easiest way to install RAFT is through conda and several packages are provided.
- `libraft-headers` RAFT headers
- `libraft` (optional) shared library of pre-compiled template specializations and runtime APIs.
- `libraft` (optional) shared library of pre-compiled template instantiations and runtime APIs.
- `pylibraft` (optional) Python wrappers around RAFT algorithms and primitives.
- `raft-dask` (optional) enables deployment of multi-node multi-GPU algorithms that use RAFT `raft::comms` in Dask clusters.

@@ -236,11 +236,11 @@ You can find an [example RAFT](cpp/template/README.md) project template in the `

Additional CMake targets can be made available by adding components in the table below to the `RAFT_COMPONENTS` list above, separated by spaces. The `raft::raft` target will always be available. RAFT headers require, at a minimum, the CUDA toolkit libraries and RMM dependencies.

| Component | Target | Description | Base Dependencies |
|-------------|---------------------|-----------------------------------------------------------|---------------------------------------|
| n/a | `raft::raft` | Full RAFT header library | CUDA toolkit, RMM, NVTX, CCCL, CUTLASS |
| compiled | `raft::compiled` | Pre-compiled template specializations and runtime library | raft::raft |
| distributed | `raft::distributed` | Dependencies for `raft::comms` APIs | raft::raft, UCX, NCCL |
| Component | Target | Description | Base Dependencies |
|-------------|---------------------|----------------------------------------------------------|----------------------------------------|
| n/a | `raft::raft` | Full RAFT header library | CUDA toolkit, RMM, NVTX, CCCL, CUTLASS |
| compiled | `raft::compiled` | Pre-compiled template instantiations and runtime library | raft::raft |
| distributed | `raft::distributed` | Dependencies for `raft::comms` APIs | raft::raft, UCX, NCCL |

### Source

@@ -287,7 +287,7 @@ The folder structure mirrors other RAPIDS repos, with the following folders:
- `util`: Various reusable tools and utilities for accelerated algorithm development
- `internal`: A private header-only component that hosts the code shared between benchmarks and tests.
- `scripts`: Helpful scripts for development
- `src`: Compiled APIs and template specializations for the shared libraries
- `src`: Compiled APIs and template instantiations for the shared libraries
- `template`: A skeleton template containing the bare-bones file structure and cmake configuration for writing applications with RAFT.
- `test`: Googletests source code
- `docs`: Source code and scripts for building library documentation (Uses breathe, doxygen, & pydocs)
306 changes: 133 additions & 173 deletions cpp/CMakeLists.txt

Large diffs are not rendered by default.

4 changes: 0 additions & 4 deletions cpp/bench/ann/src/raft/raft_benchmark.cu
@@ -22,10 +22,6 @@
#include <type_traits>
#include <utility>

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

#include "../common/ann_types.hpp"
#include "../common/benchmark_util.hpp"
#undef WARP_SIZE
6 changes: 1 addition & 5 deletions cpp/bench/ann/src/raft/raft_ivf_flat.cu
@@ -15,12 +15,8 @@
*/
#include "raft_ivf_flat_wrapper.h"

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

namespace raft::bench::ann {
template class RaftIvfFlatGpu<float, int64_t>;
template class RaftIvfFlatGpu<uint8_t, int64_t>;
template class RaftIvfFlatGpu<int8_t, int64_t>;
} // namespace raft::bench::ann
} // namespace raft::bench::ann
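
The `template class RaftIvfFlatGpu<...>;` lines above are explicit instantiation definitions, which is the distinction behind this commit's rename from "specializations" to "instantiations" in the README. The following is a minimal sketch of that difference. `demo::FlatIndex`, its members, and `self_check` are hypothetical names used only for illustration; they are not part of RAFT or the benchmark code.

```cpp
#include <cassert>
#include <cstdint>

namespace demo {  // hypothetical namespace, not RAFT's

// Generic class template with a single definition shared by all argument lists.
template <typename T, typename IdxT>
struct FlatIndex {
  IdxT count = 0;
  void add(T) { ++count; }
  IdxT size() const { return count; }
};

// Explicit instantiation definitions: the compiler emits every member of the
// template for these argument lists in this translation unit, using the
// generic definition unchanged.
template struct FlatIndex<float, std::int64_t>;
template struct FlatIndex<std::uint8_t, std::int64_t>;

// By contrast, an explicit specialization replaces the generic definition
// with a different one for a particular argument list.
template <>
struct FlatIndex<bool, int> {
  int size() const { return -1; }
};

// Small check that exercises both forms.
inline void self_check()
{
  FlatIndex<float, std::int64_t> idx;
  idx.add(1.0f);
  idx.add(2.0f);
  assert(idx.size() == 2);  // generic definition, explicitly instantiated

  FlatIndex<bool, int> special;
  assert(special.size() == -1);  // specialized definition
}

}  // namespace demo
```

An explicit instantiation only forces code generation for an existing definition, so a precompiled library can ship those symbols without changing behavior for any type.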
1 change: 1 addition & 0 deletions cpp/bench/ann/src/raft/raft_ivf_flat_wrapper.h
@@ -29,6 +29,7 @@
#include <raft/neighbors/ivf_flat_types.hpp>
#include <raft/util/cudart_utils.hpp>
#include <rmm/device_uvector.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>
#include <stdexcept>
#include <string>
#include <type_traits>
4 changes: 0 additions & 4 deletions cpp/bench/ann/src/raft/raft_ivf_pq.cu
@@ -15,10 +15,6 @@
*/
#include "raft_ivf_pq_wrapper.h"

#ifdef RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

namespace raft::bench::ann {
template class RaftIvfPQ<float, int64_t>;
template class RaftIvfPQ<uint8_t, int64_t>;
12 changes: 9 additions & 3 deletions cpp/bench/prims/CMakeLists.txt
@@ -17,7 +17,7 @@

function(ConfigureBench)

set(options OPTIONAL LIB)
set(options OPTIONAL LIB EXPLICIT_INSTANTIATE_ONLY)
set(oneValueArgs NAME)
set(multiValueArgs PATH TARGETS CONFIGURATIONS)

@@ -55,6 +55,10 @@ function(ConfigureBench)
"$<$<COMPILE_LANGUAGE:CUDA>:${RAFT_CUDA_FLAGS}>"
)

if(ConfigureBench_EXPLICIT_INSTANTIATE_ONLY)
target_compile_definitions(${BENCH_NAME} PRIVATE "RAFT_EXPLICIT_INSTANTIATE_ONLY")
endif()

target_include_directories(
${BENCH_NAME} PUBLIC "$<BUILD_INTERFACE:${RAFT_SOURCE_DIR}/bench/prims>"
)
@@ -71,7 +75,7 @@ endfunction()
if(BUILD_PRIMS_BENCH)
ConfigureBench(
NAME CLUSTER_BENCH PATH bench/prims/cluster/kmeans_balanced.cu bench/prims/cluster/kmeans.cu
bench/prims/main.cpp OPTIONAL LIB
bench/prims/main.cpp OPTIONAL LIB EXPLICIT_INSTANTIATE_ONLY
)

ConfigureBench(
@@ -93,6 +97,7 @@ if(BUILD_PRIMS_BENCH)
bench/prims/main.cpp
OPTIONAL
LIB
EXPLICIT_INSTANTIATE_ONLY
)

ConfigureBench(
@@ -112,7 +117,7 @@ if(BUILD_PRIMS_BENCH)

ConfigureBench(
NAME MATRIX_BENCH PATH bench/prims/matrix/argmin.cu bench/prims/matrix/gather.cu
bench/prims/matrix/select_k.cu bench/prims/main.cpp OPTIONAL LIB
bench/prims/matrix/select_k.cu bench/prims/main.cpp OPTIONAL LIB EXPLICIT_INSTANTIATE_ONLY
)

ConfigureBench(
@@ -139,5 +144,6 @@ if(BUILD_PRIMS_BENCH)
bench/prims/main.cpp
OPTIONAL
LIB
EXPLICIT_INSTANTIATE_ONLY
)
endif()
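
The `EXPLICIT_INSTANTIATE_ONLY` option above adds the `RAFT_EXPLICIT_INSTANTIATE_ONLY` compile definition to a benchmark target. This diff does not show how RAFT headers react to that definition, so the following single-file sketch rests on an assumption: that the macro hides template definitions from user code so that only precompiled instantiations can be used. `DEMO_EXPLICIT_INSTANTIATE_ONLY`, `demo::twice`, `demo::answer`, and `demo::self_check` are hypothetical names, and the three commented sections would live in separate files in a real project.

```cpp
#include <cassert>

// Hypothetical stand-in for RAFT_EXPLICIT_INSTANTIATE_ONLY; a build system
// would normally pass this on the command line instead of defining it here.
#define DEMO_EXPLICIT_INSTANTIATE_ONLY

namespace demo {

// --- "public header" section ---
// The declaration is always visible.
template <typename T>
T twice(T x);

#ifndef DEMO_EXPLICIT_INSTANTIATE_ONLY
// Definition visible: any T can be implicitly instantiated by user code.
template <typename T>
T twice(T x) { return x + x; }
#endif

// Explicit instantiation declaration: twice<int> is provided elsewhere,
// so this translation unit must not instantiate it implicitly.
extern template int twice<int>(int);

// --- "user code" section ---
inline int answer() { return twice(21); }

// --- "library source" section (a separate source file in practice) ---
#ifdef DEMO_EXPLICIT_INSTANTIATE_ONLY
template <typename T>
T twice(T x) { return x + x; }
#endif

// Explicit instantiation definition: emits the symbol for twice<int>.
template int twice<int>(int);

inline void self_check() { assert(answer() == 42); }

}  // namespace demo
```

Under this assumption, a call with a type that has no precompiled instantiation would fail to link rather than silently compile a new copy, which is one way to keep benchmark build times predictable.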
4 changes: 0 additions & 4 deletions cpp/bench/prims/cluster/kmeans.cu
@@ -18,10 +18,6 @@
#include <raft/cluster/kmeans.cuh>
#include <raft/cluster/kmeans_types.hpp>

#if defined RAFT_COMPILED
#include <raft/cluster/specializations.cuh>
#endif

namespace raft::bench::cluster {

struct KMeansBenchParams {
4 changes: 0 additions & 4 deletions cpp/bench/prims/cluster/kmeans_balanced.cu
@@ -18,10 +18,6 @@
#include <raft/cluster/kmeans_balanced.cuh>
#include <raft/random/rng.cuh>

#if defined RAFT_COMPILED
#include <raft/cluster/specializations.cuh>
#endif

namespace raft::bench::cluster {

struct KMeansBalancedBenchParams {
3 changes: 0 additions & 3 deletions cpp/bench/prims/distance/distance_common.cuh
@@ -17,9 +17,6 @@
#include <common/benchmark.hpp>
#include <raft/distance/distance.cuh>
#include <raft/util/cudart_utils.hpp>
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif
#include <rmm/device_uvector.hpp>

namespace raft::bench::distance {
3 changes: 0 additions & 3 deletions cpp/bench/prims/distance/fused_l2_nn.cu
@@ -18,9 +18,6 @@
#include <raft/distance/fused_l2_nn.cuh>
#include <raft/linalg/norm.cuh>
#include <raft/util/cudart_utils.hpp>
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif
#include <rmm/device_uvector.hpp>

namespace raft::bench::distance {
4 changes: 0 additions & 4 deletions cpp/bench/prims/distance/kernels.cu
@@ -13,10 +13,6 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#if defined RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif

#include <common/benchmark.hpp>
#include <memory>
#include <raft/core/device_resources.hpp>
4 changes: 0 additions & 4 deletions cpp/bench/prims/distance/masked_nn.cu
@@ -30,10 +30,6 @@
#include <raft/random/rng.cuh>
#include <raft/util/cudart_utils.hpp>

#ifdef RAFT_COMPILED
#include <raft/distance/specializations.cuh>
#endif

namespace raft::bench::distance::masked_nn {

// Introduce various sparsity patterns
4 changes: 0 additions & 4 deletions cpp/bench/prims/matrix/select_k.cu
@@ -23,10 +23,6 @@
#include <raft/sparse/detail/utils.h>
#include <raft/util/cudart_utils.hpp>

#if defined RAFT_COMPILED
#include <raft/matrix/specializations.cuh>
#endif

#include <raft/matrix/detail/select_radix.cuh>
#include <raft/matrix/detail/select_warpsort.cuh>
#include <raft/matrix/select_k.cuh>
4 changes: 0 additions & 4 deletions cpp/bench/prims/neighbors/knn.cuh
@@ -24,10 +24,6 @@
#include <raft/neighbors/ivf_pq.cuh>
#include <raft/spatial/knn/knn.cuh>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

#include <rmm/mr/device/managed_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

5 changes: 0 additions & 5 deletions cpp/bench/prims/neighbors/refine_float_int64_t.cu
@@ -17,11 +17,6 @@
#include "refine.cuh"
#include <common/benchmark.hpp>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations/refine.cuh>
#include <raft/spatial/knn/specializations.cuh>
#endif

using namespace raft::neighbors;

namespace raft::bench::neighbors {
4 changes: 0 additions & 4 deletions cpp/bench/prims/neighbors/refine_uint8_t_int64_t.cu
@@ -17,10 +17,6 @@
#include "refine.cuh"
#include <common/benchmark.hpp>

#if defined RAFT_COMPILED
#include <raft/neighbors/specializations.cuh>
#endif

using namespace raft::neighbors;

namespace raft::bench::neighbors {
1 change: 1 addition & 0 deletions cpp/doxygen/Doxyfile
@@ -918,6 +918,7 @@ EXCLUDE_SYMLINKS = NO
# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories for example use the pattern */test/*

# TODO: remove specializations from exclude patterns when headers have been removed.
EXCLUDE_PATTERNS = */detail/* \
*/specializations/* \
*/thirdparty/*
1 change: 1 addition & 0 deletions cpp/include/raft/cluster/detail/kmeans_common.cuh
@@ -38,6 +38,7 @@
#include <raft/distance/distance.cuh>
#include <raft/distance/distance_types.hpp>
#include <raft/distance/fused_l2_nn.cuh>
#include <raft/linalg/norm.cuh>
#include <raft/linalg/reduce_rows_by_key.cuh>
#include <raft/linalg/unary_op.cuh>
#include <raft/matrix/gather.cuh>
12 changes: 5 additions & 7 deletions cpp/include/raft/cluster/specializations.cuh
@@ -13,12 +13,10 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef __CLUSTER_SPECIALIZATIONS_H
#define __CLUSTER_SPECIALIZATIONS_H

#pragma once

#include <raft/distance/specializations.cuh>
#include <raft/neighbors/specializations.cuh>

#endif
#pragma message( \
__FILE__ \
" is deprecated and will be removed." \
" Including specializations is not necessary any more." \
" For more information, see: https://docs.rapids.ai/api/raft/nightly/using_libraft.html")
36 changes: 35 additions & 1 deletion cpp/include/raft/core/detail/macros.hpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION.
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -40,6 +40,40 @@
#define RAFT_INLINE_FUNCTION _RAFT_HOST_DEVICE _RAFT_FORCEINLINE
#endif

// The RAFT_INLINE_CONDITIONAL is a conditional inline specifier that removes
// the inline specification when RAFT_COMPILED is defined.
//
// When RAFT_COMPILED is not defined, functions may be defined in multiple
// translation units and we do not want that to lead to linker errors.
//
// When RAFT_COMPILED is defined, this serves two purposes:
//
// 1. It triggers a multiple definition error message when memory_pool-inl.hpp
// (for instance) is accidentally included in multiple translation units.
//
// 2. We want function definitions to be non-inline, because the symbols of non-inline
// functions are always exported in the object symbol table. For inline functions,
// the compiler may elide the external symbol, which results in linker errors.
#ifdef RAFT_COMPILED
#define RAFT_INLINE_CONDITIONAL
#else
#define RAFT_INLINE_CONDITIONAL inline
#endif // RAFT_COMPILED

// The RAFT_WEAK_FUNCTION macro specifies that:
//
// 1. A function may be defined in multiple translation units (like inline).
//
// 2. The function must still emit an external symbol (unlike inline). This enables
// declaring a function signature in an `-ext` header and defining it in a source file.
//
// From
// https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes:
//
// "The weak attribute causes a declaration of an external symbol to be emitted
// as a weak symbol rather than a global."
#define RAFT_WEAK_FUNCTION __attribute__((weak))

/**
* Some macro magic to remove optional parentheses of a macro argument.
* See https://stackoverflow.com/a/62984543
Expand Down
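
The two macros added in this hunk, `RAFT_INLINE_CONDITIONAL` and `RAFT_WEAK_FUNCTION`, can be tried in isolation. The sketch below copies the macro definitions from the diff and applies them to two functions; `demo::pool_size_hint`, `demo::default_block_size`, and `demo::self_check` are hypothetical names, not RAFT APIs. Note that `__attribute__((weak))` is a GCC and Clang extension, so this assumes one of those compilers.

```cpp
#include <cassert>

// Uncomment to mimic a build against the precompiled library.
// #define RAFT_COMPILED

// Macro definitions as they appear in the diff above.
#ifdef RAFT_COMPILED
#define RAFT_INLINE_CONDITIONAL
#else
#define RAFT_INLINE_CONDITIONAL inline
#endif  // RAFT_COMPILED

#define RAFT_WEAK_FUNCTION __attribute__((weak))

namespace demo {  // hypothetical namespace, not RAFT's

// Header-only build: expands to `inline`, so the same definition may appear
// in many translation units without a multiple-definition error.
// Compiled build: expands to nothing, so including this definition in a
// second translation unit is reported by the linker.
RAFT_INLINE_CONDITIONAL int pool_size_hint(int n) { return n * 2; }

// Weak symbol: may be defined in several translation units (like inline),
// yet the compiler still emits an external symbol for it (unlike inline).
RAFT_WEAK_FUNCTION int default_block_size() { return 256; }

inline void self_check()
{
  assert(pool_size_hint(4) == 8);
  assert(default_block_size() == 256);
}

}  // namespace demo
```

Both macros address the same tension between header-only use, where duplicate definitions must be tolerated, and precompiled use, where a symbol must reliably exist in exactly one object file.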