diff --git a/README.md b/README.md index cc32e4d404..79ab874c27 100755 --- a/README.md +++ b/README.md @@ -99,7 +99,7 @@ pairwise_distance(in1, in2, output, metric="euclidean") ## Installing -RAFT itself can be installed through conda, [Cmake Package Manager (CPM)](https://github.com/cpm-cmake/CPM.cmake), or by building the repository from source. Please refer to the [build instructions](BUILD.md) for more a comprehensive guide on building RAFT and using it in downstream projects. +RAFT itself can be installed through conda, [Cmake Package Manager (CPM)](https://github.com/cpm-cmake/CPM.cmake), or by building the repository from source. Please refer to the [build instructions](docs/source/build.md) for more a comprehensive guide on building RAFT and using it in downstream projects. ### Conda @@ -119,7 +119,7 @@ You can also install the `libraft-*` conda packages individually using the `mamb After installing RAFT, `find_package(raft COMPONENTS nn distance)` can be used in your CUDA/C++ cmake build to compile and/or link against needed dependencies in your raft target. `COMPONENTS` are optional and will depend on the packages installed. -### CPM +### Cmake & CPM RAFT uses the [RAPIDS-CMake](https://github.com/rapidsai/rapids-cmake) library, which makes it simple to include in downstream cmake projects. RAPIDS CMake provides a convenience layer around CPM. @@ -186,7 +186,7 @@ mamba activate raft_dev_env ./build.sh raft-dask pylibraft libraft tests bench --compile-libs ``` -The [build](BUILD.md) instructions contain more details on building RAFT from source and including it in downstream projects. You can also find a more comprehensive version of the above CPM code snippet the [Building RAFT C++ from source](BUILD.md#build_cxx_source) section of the build instructions. +The [build](docs/source/build.md) instructions contain more details on building RAFT from source and including it in downstream projects. 
You can also find a more comprehensive version of the above CPM code snippet in the [Building RAFT C++ from source](docs/source/build.md#building-raft-c-from-source-in-cmake) section of the build instructions. ## Folder Structure and Contents @@ -220,7 +220,7 @@ The folder structure mirrors other RAPIDS repos, with the following folders: - `scripts`: Helpful scripts for development - `src`: Compiled APIs and template specializations for the shared libraries - `test`: Googletests source code -- `docs`: Source code and scripts for building library documentation (doxygen + pydocs) +- `docs`: Source code and scripts for building library documentation (Uses breathe, doxygen, & pydocs) - `python`: Source code for Python libraries. - `pylibraft`: Python build and source code for pylibraft library - `raft-dask`: Python build and source code for raft-dask library diff --git a/build.sh b/build.sh index 9548fbec44..61e6d1a007 100755 --- a/build.sh +++ b/build.sh @@ -227,18 +227,50 @@ fi if hasArg tests || (( ${NUMARGS} == 0 )); then BUILD_TESTS=ON - COMPILE_DIST_LIBRARY=ON - ENABLE_NN_DEPENDENCIES=ON - COMPILE_NN_LIBRARY=ON CMAKE_TARGET="${CMAKE_TARGET};${TEST_TARGETS}" + + # Force compile nn library when needed test targets are specified + if [[ $CMAKE_TARGET == *"CLUSTER_TEST"* || \ + $CMAKE_TARGET == *"SPARSE_DIST_TEST"* || \ + $CMAKE_TARGET == *"SPARSE_NEIGHBORS_TEST"* || \ + $CMAKE_TARGET == *"NEIGHBORS_TEST"* || \ + $CMAKE_TARGET == *"STATS_TEST"* ]]; then + echo "-- Enabling nearest neighbors lib for gtests" + ENABLE_NN_DEPENDENCIES=ON + COMPILE_NN_LIBRARY=ON + fi + + # Force compile distance library when needed test targets are specified + if [[ $CMAKE_TARGET == *"CLUSTER_TEST"* || \ + $CMAKE_TARGET == *"DISTANCE_TEST"* || \ + $CMAKE_TARGET == *"SPARSE_DIST_TEST"* || \ + $CMAKE_TARGET == *"SPARSE_NEIGHBORS_TEST"* || \ + $CMAKE_TARGET == *"NEIGHBORS_TEST"* || \ + $CMAKE_TARGET == *"STATS_TEST"* ]]; then + echo "-- Enabling distance lib for gtests" + COMPILE_DIST_LIBRARY=ON + 
fi fi if hasArg bench || (( ${NUMARGS} == 0 )); then BUILD_BENCH=ON - COMPILE_DIST_LIBRARY=ON - ENABLE_NN_DEPENDENCIES=ON - COMPILE_NN_LIBRARY=ON CMAKE_TARGET="${CMAKE_TARGET};${BENCH_TARGETS}" + + # Force compile nn library when needed benchmark targets are specified + if [[ $CMAKE_TARGET == *"CLUSTER_BENCH"* || \ + $CMAKE_TARGET == *"NEIGHBORS_BENCH"* ]]; then + echo "-- Enabling nearest neighbors lib for benchmarks" + ENABLE_NN_DEPENDENCIES=ON + COMPILE_NN_LIBRARY=ON + fi + + # Force compile distance library when needed benchmark targets are specified + if [[ $CMAKE_TARGET == *"CLUSTER_BENCH"* || \ + $CMAKE_TARGET == *"NEIGHBORS_BENCH"* ]]; then + echo "-- Enabling distance lib for benchmarks" + COMPILE_DIST_LIBRARY=ON + fi + fi if hasArg --buildfaiss; then diff --git a/cpp/doxygen/Doxyfile.in b/cpp/doxygen/Doxyfile.in index 5517562a9f..07056e503d 100644 --- a/cpp/doxygen/Doxyfile.in +++ b/cpp/doxygen/Doxyfile.in @@ -900,7 +900,9 @@ EXCLUDE = @CMAKE_CURRENT_SOURCE_DIR@/include/raft/sparse/linalg/s @CMAKE_CURRENT_SOURCE_DIR@/include/raft/span.hpp \ @CMAKE_CURRENT_SOURCE_DIR@/include/raft/vectorized.cuh \ @CMAKE_CURRENT_SOURCE_DIR@/include/raft/raft.hpp \ - @CMAKE_CURRENT_SOURCE_DIR@/include/raft/core/cudart_utils.hpp + @CMAKE_CURRENT_SOURCE_DIR@/include/raft/core/cudart_utils.hpp \ + @CMAKE_CURRENT_SOURCE_DIR@/include/raft/matrix/math.cuh \ + @CMAKE_CURRENT_SOURCE_DIR@/include/raft/matrix/matrix.cuh # The EXCLUDE_SYMLINKS tag can be used to select whether or not files or # directories that are symbolic links (a Unix file system feature) are excluded diff --git a/cpp/include/raft/cluster/kmeans.cuh b/cpp/include/raft/cluster/kmeans.cuh index 2025a15ecf..ef1fb44dfd 100644 --- a/cpp/include/raft/cluster/kmeans.cuh +++ b/cpp/include/raft/cluster/kmeans.cuh @@ -23,11 +23,45 @@ namespace raft::cluster::kmeans { +/** + * Functor used for sampling centroids + */ +template +using SamplingOp = detail::SamplingOp; + +/** + * Functor used to extract the index from a KeyValue 
pair + * storing both index and a distance. + */ +template +using KeyValueIndexOp = detail::KeyValueIndexOp; + /** * @brief Find clusters with k-means algorithm. * Initial centroids are chosen with k-means++ algorithm. Empty * clusters are reinitialized by choosing new centroids with * k-means++ algorithm. + * + * @code{.cpp} + * #include + * #include + * #include + * using namespace raft::cluster; + * ... + * raft::handle_t handle; + * raft::cluster::KMeansParams params; + * int n_features = 15, inertia, n_iter; + * auto centroids = raft::make_device_matrix(handle, params.n_clusters, n_features); + * + * kmeans::fit(handle, + * params, + * X, + * std::nullopt, + * centroids, + * raft::make_scalar_view(&inertia), + * raft::make_scalar_view(&n_iter)); + * @endcode + * * @tparam DataT the type of data used for weights, distances. * @tparam IndexT the type of data used for indexing. * @param[in] handle The raft handle. @@ -47,7 +81,7 @@ namespace raft::cluster::kmeans { * closest cluster center. * @param[out] n_iter Number of iterations run. */ -template +template void fit(handle_t const& handle, const KMeansParams& params, raft::device_matrix_view X, @@ -59,23 +93,40 @@ void fit(handle_t const& handle, detail::kmeans_fit(handle, params, X, sample_weight, centroids, inertia, n_iter); } -template -void fit(handle_t const& handle, - const KMeansParams& params, - const DataT* X, - const DataT* sample_weight, - DataT* centroids, - IndexT n_samples, - IndexT n_features, - DataT& inertia, - IndexT& n_iter) -{ - detail::kmeans_fit( - handle, params, X, sample_weight, centroids, n_samples, n_features, inertia, n_iter); -} - /** * @brief Predict the closest cluster each sample in X belongs to. + * + * @code{.cpp} + * #include + * #include + * #include + * using namespace raft::cluster; + * ... 
+ * raft::handle_t handle; + * raft::cluster::KMeansParams params; + * int n_features = 15, inertia, n_iter; + * auto centroids = raft::make_device_matrix(handle, params.n_clusters, n_features); + * + * kmeans::fit(handle, + * params, + * X, + * std::nullopt, + * centroids.view(), + * raft::make_scalar_view(&inertia), + * raft::make_scalar_view(&n_iter)); + * ... + * auto labels = raft::make_device_vector(handle, X.extent(0)); + * + * kmeans::predict(handle, + * params, + * X, + * std::nullopt, + * centroids.view(), + * false, + * labels.view(), + * raft::make_scalar_view(&inertia)); + * @endcode + * * @tparam DataT the type of data used for weights, distances. * @tparam IndexT the type of data used for indexing. * @param[in] handle The raft handle. @@ -94,7 +145,7 @@ void fit(handle_t const& handle, * @param[out] inertia Sum of squared distances of samples to * their closest cluster center. */ -template +template void predict(handle_t const& handle, const KMeansParams& params, raft::device_matrix_view X, @@ -108,34 +159,32 @@ void predict(handle_t const& handle, handle, params, X, sample_weight, centroids, labels, normalize_weight, inertia); } -template -void predict(handle_t const& handle, - const KMeansParams& params, - const DataT* X, - const DataT* sample_weight, - const DataT* centroids, - IndexT n_samples, - IndexT n_features, - IndexT* labels, - bool normalize_weight, - DataT& inertia) -{ - detail::kmeans_predict(handle, - params, - X, - sample_weight, - centroids, - n_samples, - n_features, - labels, - normalize_weight, - inertia); -} - /** * @brief Compute k-means clustering and predicts cluster index for each sample * in the input. * + * @code{.cpp} + * #include + * #include + * #include + * using namespace raft::cluster; + * ... 
+ * raft::handle_t handle; + * raft::cluster::KMeansParams params; + * int n_features = 15, inertia, n_iter; + * auto centroids = raft::make_device_matrix(handle, params.n_clusters, n_features); + * auto labels = raft::make_device_vector(handle, X.extent(0)); + * + * kmeans::fit_predict(handle, + * params, + * X, + * std::nullopt, + * centroids.view(), + * labels.view(), + * raft::make_scalar_view(&inertia), + * raft::make_scalar_view(&n_iter)); + * @endcode + * * @tparam DataT the type of data used for weights, distances. * @tparam IndexT the type of data used for indexing. * @param[in] handle The raft handle. @@ -159,7 +208,7 @@ void predict(handle_t const& handle, * closest cluster center. * @param[out] n_iter Number of iterations run. */ -template +template void fit_predict(handle_t const& handle, const KMeansParams& params, raft::device_matrix_view X, @@ -173,22 +222,6 @@ void fit_predict(handle_t const& handle, handle, params, X, sample_weight, centroids, labels, inertia, n_iter); } -template -void fit_predict(handle_t const& handle, - const KMeansParams& params, - const DataT* X, - const DataT* sample_weight, - DataT* centroids, - IndexT n_samples, - IndexT n_features, - IndexT* labels, - DataT& inertia, - IndexT& n_iter) -{ - detail::kmeans_fit_predict( - handle, params, X, sample_weight, centroids, n_samples, n_features, labels, inertia, n_iter); -} - /** * @brief Transform X to a cluster-distance space. * @@ -204,7 +237,7 @@ void fit_predict(handle_t const& handle, * @param[out] X_new X transformed in the new space. 
* [dim = n_samples x n_features] */ -template +template void transform(const raft::handle_t& handle, const KMeansParams& params, raft::device_matrix_view X, @@ -214,7 +247,7 @@ void transform(const raft::handle_t& handle, detail::kmeans_transform(handle, params, X, centroids, X_new); } -template +template void transform(const raft::handle_t& handle, const KMeansParams& params, const DataT* X, @@ -227,12 +260,6 @@ void transform(const raft::handle_t& handle, handle, params, X, centroids, n_samples, n_features, X_new); } -template -using SamplingOp = detail::SamplingOp; - -template -using KeyValueIndexOp = detail::KeyValueIndexOp; - /** * @brief Select centroids according to a sampling operation * @@ -252,7 +279,7 @@ using KeyValueIndexOp = detail::KeyValueIndexOp; * @param[in] workspace Temporary workspace buffer which can get resized * */ -template +template void sample_centroids(const raft::handle_t& handle, raft::device_matrix_view X, raft::device_vector_view minClusterDistance, @@ -279,7 +306,7 @@ void sample_centroids(const raft::handle_t& handle, * @param[in] reduction_op The reduction operation used for the cost * */ -template +template void cluster_cost(const raft::handle_t& handle, raft::device_vector_view minClusterDistance, rmm::device_uvector workspace, @@ -424,11 +451,10 @@ void count_samples_in_cluster(const raft::handle_t& handle, handle, params, X, L2NormX, centroids, workspace, sampleCountInCluster); } -/* +/** * @brief Selects 'n_clusters' samples from the input X using kmeans++ algorithm. - - * @note This is the algorithm described in - * "k-means++: the advantages of careful seeding". 2007, Arthur, D. and Vassilvitskii, S. + * + * @see "k-means++: the advantages of careful seeding". 2007, Arthur, D. and Vassilvitskii, S. * ACM-SIAM symposium on Discrete algorithms. * * @tparam DataT the type of data used for weights, distances. 
@@ -446,10 +472,10 @@ template void init_plus_plus(const raft::handle_t& handle, const KMeansParams& params, raft::device_matrix_view X, - raft::device_matrix_view centroidsRawData, + raft::device_matrix_view centroids, rmm::device_uvector& workspace) { - detail::kmeansPlusPlus(handle, params, X, centroidsRawData, workspace); + detail::kmeansPlusPlus(handle, params, X, centroids, workspace); } /* @@ -480,13 +506,13 @@ void fit_main(const raft::handle_t& handle, const KMeansParams& params, raft::device_matrix_view X, raft::device_vector_view weight, - raft::device_matrix_view centroidsRawData, + raft::device_matrix_view centroids, raft::host_scalar_view inertia, raft::host_scalar_view n_iter, rmm::device_uvector& workspace) { detail::kmeans_fit_main( - handle, params, X, weight, centroidsRawData, inertia, n_iter, workspace); + handle, params, X, weight, centroids, inertia, n_iter, workspace); } }; // end namespace raft::cluster::kmeans @@ -701,7 +727,7 @@ void kmeans_transform(const raft::handle_t& handle, kmeans::transform(handle, params, X, centroids, n_samples, n_features, X_new); } -template +template using SamplingOp = kmeans::SamplingOp; template diff --git a/cpp/include/raft/cluster/kmeans_params.hpp b/cpp/include/raft/cluster/kmeans_params.hpp index 433e32f5ff..a1532d9dd4 100644 --- a/cpp/include/raft/cluster/kmeans_params.hpp +++ b/cpp/include/raft/cluster/kmeans_params.hpp @@ -1,5 +1,5 @@ /* - * Copyright (c) 2020-2022, NVIDIA CORPORATION. + * Copyright (c) 2022, NVIDIA CORPORATION. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -13,15 +13,6 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -/** - * This file is deprecated and will be removed in release 22.06. - * Please use the cuh version instead. 
- */ - -/** - * DISCLAIMER: this file is deprecated: use lap.cuh instead - */ - #pragma once #pragma message(__FILE__ \ diff --git a/cpp/include/raft/cluster/kmeans_types.hpp b/cpp/include/raft/cluster/kmeans_types.hpp index d6eadd1ba6..f411b12b5c 100644 --- a/cpp/include/raft/cluster/kmeans_types.hpp +++ b/cpp/include/raft/cluster/kmeans_types.hpp @@ -20,14 +20,34 @@ namespace raft::cluster::kmeans { +/** + * Simple object to specify hyper-parameters to the kmeans algorithm. + */ struct KMeansParams { - enum InitMethod { KMeansPlusPlus, Random, Array }; - - // The number of clusters to form as well as the number of centroids to - // generate (default:8). + enum InitMethod { + + /** + * Sample the centroids using the kmeans++ strategy + */ + KMeansPlusPlus, + + /** + * Sample the centroids uniformly at random + */ + Random, + + /** + * User provides the array of initial centroids + */ + Array + }; + + /** + * The number of clusters to form as well as the number of centroids to generate (default:8). + */ int n_clusters = 8; - /* + /** * Method for initialization, defaults to k-means++: * - InitMethod::KMeansPlusPlus (k-means++): Use scalable k-means++ algorithm * to select the initial cluster centers. @@ -37,34 +57,52 @@ struct KMeansParams { */ InitMethod init = KMeansPlusPlus; - // Maximum number of iterations of the k-means algorithm for a single run. + /** + * Maximum number of iterations of the k-means algorithm for a single run. + */ int max_iter = 300; - // Relative tolerance with regards to inertia to declare convergence. + /** + * Relative tolerance with regards to inertia to declare convergence. + */ double tol = 1e-4; - // verbosity level. + /** + * verbosity level. + */ int verbosity = RAFT_LEVEL_INFO; - // Seed to the random number generator. + /** + * Seed to the random number generator. + */ raft::random::RngState rng_state = raft::random::RngState(0, raft::random::GeneratorType::GenPhilox); - // Metric to use for distance computation. 
+ /** + * Metric to use for distance computation. + */ raft::distance::DistanceType metric = raft::distance::DistanceType::L2Expanded; - // Number of instance k-means algorithm will be run with different seeds. + /** + * Number of instance k-means algorithm will be run with different seeds. + */ int n_init = 1; - // Oversampling factor for use in the k-means|| algorithm. + /** + * Oversampling factor for use in the k-means|| algorithm + */ double oversampling_factor = 2.0; // batch_samples and batch_centroids are used to tile 1NN computation which is // useful to optimize/control the memory footprint // Default tile is [batch_samples x n_clusters] i.e. when batch_centroids is 0 // then don't tile the centroids - int batch_samples = 1 << 15; - int batch_centroids = 0; // if 0 then batch_centroids = n_clusters + int batch_samples = 1 << 15; + + /** + * if 0 then batch_centroids = n_clusters + */ + int batch_centroids = 0; // bool inertia_check = false; }; diff --git a/cpp/include/raft/cluster/single_linkage_types.hpp b/cpp/include/raft/cluster/single_linkage_types.hpp index 55239ff6d6..9a4fcfef60 100644 --- a/cpp/include/raft/cluster/single_linkage_types.hpp +++ b/cpp/include/raft/cluster/single_linkage_types.hpp @@ -19,17 +19,35 @@ #include namespace raft::cluster::hierarchy { -enum LinkageDistance { PAIRWISE = 0, KNN_GRAPH = 1 }; + +/** + * Determines the method for computing the minimum spanning tree (MST) + */ +enum LinkageDistance { + + /** + * Use a pairwise distance matrix as input to the mst. This + * is very fast and the best option for fairly small datasets (~50k data points) + */ + PAIRWISE = 0, + + /** + * Construct a KNN graph as input to the mst and provide additional + * edges if the mst does not converge. This is slower but scales + * to very large datasets. 
+ */ + KNN_GRAPH = 1 +}; }; // end namespace raft::cluster::hierarchy -// The code below is legacy +// The code below is now considered legacy namespace raft::cluster { using hierarchy::LinkageDistance; /** - * Simple POCO for consolidating linkage results. This closely + * Simple container object for consolidating linkage results. This closely * mirrors the trained instance variables populated in * Scikit-learn's AgglomerativeClustering estimator. * @tparam value_idx @@ -64,4 +82,4 @@ class linkage_output_int : public linkage_output { class linkage_output_int64 : public linkage_output { }; -}; // namespace raft::cluster \ No newline at end of file +}; // namespace raft::cluster diff --git a/cpp/include/raft/core/detail/device_mdarray.hpp b/cpp/include/raft/core/detail/device_mdarray.hpp index ff7c31000d..ad6831794e 100644 --- a/cpp/include/raft/core/detail/device_mdarray.hpp +++ b/cpp/include/raft/core/detail/device_mdarray.hpp @@ -25,8 +25,8 @@ #include #include -#include #include // dynamic_extent +#include #include #include diff --git a/cpp/include/raft/core/device_mdspan.hpp b/cpp/include/raft/core/device_mdspan.hpp index 2fc43e2a05..ffbbe43d01 100644 --- a/cpp/include/raft/core/device_mdspan.hpp +++ b/cpp/include/raft/core/device_mdspan.hpp @@ -16,16 +16,16 @@ #pragma once -#include +#include #include namespace raft { template -using device_accessor = detail::host_device_accessor; +using device_accessor = host_device_accessor; template -using managed_accessor = detail::host_device_accessor; +using managed_accessor = host_device_accessor; /** * @brief std::experimental::mdspan with device tag to avoid accessing incorrect memory location. 
diff --git a/cpp/include/raft/core/detail/host_device_accessor.hpp b/cpp/include/raft/core/host_device_accessor.hpp similarity index 86% rename from cpp/include/raft/core/detail/host_device_accessor.hpp rename to cpp/include/raft/core/host_device_accessor.hpp index 3a71e6366b..4f6f559be4 100644 --- a/cpp/include/raft/core/detail/host_device_accessor.hpp +++ b/cpp/include/raft/core/host_device_accessor.hpp @@ -16,10 +16,12 @@ #pragma once -namespace raft::detail { +namespace raft { /** - * @brief A mixin to distinguish host and device memory. + * @brief A mixin to distinguish host and device memory. This is the primary + * accessor used throughout RAFT's APIs to denote whether an underlying pointer + * is accessible from device, host, or both. */ template struct host_device_accessor : public AccessorPolicy { @@ -36,4 +38,4 @@ struct host_device_accessor : public AccessorPolicy { host_device_accessor(AccessorPolicy const& that) : AccessorPolicy{that} {} // NOLINT }; -} // namespace raft::detail +} // namespace raft diff --git a/cpp/include/raft/core/host_mdspan.hpp b/cpp/include/raft/core/host_mdspan.hpp index fc2a9bbd6d..3e76dbb9ce 100644 --- a/cpp/include/raft/core/host_mdspan.hpp +++ b/cpp/include/raft/core/host_mdspan.hpp @@ -18,12 +18,12 @@ #include -#include +#include namespace raft { template -using host_accessor = detail::host_device_accessor; +using host_accessor = host_device_accessor; /** * @brief std::experimental::mdspan with host tag to avoid accessing incorrect memory location. 
diff --git a/cpp/include/raft/core/mdarray.hpp b/cpp/include/raft/core/mdarray.hpp index 44730d901e..ae5d236395 100644 --- a/cpp/include/raft/core/mdarray.hpp +++ b/cpp/include/raft/core/mdarray.hpp @@ -24,8 +24,8 @@ #include -#include #include +#include #include #include #include @@ -154,13 +154,12 @@ class mdarray std::conditional_t, typename container_policy_type::const_accessor_policy, typename container_policy_type::accessor_policy>> - using view_type_impl = - mdspan>; + using view_type_impl = mdspan>; public: /** diff --git a/cpp/include/raft/core/mdspan.hpp b/cpp/include/raft/core/mdspan.hpp index a858633e07..1b98a7a937 100644 --- a/cpp/include/raft/core/mdspan.hpp +++ b/cpp/include/raft/core/mdspan.hpp @@ -18,9 +18,9 @@ #include #include -#include #include #include +#include #include @@ -149,10 +149,9 @@ template auto make_mdspan(ElementType* ptr, extents exts) { - using accessor_type = - detail::host_device_accessor, - is_host_accessible, - is_device_accessible>; + using accessor_type = host_device_accessor, + is_host_accessible, + is_device_accessible>; return mdspan{ptr, exts}; } diff --git a/cpp/include/raft/neighbors/ball_cover.cuh b/cpp/include/raft/neighbors/ball_cover.cuh index 780a9cfce2..28ff8491b6 100644 --- a/cpp/include/raft/neighbors/ball_cover.cuh +++ b/cpp/include/raft/neighbors/ball_cover.cuh @@ -30,6 +30,23 @@ namespace raft::neighbors::ball_cover { /** * Builds and populates a previously unbuilt BallCoverIndex + * + * Usage example: + * @code{.cpp} + * + * #include + * #include + * #include + * using namespace raft::neighbors; + * + * raft::handle_t handle; + * ... 
+ * auto metric = raft::distance::DistanceType::L2Expanded; + * BallCoverIndex index(handle, X, metric); + * + * ball_cover::build_index(handle, index); + * @endcode + * * @tparam idx_t knn index type * @tparam value_t knn value type * @tparam int_t integral type for knn params @@ -130,10 +147,31 @@ void all_knn_query(const raft::handle_t& handle, * the index and query are the same array. This function will * build the index and assumes rbc_build_index() has not already * been called. + * + * Usage example: + * @code{.cpp} + * + * #include + * #include + * #include + * using namespace raft::neighbors; + * + * raft::handle_t handle; + * ... + * auto metric = raft::distance::DistanceType::L2Expanded; + * + * // Construct a ball cover index + * BallCoverIndex index(handle, X, metric); + * + * // Perform all neighbors knn query + * ball_cover::all_knn_query(handle, index, inds, dists, k); + * @endcode + * * @tparam idx_t knn index type * @tparam value_t knn distance type * @tparam int_t type for integers, such as number of rows/cols * @tparam matrix_idx_t matrix indexing type + * * @param[in] handle raft handle for resource management * @param[in] index ball cover index which has not yet been built * @param[out] inds output knn indices @@ -250,6 +288,28 @@ void knn_query(const raft::handle_t& handle, * function does not build the index and assumes rbc_build_index() has * already been called. Use this function when the index and * query arrays are different, otherwise use rbc_all_knn_query(). + * + * Usage example: + * @code{.cpp} + * + * #include + * #include + * #include + * using namespace raft::neighbors; + * + * raft::handle_t handle; + * ... 
+ * auto metric = raft::distance::DistanceType::L2Expanded; + * + * // Build a ball cover index + * BallCoverIndex index(handle, X, metric); + * ball_cover::build_index(handle, index); + * + * // Perform all neighbors knn query + * ball_cover::knn_query(handle, index, inds, dists, k); + * @endcode + + * * @tparam idx_t index type * @tparam value_t distances type * @tparam int_t integer type for size info diff --git a/cpp/include/raft/neighbors/brute_force.cuh b/cpp/include/raft/neighbors/brute_force.cuh index 3641a38991..772ccb67d2 100644 --- a/cpp/include/raft/neighbors/brute_force.cuh +++ b/cpp/include/raft/neighbors/brute_force.cuh @@ -23,26 +23,52 @@ namespace raft::neighbors::brute_force { /** - * @brief Performs a k-select across row partitioned index/distance + * @brief Performs a k-select across several (contiguous) row-partitioned index/distance * matrices formatted like the following: - * row1: k0, k1, k2 - * row2: k0, k1, k2 - * row3: k0, k1, k2 - * row1: k0, k1, k2 - * row2: k0, k1, k2 - * row3: k0, k1, k2 * + * part1row1: k0, k1, k2, k3 + * part1row2: k0, k1, k2, k3 + * part1row3: k0, k1, k2, k3 + * part2row1: k0, k1, k2, k3 + * part2row2: k0, k1, k2, k3 + * part2row3: k0, k1, k2, k3 * etc... * + * The example above shows what an aggregated index/distance matrix + * would look like with two partitions when n_samples=3 and k=4. + * + * When working with extremely large data sets that have been broken + * over multiple indexes, such as when computing over multiple GPUs, + * the ids will often start at 0 for each local knn index but the + * global ids need to be used when merging them together. An optional + * translations vector can be supplied to map the starting id of + * each partition to its global id so that the final merged knn + * is based on the global ids. + * + * Usage example: + * @code{.cpp} + * #include + * #include + * using namespace raft::neighbors; + * + * raft::handle_t handle; + * ... 
+ * compute multiple knn graphs and aggregate row-wise + * (see detailed description above) + * ... + * brute_force::knn_merge_parts(handle, in_keys, in_values, out_keys, out_values, n_samples); + * @endcode + * * @tparam idx_t * @tparam value_t + * * @param[in] handle * @param[in] in_keys matrix of input keys (size n_samples * n_parts * k) * @param[in] in_values matrix of input values (size n_samples * n_parts * k) * @param[out] out_keys matrix of output keys (size n_samples * k) * @param[out] out_values matrix of output values (size n_samples * k) - * @param[in] n_samples number of rows in each part - * @param[in] translations optional vector of starting index mappings for each partition + * @param[in] n_samples number of rows in each partition + * @param[in] translations optional vector of starting global id mappings for each local partition */ template inline void knn_merge_parts( @@ -81,17 +107,31 @@ inline void knn_merge_parts( * row- or column-major but the output matrices will always be in * row-major format. * - * @param[in] handle the cuml handle to use - * @param[in] index vector of device matrices (each size m_i*d) to be used as the knn index - * @param[in] search matrix (size n*d) to be used for searching the index - * @param[out] indices matrix (size n*k) to store output knn indices - * @param[out] distances matrix (size n*k) to store the output knn distance - * @param[in] k the number of nearest neighbors to return - * @param[in] metric distance metric to use. Euclidean (L2) is used by default - * @param[in] metric_arg the value of `p` for Minkowski (l-p) distances. This + * Usage example: + * @code{.cpp} + * #include + * #include + * #include + * using namespace raft::neighbors; + * + * raft::handle_t handle; + * ... 
+ * int k = 10; + * auto metric = raft::distance::DistanceType::L2SqrtExpanded; + * brute_force::knn(handle, index, search, indices, distances, k, metric); + * @endcode + * + * @param[in] handle: the cuml handle to use + * @param[in] index: vector of device matrices (each size m_i*d) to be used as the knn index + * @param[in] search: matrix (size n*d) to be used for searching the index + * @param[out] indices: matrix (size n*k) to store output knn indices + * @param[out] distances: matrix (size n*k) to store the output knn distance + * @param[in] k: the number of nearest neighbors to return + * @param[in] metric: distance metric to use. Euclidean (L2) is used by default + * @param[in] metric_arg: the value of `p` for Minkowski (l-p) distances. This * is ignored if the metric_type is not Minkowski. - * @param[in] translations starting offsets for partitions. should be the same size - * as input vector. + * @param[in] global_id_offset: optional starting global id mapping for the local partition + * (assumes the index contains contiguous ids in the global id space) */ template indices, raft::device_matrix_view distances, value_int k, - distance::DistanceType metric = distance::DistanceType::L2Unexpanded, - std::optional metric_arg = std::make_optional(2.0f), - std::optional> translations = std::nullopt) + distance::DistanceType metric = distance::DistanceType::L2Unexpanded, + std::optional metric_arg = std::make_optional(2.0f), + std::optional global_id_offset = std::nullopt) { RAFT_EXPECTS(index[0].extent(1) == search.extent(1), "Number of dimensions for both index and search matrices must be equal"); @@ -129,7 +169,10 @@ void knn(raft::handle_t const& handle, sizes.push_back(index[i].extent(0)); } - std::vector* trans = translations.has_value() ? &(*translations) : nullptr; + std::vector trans; + if (global_id_offset.has_value()) { trans.push_back(global_id_offset.value()); } + + std::vector* trans_arg = global_id_offset.has_value() ? 
&trans : nullptr; raft::spatial::knn::detail::brute_force_knn_impl(handle, inputs, @@ -143,7 +186,7 @@ void knn(raft::handle_t const& handle, k, rowMajorIndex, rowMajorQuery, - trans, + trans_arg, metric, metric_arg.value_or(2.0f)); } diff --git a/cpp/include/raft/neighbors/epsilon_neighborhood.cuh b/cpp/include/raft/neighbors/epsilon_neighborhood.cuh index b0e9b842ec..114216fc50 100644 --- a/cpp/include/raft/neighbors/epsilon_neighborhood.cuh +++ b/cpp/include/raft/neighbors/epsilon_neighborhood.cuh @@ -60,7 +60,22 @@ void epsUnexpL2SqNeighborhood(bool* adj, } /** - * @brief Computes epsilon neighborhood for the L2-Squared distance metric + * @brief Computes epsilon neighborhood for the L2-Squared distance metric and given ball size. + * The epsilon neighbors is represented by a dense boolean adjacency matrix of size m * n and + * an array of degrees for each vertex, which can be used as a compressed sparse row (CSR) + * indptr array. + * + * @code{.cpp} + * #include + * #include + * #include + * using namespace raft::neighbors; + * raft::handle_t handle; + * ... 
+ * auto adj = raft::make_device_matrix(handle, m * n); + * auto vd = raft::make_device_vector(handle, m+1); + * epsilon_neighborhood::eps_neighbors_l2sq(handle, x, y, adj.view(), vd.view(), eps); + * @endcode * * @tparam value_t IO and math type * @tparam idx_t Index type diff --git a/cpp/include/raft/neighbors/ivf_flat.cuh b/cpp/include/raft/neighbors/ivf_flat.cuh index 23ae6c42bf..87400a9b93 100644 --- a/cpp/include/raft/neighbors/ivf_flat.cuh +++ b/cpp/include/raft/neighbors/ivf_flat.cuh @@ -38,7 +38,7 @@ namespace raft::neighbors::ivf_flat { * * Usage example: * @code{.cpp} - * using namespace raft::spatial::knn; + * using namespace raft::neighbors; * // use default index parameters * ivf_flat::index_params index_params; * // create and fill the index from a [N, D] dataset @@ -61,7 +61,7 @@ namespace raft::neighbors::ivf_flat { * @return the constructed ivf-flat index */ template -inline auto build( +auto build( const handle_t& handle, const index_params& params, const T* dataset, IdxT n_rows, uint32_t dim) -> index { @@ -78,15 +78,15 @@ inline auto build( * * Usage example: * @code{.cpp} - * using namespace raft::spatial::knn; + * using namespace raft::neighbors; * // use default index parameters * ivf_flat::index_params index_params; * // create and fill the index from a [N, D] dataset - * auto index = ivf_flat::build(handle, index_params, dataset, N, D); + * auto index = ivf_flat::build(handle, dataset, index_params); * // use default search parameters * ivf_flat::search_params search_params; * // search K nearest neighbours for each of the N queries - * ivf_flat::search(handle, search_params, index, queries, N, K, out_inds, out_dists); + * ivf_flat::search(handle, index, queries, out_inds, out_dists, search_params, k); * @endcode * * @tparam value_t data element type @@ -101,9 +101,9 @@ inline auto build( * @return the constructed ivf-flat index */ template -auto build_index(const handle_t& handle, - raft::device_matrix_view dataset, - const index_params& 
params) -> index +auto build(const handle_t& handle, + raft::device_matrix_view dataset, + const index_params& params) -> index { return raft::spatial::knn::ivf_flat::detail::build(handle, params, @@ -145,11 +145,11 @@ auto build_index(const handle_t& handle, * @return the constructed extended ivf-flat index */ template -inline auto extend(const handle_t& handle, - const index& orig_index, - const T* new_vectors, - const IdxT* new_indices, - IdxT n_rows) -> index +auto extend(const handle_t& handle, + const index& orig_index, + const T* new_vectors, + const IdxT* new_indices, + IdxT n_rows) -> index { return raft::spatial::knn::ivf_flat::detail::extend( handle, orig_index, new_vectors, new_indices, n_rows); @@ -169,9 +169,9 @@ inline auto extend(const handle_t& handle, * index_params.add_data_on_build = false; // don't populate index on build * index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training * // train the index from a [N, D] dataset - * auto index_empty = ivf_flat::build(handle, index_params, dataset, N, D); + * auto index_empty = ivf_flat::build(handle, dataset, index_params); * // fill the index with the data - * auto index = ivf_flat::extend(handle, index_empty, dataset, nullptr, N); + * auto index = ivf_flat::extend(handle, index_empty, dataset); * @endcode * * @tparam value_t data element type @@ -204,8 +204,20 @@ auto extend(const handle_t& handle, } /** - * @brief Extend the index with the new data. - * * + * @brief Extend the index in-place with the new data.
+ * + * Usage example: + * @code{.cpp} + * using namespace raft::spatial::knn; + * ivf_flat::index_params index_params; + * index_params.add_data_on_build = false; // don't populate index on build + * index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training + * // train the index from a [N, D] dataset + * auto index_empty = ivf_flat::build(handle, index_params, dataset, N, D); + * // fill the index with the data + * ivf_flat::extend(handle, &index_empty, dataset, nullptr, N); + * @endcode + * * @tparam T data element type * @tparam IdxT type of the indices in the source dataset * @@ -218,18 +230,30 @@ auto extend(const handle_t& handle, * @param[in] n_rows the number of samples */ template -inline void extend(const handle_t& handle, - index* index, - const T* new_vectors, - const IdxT* new_indices, - IdxT n_rows) +void extend(const handle_t& handle, + index* index, + const T* new_vectors, + const IdxT* new_indices, + IdxT n_rows) { *index = extend(handle, *index, new_vectors, new_indices, n_rows); } /** - * @brief Extend the index with the new data. - * * + * @brief Extend the index in-place with the new data. + * + * Usage example: + * @code{.cpp} + * using namespace raft::spatial::knn; + * ivf_flat::index_params index_params; + * index_params.add_data_on_build = false; // don't populate index on build + * index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training + * // train the index from a [N, D] dataset + * auto index_empty = ivf_flat::build(handle, dataset, index_params); + * // fill the index with the data + * ivf_flat::extend(handle, index_empty, dataset); + * @endcode + * * @tparam value_t data element type * @tparam idx_t type of the indices in the source dataset * @tparam int_t precision / type of integral arguments @@ -298,15 +322,15 @@ void extend(const handle_t& handle, * enough memory pool here to avoid memory allocations within search).
*/ template -inline void search(const handle_t& handle, - const search_params& params, - const index& index, - const T* queries, - uint32_t n_queries, - uint32_t k, - IdxT* neighbors, - float* distances, - rmm::mr::device_memory_resource* mr = nullptr) +void search(const handle_t& handle, + const search_params& params, + const index& index, + const T* queries, + uint32_t n_queries, + uint32_t k, + IdxT* neighbors, + float* distances, + rmm::mr::device_memory_resource* mr = nullptr) { return raft::spatial::knn::ivf_flat::detail::search( handle, params, index, queries, n_queries, k, neighbors, distances, mr); @@ -323,21 +347,15 @@ inline void search(const handle_t& handle, * eliminate entirely allocations happening within `search`: * @code{.cpp} * ... - * // Create a pooling memory resource with a pre-defined initial size. - * rmm::mr::pool_memory_resource mr( - * rmm::mr::get_current_device_resource(), 1024 * 1024); * // use default search parameters * ivf_flat::search_params search_params; * // Use the same allocator across multiple searches to reduce the number of * // cuda memory allocations - * ivf_flat::search(handle, search_params, index, queries1, N1, K, out_inds1, out_dists1, &mr); - * ivf_flat::search(handle, search_params, index, queries2, N2, K, out_inds2, out_dists2, &mr); - * ivf_flat::search(handle, search_params, index, queries3, N3, K, out_inds3, out_dists3, &mr); + * ivf_flat::search(handle, index, queries1, out_inds1, out_dists1, search_params, K); + * ivf_flat::search(handle, index, queries2, out_inds2, out_dists2, search_params, K); + * ivf_flat::search(handle, index, queries3, out_inds3, out_dists3, search_params, K); * ... * @endcode - * The exact size of the temporary buffer depends on multiple factors and is an implementation - * detail. However, you can safely specify a small initial size for the memory pool, so that only a - * few allocations happen to grow it during the first invocations of the `search`. 
* * @tparam value_t data element type * @tparam idx_t type of the indices diff --git a/cpp/include/raft/neighbors/ivf_pq.cuh b/cpp/include/raft/neighbors/ivf_pq.cuh index 1e32d5d7ba..5d619c5bec 100644 --- a/cpp/include/raft/neighbors/ivf_pq.cuh +++ b/cpp/include/raft/neighbors/ivf_pq.cuh @@ -37,7 +37,7 @@ namespace raft::neighbors::ivf_pq { * * Usage example: * @code{.cpp} - * using namespace raft::spatial::knn; + * using namespace raft::neighbors; * // use default index parameters * ivf_pq::index_params index_params; * // create and fill the index from a [N, D] dataset diff --git a/cpp/include/raft/solver/linear_assignment.cuh b/cpp/include/raft/solver/linear_assignment.cuh index 4c24dcbc29..3e17b557f2 100644 --- a/cpp/include/raft/solver/linear_assignment.cuh +++ b/cpp/include/raft/solver/linear_assignment.cuh @@ -39,8 +39,19 @@ namespace raft::solver { +/** + * @brief CUDA Implementation of O(n^3) alternating tree Hungarian Algorithm + * @note This is a port to RAFT from original authors Ketan Date and Rakesh Nagi + * + * @see Date, Ketan, and Rakesh Nagi. "GPU-accelerated Hungarian algorithms + * for the Linear Assignment Problem." Parallel Computing 57 (2016): 52-72. + * + * @tparam vertex_t + * @tparam weight_t + */ template class LinearAssignmentProblem { + private: vertex_t size_; vertex_t batchsize_; weight_t epsilon_; @@ -66,6 +77,13 @@ class LinearAssignmentProblem { rmm::device_uvector obj_val_dual_v; public: + /** + * @brief Constructor + * @param handle raft handle for managing resources + * @param size size of square matrix + * @param batchsize + * @param epsilon + */ LinearAssignmentProblem(raft::handle_t const& handle, vertex_t size, vertex_t batchsize, @@ -91,7 +109,12 @@ class LinearAssignmentProblem { { } - // Executes Hungarian algorithm on the input cost matrix. + /** + * Executes Hungarian algorithm on the input cost matrix. 
+ * @param d_cost_matrix + * @param d_row_assignment + * @param d_col_assignment + */ void solve(weight_t const* d_cost_matrix, vertex_t* d_row_assignment, vertex_t* d_col_assignment) { initializeDevice(); @@ -118,19 +141,31 @@ class LinearAssignmentProblem { d_costs_ = nullptr; } - // Function for getting optimal row dual vector for subproblem spId. + /** + * Function for getting optimal row dual vector for subproblem spId. + * @param spId + * @return + */ std::pair getRowDualVector(int spId) const { return std::make_pair(row_duals_v.data() + spId * size_, size_); } - // Function for getting optimal col dual vector for subproblem spId. + /** + * Function for getting optimal col dual vector for subproblem spId. + * @param spId + * @return + */ std::pair getColDualVector(int spId) { return std::make_pair(col_duals_v.data() + spId * size_, size_); } - // Function for getting optimal primal objective value for subproblem spId. + /** + * Function for getting optimal primal objective value for subproblem spId. + * @param spId + * @return + */ weight_t getPrimalObjectiveValue(int spId) { weight_t result; @@ -139,7 +174,11 @@ class LinearAssignmentProblem { return result; } - // Function for getting optimal dual objective value for subproblem spId. + /** + * Function for getting optimal dual objective value for subproblem spId. + * @param spId + * @return + */ weight_t getDualObjectiveValue(int spId) { weight_t result; diff --git a/cpp/include/raft/sparse/solver/mst.cuh b/cpp/include/raft/sparse/solver/mst.cuh index 33beeb1915..5f55a567ca 100644 --- a/cpp/include/raft/sparse/solver/mst.cuh +++ b/cpp/include/raft/sparse/solver/mst.cuh @@ -20,6 +20,29 @@ namespace raft::sparse::solver { +/** + * Compute the minimum spanning tree (MST) or minimum spanning forest (MSF) depending on + * the connected components of the given graph.
+ * + * @tparam vertex_t integral type for precision of vertex indexing + * @tparam edge_t integral type for precision of edge indexing + * @tparam weight_t type of weights array + * @tparam alteration_t type to use for random alteration + * + * @param handle + * @param offsets csr indptr array of row offsets (size v+1) + * @param indices csr array of column indices (size e) + * @param weights csr array of weights (size e) + * @param v number of vertices in graph + * @param e number of edges in graph + * @param color array to store resulting colors for MSF + * @param stream cuda stream for ordering operations + * @param symmetrize_output should the resulting output edge list be symmetrized? + * @param initialize_colors should the colors array be initialized inside the MST? + * @param iterations maximum number of iterations to perform + * @return a list of edges containing the mst (or a subset of the edges guaranteed to be in the mst + * when an msf is encountered) + */ template Graph_COO mst(const raft::handle_t& handle, edge_t const* offsets, diff --git a/cpp/include/raft/spatial/knn/ivf_flat.cuh b/cpp/include/raft/spatial/knn/ivf_flat.cuh index d7c3d80fb5..65b6f5ed4b 100644 --- a/cpp/include/raft/spatial/knn/ivf_flat.cuh +++ b/cpp/include/raft/spatial/knn/ivf_flat.cuh @@ -33,7 +33,6 @@ namespace raft::spatial::knn::ivf_flat { using raft::neighbors::ivf_flat::build; -using raft::neighbors::ivf_flat::build_index; using raft::neighbors::ivf_flat::extend; using raft::neighbors::ivf_flat::search; diff --git a/cpp/include/raft/stats/adjusted_rand_index.cuh b/cpp/include/raft/stats/adjusted_rand_index.cuh index e1b6a241c4..93fd07eb0b 100644 --- a/cpp/include/raft/stats/adjusted_rand_index.cuh +++ b/cpp/include/raft/stats/adjusted_rand_index.cuh @@ -31,8 +31,8 @@ namespace raft { namespace stats { /** - * @brief Function to calculate Adjusted RandIndex as described - * here + * @brief Function to calculate Adjusted RandIndex + * @see
https://en.wikipedia.org/wiki/Rand_index * @tparam T data-type for input label arrays * @tparam MathT integral data-type used for computing n-choose-r * @param firstClusterArray: the array of classes @@ -50,8 +50,8 @@ double adjusted_rand_index(const T* firstClusterArray, } /** - * @brief Function to calculate Adjusted RandIndex as described - * here + * @brief Function to calculate Adjusted RandIndex + * @see https://en.wikipedia.org/wiki/Rand_index * @tparam value_t data-type for input label arrays * @tparam math_t integral data-type used for computing n-choose-r * @tparam idx_t Index type of matrix extent. diff --git a/cpp/include/raft/stats/common.hpp b/cpp/include/raft/stats/common.hpp index 8392bd50fe..724ca224c6 100644 --- a/cpp/include/raft/stats/common.hpp +++ b/cpp/include/raft/stats/common.hpp @@ -1,5 +1,5 @@ /* - * Copyright (c) 2019-2022, NVIDIA CORPORATION. + * Copyright (c) 2022, NVIDIA CORPORATION. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -13,59 +13,10 @@ * See the License for the specific language governing permissions and * limitations under the License. 
*/ - #pragma once -#include - -// This file is a shameless amalgamation of independent works done by -// Lars Nyland and Andy Adinets - -///@todo: add cub's histogram as another option - -namespace raft { -namespace stats { - -/** Default mapper which just returns the value of the data itself */ -template -struct IdentityBinner { - DI int operator()(DataT val, IdxT row, IdxT col) { return int(val); } -}; - -/** Types of support histogram implementations */ -enum HistType { - /** shared mem atomics but with bins to be 1b int's */ - HistTypeSmemBits1 = 1, - /** shared mem atomics but with bins to be 2b int's */ - HistTypeSmemBits2 = 2, - /** shared mem atomics but with bins to be 4b int's */ - HistTypeSmemBits4 = 4, - /** shared mem atomics but with bins to ba 1B int's */ - HistTypeSmemBits8 = 8, - /** shared mem atomics but with bins to be 2B int's */ - HistTypeSmemBits16 = 16, - /** use only global atomics */ - HistTypeGmem, - /** uses shared mem atomics to reduce global traffic */ - HistTypeSmem, - /** - * uses shared mem atomics with match_any intrinsic to further reduce shared - * memory traffic. This can only be enabled on Volta and later architectures. - * If one tries to enable this for older arch's, it will fall back to - * `HistTypeSmem`. - * @note This is to be used only when the input dataset leads to a lot of - * repetitions in a given warp, else, this algo can be much slower than - * `HistTypeSmem`! - */ - HistTypeSmemMatchAny, - /** builds a hashmap of active bins in shared mem */ - HistTypeSmemHash, - /** decide at runtime the best algo for the given inputs */ - HistTypeAuto -}; - -/// Supported types of information criteria -enum IC_Type { AIC, AICc, BIC }; +#pragma message(__FILE__ \ + " is deprecated and will be removed in a future release." 
\ + " Please use the raft/stats/stats_types.hpp version instead.") -}; // end namespace stats -}; // end namespace raft +#include diff --git a/cpp/include/raft/stats/detail/histogram.cuh b/cpp/include/raft/stats/detail/histogram.cuh index 777e0b7816..69bd721ded 100644 --- a/cpp/include/raft/stats/detail/histogram.cuh +++ b/cpp/include/raft/stats/detail/histogram.cuh @@ -32,6 +32,12 @@ namespace raft { namespace stats { namespace detail { +/** Default mapper which just returns the value of the data itself */ +template +struct IdentityBinner { + DI int operator()(DataT val, IdxT row, IdxT col) { return int(val); } +}; + static const int ThreadsPerBlock = 256; template diff --git a/cpp/include/raft/stats/histogram.cuh b/cpp/include/raft/stats/histogram.cuh index df1c2772f1..8efb2e8df8 100644 --- a/cpp/include/raft/stats/histogram.cuh +++ b/cpp/include/raft/stats/histogram.cuh @@ -31,6 +31,14 @@ namespace raft { namespace stats { +/** + * Default mapper which just returns the value of the data itself + */ +template +struct IdentityBinner : public detail::IdentityBinner { + IdentityBinner() : detail::IdentityBinner() {} +}; + /** * @brief Perform histogram on the input data. It chooses the right load size * based on the input data vector length. It also supports large-bin cases diff --git a/cpp/include/raft/stats/stats_types.hpp b/cpp/include/raft/stats/stats_types.hpp new file mode 100644 index 0000000000..5db5ef1c57 --- /dev/null +++ b/cpp/include/raft/stats/stats_types.hpp @@ -0,0 +1,62 @@ +/* + * Copyright (c) 2019-2022, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#pragma once + +#include + +namespace raft::stats { + +/** + * @brief Types of supported histogram implementations + */ +enum HistType { + /** shared mem atomics but with bins to be 1b int's */ + HistTypeSmemBits1 = 1, + /** shared mem atomics but with bins to be 2b int's */ + HistTypeSmemBits2 = 2, + /** shared mem atomics but with bins to be 4b int's */ + HistTypeSmemBits4 = 4, + /** shared mem atomics but with bins to be 1B int's */ + HistTypeSmemBits8 = 8, + /** shared mem atomics but with bins to be 2B int's */ + HistTypeSmemBits16 = 16, + /** use only global atomics */ + HistTypeGmem, + /** uses shared mem atomics to reduce global traffic */ + HistTypeSmem, + /** + * uses shared mem atomics with match_any intrinsic to further reduce shared + * memory traffic. This can only be enabled on Volta and later architectures. + * If one tries to enable this for older arch's, it will fall back to + * `HistTypeSmem`. + * @note This is to be used only when the input dataset leads to a lot of + * repetitions in a given warp, else, this algo can be much slower than + * `HistTypeSmem`!
+ */ + HistTypeSmemMatchAny, + /** builds a hashmap of active bins in shared mem */ + HistTypeSmemHash, + /** decide at runtime the best algo for the given inputs */ + HistTypeAuto +}; + +/** + * @brief Supported types of information criteria + */ +enum IC_Type { AIC, AICc, BIC }; + +}; // end namespace raft::stats diff --git a/cpp/test/neighbors/ann_ivf_flat.cu b/cpp/test/neighbors/ann_ivf_flat.cu index 01af7ea0bd..3a5daff4bb 100644 --- a/cpp/test/neighbors/ann_ivf_flat.cu +++ b/cpp/test/neighbors/ann_ivf_flat.cu @@ -154,7 +154,7 @@ class AnnIVFFlatTest : public ::testing::TestWithParam> { auto database_view = raft::make_device_matrix_view( (const DataT*)database.data(), ps.num_db_vecs, ps.dim); - auto index = ivf_flat::build_index(handle_, database_view, index_params); + auto index = ivf_flat::build(handle_, database_view, index_params); rmm::device_uvector vector_indices(ps.num_db_vecs, stream_); thrust::sequence(handle_.get_thrust_policy(), diff --git a/cpp/test/neighbors/knn.cu b/cpp/test/neighbors/knn.cu index 710950e312..eb5ecf663f 100644 --- a/cpp/test/neighbors/knn.cu +++ b/cpp/test/neighbors/knn.cu @@ -94,7 +94,8 @@ class KNNTest : public ::testing::TestWithParam { auto distances = raft::make_device_matrix_view(distances_.data(), rows_, k_); - knn(handle, index, search, indices, distances, k_); + auto metric = raft::distance::DistanceType::L2Unexpanded; + knn(handle, index, search, indices, distances, k_, metric, std::make_optional(0)); build_actual_output<<>>( actual_labels_.data(), rows_, k_, search_labels_.data(), indices_.data()); diff --git a/docs/README.md b/docs/README.md index ced8e63938..a09ccf41eb 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,7 +1,7 @@ # Building Documentation ## Building locally: -#### [Build and install RAFT](../BUILD.md) +#### [Build and install RAFT](source/build.md) #### Generate the docs ```shell script diff --git a/BUILD.md b/docs/source/build.md similarity index 88% rename from BUILD.md rename to 
docs/source/build.md index d38db90249..b75e67d82f 100644 --- a/BUILD.md +++ b/docs/source/build.md @@ -1,38 +1,21 @@ -# RAFT Build and Development Guide - -- [Building and installing RAFT](#build_install) - - [CUDA/GPU Requirements](#cuda_gpu_req) - - [Build Dependencies](#required_depenencies) - - [Header-only C++](#install_header_only_cpp) - - [C++ Shared Libraries](#shared_cpp_libs) - - [Improving Rebuild Times](#ccache) - - [Googletests](#gtests) - - [Googlebench](#gbench) - - [C++ Using Cmake](#cpp_using_cmake) - - [Python](#python) - - [Documentation](#docs) -- [Using RAFT in downstream projects](#use_raft) - - [Cmake Header-only Integration](#cxx_integration) - - [Using Shared Libraries in Cmake](#use_shared_libs) - - [Building RAFT C++ from source](#build_cxx_source) - - [Python/Cython Integration](#py_integration) - -## Building and installing RAFT - -### CUDA/GPU Requirements +# Install Guide + +## Building and installing RAFT + +### CUDA/GPU Requirements - CUDA Toolkit 11.0+ - NVIDIA driver 450.80.02+ - Pascal architecture of better (compute capability >= 6.0) -### Build Dependencies +### Build Dependencies In addition to the libraries included with cudatoolkit 11.0+, there are some other dependencies below for building RAFT from source. Many of the dependencies are optional and depend only on the primitives being used. All of these can be installed with cmake or [rapids-cpm](https://github.com/rapidsai/rapids-cmake#cpm) and many of them can be installed with [conda](https://anaconda.org). #### Required - [RMM](https://github.com/rapidsai/rmm) corresponding to RAFT version. +- [Thrust](https://github.com/NVIDIA/thrust) v1.17 / [CUB](https://github.com/NVIDIA/cub) #### Optional -- [Thrust](https://github.com/NVIDIA/thrust) v1.15 / [CUB](https://github.com/NVIDIA/cub) - On by default but can be disabled. - [cuCollections](https://github.com/NVIDIA/cuCollections) - Used in `raft::sparse::distance` API. 
- [Libcu++](https://github.com/NVIDIA/libcudacxx) v1.7.0 - [FAISS](https://github.com/facebookresearch/faiss) v1.7.0 - Used in `raft::spatial::knn` API and needed to build tests. @@ -46,7 +29,7 @@ C++ RAFT is a header-only library but provides the option of building shared lib The recommended way to build and install RAFT is to use the `build.sh` script in the root of the repository. This script can build both the C++ and Python artifacts and provides options for building and installing the headers, tests, benchmarks, and individual shared libraries. -### Header-only C++ +### Header-only C++ `build.sh` uses [rapids-cmake](https://github.com/rapidsai/rapids-cmake), which will automatically download any dependencies which are not already installed. It's important to note that while all the headers will be installed and available, some parts of the RAFT API depend on libraries like `FAISS`, which will need to be explicitly enabled in `build.sh`. @@ -55,7 +38,7 @@ The following example will download the needed dependencies and install the RAFT ./build.sh libraft --install ``` -### C++ Shared Libraries (optional) +### C++ Shared Libraries (optional) For larger projects which make heavy use of the pairwise distances or nearest neighbors APIs, shared libraries can be built to speed up compile times. These shared libraries can also significantly improve re-compile times both while developing RAFT and developing against the APIs. Build all of the available shared libraries by passing `--compile-libs` flag to `build.sh`: ```bash @@ -69,7 +52,7 @@ Individual shared libraries have their own flags and multiple can be used (thoug Add the `--install` flag to the above example to also install the shared libraries into `$INSTALL_PREFIX/lib`. -### `ccache` and `sccache` +### ccache and sccache `ccache` and `sccache` can be used to better cache parts of the build when rebuilding frequently, such as when working on a new feature. 
You can also use `ccache` or `sccache` with `build.sh`: @@ -77,7 +60,7 @@ Add the `--install` flag to the above example to also install the shared librari ./build.sh libraft --cache-tool=ccache ``` -### Tests +### Tests Compile the tests using the `tests` target in `build.sh`. @@ -104,7 +87,7 @@ It can take sometime to compile all of the tests. You can build individual tests ./build.sh libraft tests --limit-tests=NEIGHBORS_TEST;DISTANCE_TEST;MATRIX_TEST ``` -### Benchmarks +### Benchmarks The benchmarks are broken apart by algorithm category, so you will find several binaries in `cpp/build/` named `*_BENCH`. ```bash @@ -117,7 +100,7 @@ It can take sometime to compile all of the benchmarks. You can build individual ./build.sh libraft bench --limit-bench=NEIGHBORS_BENCH;DISTANCE_BENCH;LINALG_BENCH ``` -### C++ Using Cmake +### C++ Using Cmake Use `CMAKE_INSTALL_PREFIX` to install RAFT into a specific location. The snippet below will install it into the current conda environment: ```bash @@ -139,7 +122,6 @@ RAFT's cmake has the following configurable flags available:. | RAFT_COMPILE_NN_LIBRARY | ON, OFF | OFF | Compiles the `libraft-nn` shared library | | RAFT_COMPILE_DIST_LIBRARY | ON, OFF | OFF | Compiles the `libraft-distance` shared library | | RAFT_ENABLE_NN_DEPENDENCIES | ON, OFF | OFF | Searches for dependencies of nearest neighbors API, such as FAISS, and compiles them if not found. Needed for `raft::spatial::knn` | -| RAFT_ENABLE_thrust_DEPENDENCY | ON, OFF | ON | Enables the Thrust dependency. This can be disabled when using many simple utilities or to override with a different Thrust version. | | RAFT_USE_FAISS_STATIC | ON, OFF | OFF | Statically link FAISS into `libraft-nn` | | RAFT_STATIC_LINK_LIBRARIES | ON, OFF | ON | Build static link libraries instead of shared libraries | | DETECT_CONDA_ENV | ON, OFF | ON | Enable detection of conda environment for dependencies | @@ -150,7 +132,7 @@ RAFT's cmake has the following configurable flags available:. 
Currently, shared libraries are provided for the `libraft-nn` and `libraft-distance` components. The `libraft-nn` component depends upon [FAISS](https://github.com/facebookresearch/faiss) and the `RAFT_ENABLE_NN_DEPENDENCIES` option will build it from source if it is not already installed. -### Python +### Python Conda environment scripts are provided for installing the necessary dependencies for building and using the Python APIs. It is preferred to use `mamba`, as it provides significant speedup over `conda`. In addition you will have to manually install `nvcc` as it will not be installed as part of the conda environment. The following example will install create and install dependencies for a CUDA 11.5 conda environment: @@ -189,9 +171,9 @@ cd python/pylibraft py.test -s -v ``` -### Documentation +### Documentation -The documentation requires that the C++ headers and python packages have been built and installed. +The documentation requires that the C++ headers and python packages have been built and installed. The following will build the docs along with the C++ and Python packages: @@ -201,11 +183,11 @@ The following will build the docs along with the C++ and Python packages: -## Using RAFT in downstream projects +## Using RAFT in downstream projects There are two different strategies for including RAFT in downstream projects, depending on whether or not the required dependencies are already installed and available on the `lib` and `include` paths. 
-### C++ header-only integration using cmake +### C++ header-only integration using cmake When the needed [build dependencies](#required_depenencies) are already satisfied, RAFT can be trivially integrated into downstream projects by cloning the repository and adding `cpp/include` from RAFT to the include path: ```cmake @@ -222,7 +204,7 @@ set(RAFT_INCLUDE_DIR ${RAFT_GIT_DIR}/raft/cpp/include CACHE STRING "RAFT include If RAFT has already been installed, such as by using the `build.sh` script, use `find_package(raft)` and the `raft::raft` target if using RAFT to interact only with the public APIs of consuming projects. -### Using pre-compiled shared libraries +### Using pre-compiled shared libraries Use `find_package(raft COMPONENTS nn distance)` to enable the shared libraries and transitively pass dependencies through separate targets for each component. In this example, the `raft::distance` and `raft::nn` targets will be available for configuring linking paths in addition to `raft::raft`. These targets will also pass through any transitive dependencies (such as FAISS for the `nn` package). @@ -234,7 +216,7 @@ The following example tells the compiler to ignore the pre-compiled templates fo #include ``` -### Building RAFT C++ from source in cmake +### Building RAFT C++ from source in cmake RAFT uses the [RAPIDS-CMake](https://github.com/rapidsai/rapids-cmake) library so it can be more easily included into downstream projects. RAPIDS cmake provides a convenience layer around the [CMake Package Manager (CPM)](https://github.com/cpm-cmake/CPM.cmake). @@ -324,6 +306,6 @@ find_and_configure_raft(VERSION ${RAFT_VERSION}.00 If using the nearest neighbors APIs without the shared libraries, set `ENABLE_NN_DEPENDENCIES=ON` and keep `USE_NN_LIBRARY=OFF` -### Python/Cython Integration +### Python/Cython Integration Once installed, RAFT's Python library can be added to downstream conda recipes, imported and used directly. 
diff --git a/docs/source/cpp_api.rst b/docs/source/cpp_api.rst index d10d9773a5..cf3829422d 100644 --- a/docs/source/cpp_api.rst +++ b/docs/source/cpp_api.rst @@ -1,6 +1,7 @@ -~~~~~~~~~~~~~~~~~~~~~~ -RAFT C++ API Reference -~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~ +C++ API Reference +~~~~~~~~~~~~~~~~~ + .. _api: @@ -13,6 +14,7 @@ RAFT C++ API Reference cpp_api/distance.rst cpp_api/linalg.rst cpp_api/matrix.rst + cpp_api/mdspan.rst cpp_api/neighbors.rst cpp_api/solver.rst cpp_api/random.rst diff --git a/docs/source/cpp_api/cluster.rst b/docs/source/cpp_api/cluster.rst index 41816482cc..90c430ace9 100644 --- a/docs/source/cpp_api/cluster.rst +++ b/docs/source/cpp_api/cluster.rst @@ -1,7 +1,8 @@ Cluster ======= -This page provides C++ class references for the publicly-exposed elements of the cluster package. +This page provides C++ class references for the publicly-exposed elements of the `raft/cluster` headers. RAFT provides +fundamental clustering algorithms which are, themselves, considered reusable building blocks for other algorithms. K-Means ------- @@ -24,4 +25,4 @@ Spectral Clustering .. doxygennamespace:: raft::spectral :project: RAFT - :members: \ No newline at end of file + :members: diff --git a/docs/source/cpp_api/core.rst b/docs/source/cpp_api/core.rst index d4891bf0b3..9e4ef412f7 100644 --- a/docs/source/cpp_api/core.rst +++ b/docs/source/cpp_api/core.rst @@ -1,8 +1,10 @@ Core ==== -This page provides C++ class references for the publicly-exposed elements of the core package. - +This page provides C++ class references for the publicly-exposed elements of the `raft/core` package. The `raft/core` headers +require minimal dependencies, can be compiled without `nvcc`, and thus are safe to expose on your own public APIs. Aside from +the headers in the `raft/core` include directory, any headers in the codebase with the suffix `_types.hpp` are also safe to +expose in public APIs. 
handle_t ######## @@ -12,7 +14,7 @@ handle_t :members: -interruptible +Interruptible ############# .. doxygenclass:: raft::interruptible @@ -27,71 +29,10 @@ NVTX :members: -mdarray -####### - -.. doxygenclass:: raft::mdarray - :project: RAFT - :members: - -.. doxygenclass:: raft::make_device_matrix - :project: RAFT - -.. doxygenclass:: raft::make_device_vector - :project: RAFT - -.. doxygenclass:: raft::make_device_scalar - :project: RAFT - -.. doxygenclass:: raft::make_host_matrix - :project: RAFT - -.. doxygenclass:: raft::make_host_vector - :project: RAFT - -.. doxygenclass:: raft::make_device_scalar - :project: RAFT - - -mdspan -####### - -.. doxygenfunction:: raft::make_device_mdspan - :project: RAFT - -.. doxygenfunction:: raft::make_device_matrix_view - :project: RAFT - -.. doxygenfunction:: raft::make_device_vector_view - :project: RAFT - -.. doxygenfunction:: raft::make_device_scalar_view - :project: RAFT - -.. doxygenfunction:: raft::make_host_matrix_view - :project: RAFT - -.. doxygenfunction:: raft::make_host_vector_view - :project: RAFT - -.. doxygenfunction:: raft::make_device_scalar_view - :project: RAFT - -span -#### - -.. doxygenclass:: raft::device_span - :project: RAFT - :members: - -.. doxygenclass:: raft::host_span - :project: RAFT - :members: - Key-Value Pair ############## -.. doxygenclass:: raft::KeyValuePair +.. doxygenstruct:: raft::KeyValuePair :project: RAFT :members: diff --git a/docs/source/cpp_api/distance.rst b/docs/source/cpp_api/distance.rst index c2bce860d5..2596361f6a 100644 --- a/docs/source/cpp_api/distance.rst +++ b/docs/source/cpp_api/distance.rst @@ -1,7 +1,8 @@ Distance ======== -This page provides C++ class references for the publicly-exposed elements of the distance package. +This page provides C++ class references for the publicly-exposed elements of the `raft/distance` package. RAFT's +distances have been highly optimized and support a wide assortment of different distance measures. 
Distance ######## diff --git a/docs/source/cpp_api/linalg.rst b/docs/source/cpp_api/linalg.rst index f9986fd2ce..5664e5b3dc 100644 --- a/docs/source/cpp_api/linalg.rst +++ b/docs/source/cpp_api/linalg.rst @@ -1,7 +1,10 @@ Linear Algebra ============== -This page provides C++ class references for the publicly-exposed elements of the (dense) linear algebra package. +This page provides C++ class references for the publicly-exposed elements of the `raft/linalg` (dense) linear algebra headers. +In addition to providing highly optimized arithmetic and matrix/vector operations, RAFT provides a consistent user experience +by providing common BLAS routines, standard linear system solvers, factorization and eigenvalue solvers. Some of these routines +hide the complexities of lower-level C-based libraries provided in the CUDA toolkit. .. doxygennamespace:: raft::linalg :project: RAFT diff --git a/docs/source/cpp_api/matrix.rst b/docs/source/cpp_api/matrix.rst index 65534aa6ee..945658eb7b 100644 --- a/docs/source/cpp_api/matrix.rst +++ b/docs/source/cpp_api/matrix.rst @@ -1,7 +1,8 @@ Matrix ====== -This page provides C++ class references for the publicly-exposed elements of the matrix package. +This page provides C++ class references for the publicly-exposed elements of the `raft/matrix` headers. The `raft/matrix` +headers cover many operations on matrices that are otherwise not covered by `raft/linalg`. .. doxygennamespace:: raft::matrix :project: RAFT diff --git a/docs/source/cpp_api/mdspan.rst b/docs/source/cpp_api/mdspan.rst new file mode 100644 index 0000000000..a283da967b --- /dev/null +++ b/docs/source/cpp_api/mdspan.rst @@ -0,0 +1,344 @@ +Multi-dimensional Span / Array +============================== + +This page provides C++ class references for RAFT's 1D span and multi-dimensional owning (mdarray) and non-owning (mdspan) APIs. These headers can be found in the `raft/core` directory. + +Representation +############## + +..
doxygenstruct:: raft::host_device_accessor + :project: RAFT + :members: + +.. doxygentypedef:: raft::host_accessor + :project: RAFT + +.. doxygentypedef:: raft::device_accessor + :project: RAFT + +.. doxygentypedef:: raft::managed_accessor + :project: RAFT + +.. doxygentypedef:: raft::row_major + :project: RAFT + +.. doxygentypedef:: raft::col_major + :project: RAFT + +.. doxygentypedef:: raft::matrix_extent + :project: RAFT + +.. doxygentypedef:: raft::vector_extent + :project: RAFT + +.. doxygentypedef:: raft::scalar_extent + :project: RAFT + +.. doxygentypedef:: raft::extent_3d + :project: RAFT + +.. doxygentypedef:: raft::extent_4d + :project: RAFT + +.. doxygentypedef:: raft::extent_5d + :project: RAFT + +.. doxygentypedef:: raft::dynamic_extent + :project: RAFT + +.. doxygentypedef:: raft::extents + :project: RAFT + +.. doxygenfunction:: raft::flatten + :project: RAFT + + +.. doxygenfunction:: raft::reshape + :project: RAFT + + +mdarray +####### + +.. doxygenclass:: raft::mdarray + :project: RAFT + :members: + +.. doxygenclass:: raft::array_interface + :project: RAFT + :members: + +.. doxygenstruct:: raft::is_array_interface + :project: RAFT + :members: + +.. doxygentypedef:: raft::is_array_interface_t + :project: RAFT + +Device Vocabulary +----------------- + +.. doxygentypedef:: raft::device_mdarray + :project: RAFT + + +.. doxygentypedef:: raft::device_matrix + :project: RAFT + +.. doxygentypedef:: raft::device_vector + :project: RAFT + +.. doxygentypedef:: raft::device_scalar + :project: RAFT + + +Device Factories +---------------- + +.. doxygenfunction:: raft::make_device_matrix + :project: RAFT + +.. doxygenfunction:: raft::make_device_vector + :project: RAFT + +.. doxygenfunction:: raft::make_device_scalar + :project: RAFT + + +Host Vocabulary +--------------- + +.. doxygentypedef:: raft::host_matrix + :project: RAFT + +.. doxygentypedef:: raft::host_vector + :project: RAFT + +..
doxygentypedef:: raft::host_scalar + :project: RAFT + + +Host Factories +-------------- + +.. doxygenfunction:: raft::make_host_matrix + :project: RAFT + +.. doxygenfunction:: raft::make_host_vector + :project: RAFT + +.. doxygenfunction:: raft::make_host_scalar + :project: RAFT + +mdspan +###### + +.. doxygentypedef:: raft::mdspan + :project: RAFT + +.. doxygenstruct:: raft::is_mdspan + :project: RAFT + :members: + +.. doxygentypedef:: raft::is_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_input_mdspan + :project: RAFT + :members: + +.. doxygentypedef:: raft::is_input_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_output_mdspan + :project: RAFT + :members: + +.. doxygentypedef:: raft::is_output_mdspan_t + :project: RAFT + +.. doxygentypedef:: raft::enable_if_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_input_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_output_mdspan + :project: RAFT + +.. doxygenfunction:: raft::make_mdspan + :project: RAFT + +.. doxygenfunction:: raft::make_extents + :project: RAFT + +.. doxygenfunction:: raft::unravel_index + :project: RAFT + + +Device Vocabulary +----------------- + +.. doxygentypedef:: raft::device_mdspan + :project: RAFT + +.. doxygenstruct:: raft::is_device_mdspan + :project: RAFT + +.. doxygenstruct:: raft::is_device_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_input_device_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_output_device_mdspan_t + :project: RAFT + +.. doxygentypedef:: raft::enable_if_device_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_input_device_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_output_device_mdspan + :project: RAFT + +.. doxygentypedef:: raft::device_matrix_view + :project: RAFT + +.. doxygentypedef:: raft::device_vector_view + :project: RAFT + +.. doxygentypedef:: raft::device_scalar_view + :project: RAFT + + +Device Factories +---------------- + +..
doxygenfunction:: raft::make_device_mdspan + :project: RAFT + +.. doxygenfunction:: raft::make_device_matrix_view + :project: RAFT + +.. doxygenfunction:: raft::make_device_vector_view + :project: RAFT + +.. doxygenfunction:: raft::make_device_scalar_view + :project: RAFT + + +Managed Vocabulary +------------------ + +.. doxygentypedef:: raft::managed_mdspan + :project: RAFT + +.. doxygenstruct:: raft::is_managed_mdspan + :project: RAFT + +.. doxygenstruct:: raft::is_managed_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_input_managed_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_output_managed_mdspan_t + :project: RAFT + +.. doxygentypedef:: raft::enable_if_managed_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_input_managed_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_output_managed_mdspan + :project: RAFT + +.. doxygentypedef:: raft::managed_matrix_view + :project: RAFT + +.. doxygentypedef:: raft::managed_vector_view + :project: RAFT + +.. doxygentypedef:: raft::managed_scalar_view + :project: RAFT + + +Managed Factories +----------------- + +.. doxygenfunction:: raft::make_managed_mdspan + :project: RAFT + +.. doxygenfunction:: raft::make_managed_matrix_view + :project: RAFT + +.. doxygenfunction:: raft::make_managed_vector_view + :project: RAFT + +.. doxygenfunction:: raft::make_managed_scalar_view + :project: RAFT + + +Host Vocabulary +--------------- + +.. doxygentypedef:: raft::host_mdspan + :project: RAFT + +.. doxygenstruct:: raft::is_host_mdspan + :project: RAFT + +.. doxygenstruct:: raft::is_host_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_input_host_mdspan_t + :project: RAFT + +.. doxygenstruct:: raft::is_output_host_mdspan_t + :project: RAFT + +.. doxygentypedef:: raft::enable_if_host_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_input_host_mdspan + :project: RAFT + +.. doxygentypedef:: raft::enable_if_output_host_mdspan + :project: RAFT + +..
doxygentypedef:: raft::host_matrix_view + :project: RAFT + +.. doxygentypedef:: raft::host_vector_view + :project: RAFT + +.. doxygentypedef:: raft::host_scalar_view + :project: RAFT + +Host Factories +-------------- + +.. doxygenfunction:: raft::make_host_matrix_view + :project: RAFT + +.. doxygenfunction:: raft::make_host_vector_view + :project: RAFT + +.. doxygenfunction:: raft::make_host_scalar_view + :project: RAFT + +span +#### + +.. doxygentypedef:: raft::device_span + :project: RAFT + +.. doxygentypedef:: raft::host_span + :project: RAFT + +.. doxygenclass:: raft::span + :project: RAFT + :members: diff --git a/docs/source/cpp_api/solver.rst b/docs/source/cpp_api/solver.rst index a8b93ca046..f7ca244dc8 100644 --- a/docs/source/cpp_api/solver.rst +++ b/docs/source/cpp_api/solver.rst @@ -1,7 +1,7 @@ -Optimization -============ +Solvers +======= -This page provides C++ class references for the publicly-exposed elements of the optimization package. +This page provides C++ class references for the publicly-exposed elements of the iterative and combinatorial solvers package. Linear Assignment Problem diff --git a/docs/source/cpp_api/sparse.rst b/docs/source/cpp_api/sparse.rst index c0ea61c6f7..a7c32cc65d 100644 --- a/docs/source/cpp_api/sparse.rst +++ b/docs/source/cpp_api/sparse.rst @@ -4,7 +4,6 @@ Sparse This page provides C++ class references for the publicly-exposed elements of the sparse package. - Conversion ########## @@ -26,20 +25,16 @@ Linear Algebra :project: RAFT :members: -Misc Operations -############### +Matrix Operations +################# .. doxygennamespace:: raft::sparse::op :project: RAFT :members: -Selection -######### - -.. doxygennamespace:: raft::sparse::selection - :project: RAFT - :members: +Nearest Neighbors +################# -.. doxygennamespace:: raft::linkage +..
doxygennamespace:: raft::sparse::neighbors :project: RAFT :members: diff --git a/docs/source/index.rst b/docs/source/index.rst index 0d7ab295f4..c46f08aac6 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -1,15 +1,49 @@ Welcome to RAFT's documentation! ================================= -RAFT contains fundamental widely-used algorithms and primitives for data science and machine learning. +RAFT contains fundamental widely-used algorithms and primitives for scientific computing, data science and machine learning. The algorithms are CUDA-accelerated and form building-blocks for rapidly composing analytics. + +By taking a primitives-based approach to algorithm development, RAFT + +- accelerates algorithm construction time +- reduces the maintenance burden by maximizing reuse across projects, and +- centralizes core reusable computations, allowing future optimizations to benefit all algorithms that use them. + + +While not exhaustive, the following general categories help summarize the accelerated building blocks that RAFT contains: + +.. list-table:: + :widths: 25 50 + :header-rows: 1 + + * - Category + - Examples + * - Data Formats + - sparse & dense, conversions, data generation + * - Dense Operations + - linear algebra, matrix and vector operations, slicing, norms, factorization, least squares, svd & eigenvalue problems + * - Sparse Operations + - linear algebra, arithmetic, eigenvalue problems, slicing, symmetrization, components & labeling + * - Spatial + - pairwise distances, nearest neighbors, neighborhood graph construction + * - Basic Clustering + - spectral clustering, hierarchical clustering, k-means + * - Solvers + - combinatorial optimization, iterative solvers + * - Statistics + - sampling, moments and summary statistics, metrics + * - Tools & Utilities + - common utilities for developing CUDA applications, multi-node multi-gpu infrastructure .. 
toctree:: :maxdepth: 2 :caption: Contents: + quick_start.md + build.md cpp_api.rst - raft_dask_api.rst pylibraft_api.rst + raft_dask_api.rst Indices and tables diff --git a/docs/source/quick_start.md b/docs/source/quick_start.md new file mode 100644 index 0000000000..e73f9b8a7a --- /dev/null +++ b/docs/source/quick_start.md @@ -0,0 +1,128 @@ +# Quick Start + +This guide is meant to provide a quick-start tutorial for interacting with RAFT's C++ APIs. + +## RAPIDS Memory Manager (RMM) + +RAFT relies heavily on the [RMM](https://github.com/rapidsai/rmm) library which eases the burden of configuring different allocation strategies globally across the libraries that use it. + +## Multi-dimensional Spans and Arrays + +The APIs in RAFT currently accept raw pointers to device memory and we are in the process of simplifying the APIs with the [mdspan](https://arxiv.org/abs/2010.06474) multi-dimensional array view for representing data in higher dimensions similar to the `ndarray` in the Numpy Python library. RAFT also contains the corresponding owning `mdarray` structure, which simplifies the allocation and management of multi-dimensional data in both host and device (GPU) memory. + +The `mdarray` forms a convenience layer over RMM and can be constructed in RAFT using a number of different helper functions: + +```c++ +#include + +int n_rows = 10; +int n_cols = 10; + +auto scalar = raft::make_device_scalar(handle, 1.0); +auto vector = raft::make_device_vector(handle, n_cols); +auto matrix = raft::make_device_matrix(handle, n_rows, n_cols); +``` + +The `mdspan` is a lightweight non-owning view that can wrap around any pointer, maintaining shape, layout, and indexing information for accessing elements. 
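The quick start compares `mdspan` to NumPy's `ndarray`, so the owning versus non-owning distinction can be sketched on the CPU with NumPy alone. This is only an analogy under stated assumptions: it does not call RAFT, and the names `buffer`, `row_major` and `col_major` are illustrative. It shows two views sharing one allocation while carrying different shape and layout information.

```python
import numpy as np

# One flat, contiguous allocation. It stands in for the memory an owning
# container (such as an mdarray) would manage.
buffer = np.arange(12, dtype=np.float32)

# Two non-owning views over the same memory, each with its own layout.
row_major = buffer.reshape((3, 4), order="C")
col_major = buffer.reshape((3, 4), order="F")

# Neither view copied the data.
assert np.shares_memory(buffer, row_major)
assert np.shares_memory(buffer, col_major)

# The layout decides which flat offset an index (i, j) maps to.
i, j = 1, 2
assert row_major[i, j] == buffer[i * 4 + j]  # row-major: i * n_cols + j
assert col_major[i, j] == buffer[i + j * 3]  # column-major: i + j * n_rows

# Writing through one view is visible through the buffer and the other view.
row_major[0, 0] = 42.0
```

The same idea carries over to device memory: a view only records where the data lives and how to index it, so creating one is cheap and never allocates.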
+ + +We can construct `mdspan` instances directly from the above `mdarray` instances: + +```c++ +// Scalar mdspan on device +auto scalar_view = scalar.view(); + +// Vector mdspan on device +auto vector_view = vector.view(); + +// Matrix mdspan on device +auto matrix_view = matrix.view(); +``` +Since the `mdspan` is just a lightweight wrapper, we can also construct it from the underlying data handles in the `mdarray` instances above. We use the extent to get information about the `mdarray` or `mdspan`'s shape. + +```c++ +#include + +auto scalar_view = raft::make_device_scalar_view(scalar.data_handle()); +auto vector_view = raft::make_device_vector_view(vector.data_handle(), vector.extent(0)); +auto matrix_view = raft::make_device_matrix_view(matrix.data_handle(), matrix.extent(0), matrix.extent(1)); +``` + +Of course, RAFT's `mdspan`/`mdarray` APIs aren't just limited to the `device`. You can also create `host` variants: + +```c++ +#include +#include + +int n_rows = 10; +int n_cols = 10; + +auto scalar = raft::make_host_scalar(handle, 1.0); +auto vector = raft::make_host_vector(handle, n_cols); +auto matrix = raft::make_host_matrix(handle, n_rows, n_cols); + +auto scalar_view = raft::make_host_scalar_view(scalar.data_handle()); +auto vector_view = raft::make_host_vector_view(vector.data_handle(), vector.extent(0)); +auto matrix_view = raft::make_host_matrix_view(matrix.data_handle(), matrix.extent(0), matrix.extent(1)); +``` + +And `managed` variants: + +```c++ +#include + +int n_rows = 10; +int n_cols = 10; + +auto matrix = raft::make_managed_mdspan(managed_ptr, raft::make_matrix_extents(n_rows, n_cols)); +``` + + +## C++ Example + +Most of the primitives in RAFT accept a `raft::handle_t` object for the management of resources which are expensive to create, such as CUDA streams, stream pools, and handles to other CUDA libraries like `cublas` and `cusolver`.
+ +The example below demonstrates creating a RAFT handle and using it with `device_matrix` and `device_vector` to allocate memory, generating random clusters, and computing +pairwise Euclidean distances: + +```c++ +#include +#include +#include +#include + +raft::handle_t handle; + +int n_samples = 5000; +int n_features = 50; + +auto input = raft::make_device_matrix(handle, n_samples, n_features); +auto labels = raft::make_device_vector(handle, n_samples); +auto output = raft::make_device_matrix(handle, n_samples, n_samples); + +raft::random::make_blobs(handle, input.view(), labels.view()); + +auto metric = raft::distance::DistanceType::L2SqrtExpanded; +raft::distance::pairwise_distance(handle, input.view(), input.view(), output.view(), metric); +``` + +## Python Example + +The `pylibraft` package contains a Python API for RAFT algorithms and primitives. `pylibraft` integrates nicely into other libraries by being very lightweight with minimal dependencies and accepting any object that supports the `__cuda_array_interface__`, such as [CuPy's ndarray](https://docs.cupy.dev/en/stable/user_guide/interoperability.html#rmm). The package is currently limited to pairwise distances and RMAT graph generation, but we will continue adding more in future releases. + +The example below demonstrates computing the pairwise Euclidean distances between CuPy arrays. `pylibraft` is a low-level API that prioritizes efficiency and simplicity over being pythonic, which is shown here by pre-allocating the output memory before invoking the `pairwise_distance` function. Note that CuPy is not a required dependency for `pylibraft`. 
+ +```python +import cupy as cp + +from pylibraft.distance import pairwise_distance + +n_samples = 5000 +n_features = 50 + +in1 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32) +in2 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32) +output = cp.empty((n_samples, n_samples), dtype=cp.float32) + +pairwise_distance(in1, in2, output, metric="euclidean") +```
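When trying the example above, it can help to have a small CPU reference to compare against. The sketch below is an assumption-laden illustration, not RAFT's implementation: `pairwise_euclidean_reference` is a hypothetical helper written with NumPy, and it uses one common formulation of the Euclidean distance (expanding the squared norm) on deliberately tiny inputs.

```python
import numpy as np


def pairwise_euclidean_reference(x, y):
    """CPU reference (hypothetical helper): out[i, j] = ||x[i] - y[j]||_2."""
    # Expanded form: ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * <x, y>
    x_sq = np.sum(x * x, axis=1)[:, None]
    y_sq = np.sum(y * y, axis=1)[None, :]
    sq = x_sq + y_sq - 2.0 * (x @ y.T)
    # Rounding can push tiny values slightly below zero; clamp before sqrt.
    np.maximum(sq, 0.0, out=sq)
    return np.sqrt(sq)


rng = np.random.default_rng(0)
a = rng.random((5, 3), dtype=np.float32)
b = rng.random((4, 3), dtype=np.float32)

# expected[i, j] is the distance between row i of `a` and row j of `b`.
expected = pairwise_euclidean_reference(a, b)
```

To check a GPU result against it, one option is to copy the CuPy output back with `cp.asnumpy(output)` and compare with `np.allclose` using a tolerance suited to `float32`.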