From 22d1fc61a241baea685304b5e7025f65220e3489 Mon Sep 17 00:00:00 2001
From: "Corey J. Nolet"
Date: Sat, 6 Nov 2021 10:05:18 -0400
Subject: [PATCH] README updates (#351)

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Brad Rees (https://github.com/BradReesWork)

URL: https://github.com/rapidsai/raft/pull/351
---
 README.md | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 83 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 546a1df1f0..8091c345e1 100755
--- a/README.md
+++ b/README.md
@@ -1,15 +1,94 @@
-# RAFT: RAPIDS Analytics Frameworks Toolset
+# RAFT: RAPIDS Analytics Framework Toolkit

-RAFT is a repository containining shared utilities, mathematical operations and common functions for the analytics components of RAPIDS. Both the C++ and Python components can be included in consuming libraries.
+RAFT is a library containing building-blocks for rapid composition of RAPIDS Analytics. These building-blocks include shared representations, mathematical computational primitives, and utilities that accelerate building analytics and data science algorithms in the RAPIDS ecosystem. Both the C++ and Python components can be included in consuming libraries, providing building-blocks for both dense and sparse matrix formats in the following general categories:
+#####
+| Category | Description / Examples |
+| --- | --- |
+| **Data Formats** | tensor representations and conversions for both sparse and dense formats |
+| **Data Generation** | graph, spatial, and machine learning dataset generation |
+| **Dense Operations** | linear algebra, statistics |
+| **Spatial** | pairwise distances, nearest neighbors, neighborhood / proximity graph construction |
+| **Sparse/Graph Operations** | linear algebra, statistics, slicing, msf, spectral embedding/clustering, slhc, vertex degree |
+| **Solvers** | eigenvalue decomposition, least squares, lanczos |
+| **Tools** | multi-node multi-gpu communicator, utilities |
+
+By taking a primitives-based approach to algorithm development, RAFT accelerates algorithm construction time and reduces
+the maintenance burden by maximizing reuse across projects. RAFT relies on the [RAPIDS memory manager (RMM)](https://github.com/rapidsai/rmm) which,
+like other projects in the RAPIDS ecosystem, eases the burden of configuring different allocation strategies globally
+across the libraries that use it. RMM also provides RAII wrappers around device arrays that handle the allocation and cleanup.
+
+## Getting started

 Refer to the [Build and Development Guide](BUILD.md) for details on RAFT's design, building, testing and development guidelines.

+Most of the primitives in RAFT accept a `raft::handle_t` object for the management of resources which are expensive to create, such as CUDA streams, stream pools, and handles to other CUDA libraries like `cublas` and `cusolver`.
+
+
+### C++ Example
+
+The example below demonstrates creating a RAFT handle and using it with RMM's `device_uvector` to allocate memory on device and compute
+pairwise Euclidean distances:
+```c++
+#include <raft/handle.hpp>
+#include <raft/distance/distance.hpp>
+
+#include <rmm/device_uvector.hpp>
+raft::handle_t handle;
+
+int n_samples = ...;
+int n_features = ...;
+
+rmm::device_uvector<float> input(n_samples * n_features, handle.get_stream());
+rmm::device_uvector<float> output(n_samples * n_samples, handle.get_stream());
+
+// ... Populate feature matrix ...
+
+auto metric = raft::distance::DistanceType::L2SqrtExpanded;
+rmm::device_uvector<char> workspace(0, handle.get_stream());
+raft::distance::pairwise_distance(handle, input.data(), input.data(),
+                                  output.data(),
+                                  n_samples, n_samples, n_features,
+                                  workspace.data(), metric);
+```
+
+
+
+
 ## Folder Structure and Contents

-The folder structure mirrors the main RAPIDS repos (cuDF, cuML, cuGraph...), with the following folders:
+The folder structure mirrors other RAPIDS repos (cuDF, cuML, cuGraph...), with the following folders:

-- `cpp`: Source code for all C++ code. The code is header only, therefore it is in the `include` folder (with no `src`).
+- `cpp`: Source code for all C++ code. The code is currently header-only, therefore it is in the `include` folder (with no `src`).
 - `python`: Source code for all Python source code.
 - `ci`: Scripts for running CI in PRs
+[comment]: <> (TODO: This needs to be updated after the public API is established)
+[comment]: <> (The library layout contains the following structure:)
+
+[comment]: <> (```bash)
+
+[comment]: <> (cpp/include/raft)
+
+[comment]: <> ( |------------ comms [communication abstraction layer])
+
+[comment]: <> ( |------------ distance [dense pairwise distances])
+
+[comment]: <> ( |------------ linalg [dense linear algebra])
+
+[comment]: <> ( |------------ matrix [dense matrix format])
+
+[comment]: <> ( |------------ random [random matrix generation])
+
+[comment]: <> ( |------------ sparse [sparse matrix and graph algorithms])
+
+[comment]: <> ( |------------ spatial [spatial algorithms])
+
+[comment]: <> ( |------------ spectral [spectral clustering])
+
+[comment]: <> ( |------------ stats [statistics primitives])
+
+[comment]: <> ( |------------ handle.hpp [raft handle])
+
+[comment]: <> (```)
+
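
Reviewer note on the C++ example added by this patch: the call to `raft::distance::pairwise_distance` with `L2SqrtExpanded` fills an `n_samples x n_samples` matrix of Euclidean distances. As a rough aid for checking its output, the host-side sketch below computes the same quantity on the CPU. It is not RAFT code: the function name `pairwise_l2_reference` is hypothetical, the row-major layout is an assumption, and the "expanded" formula (`||a||^2 + ||b||^2 - 2 a.b`) is only what the metric name suggests, not a copy of RAFT's kernel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of RAFT): pairwise Euclidean distances
// between the rows of x (m x k) and the rows of y (n x k).
// Assumption: both matrices are stored row-major in flat vectors.
std::vector<float> pairwise_l2_reference(const std::vector<float>& x,
                                         const std::vector<float>& y,
                                         std::size_t m, std::size_t n,
                                         std::size_t k) {
  // Precompute squared row norms, as the expanded form requires.
  std::vector<float> x_norms(m, 0.0f), y_norms(n, 0.0f);
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t c = 0; c < k; ++c) x_norms[i] += x[i * k + c] * x[i * k + c];
  for (std::size_t j = 0; j < n; ++j)
    for (std::size_t c = 0; c < k; ++c) y_norms[j] += y[j * k + c] * y[j * k + c];

  std::vector<float> out(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float dot = 0.0f;
      for (std::size_t c = 0; c < k; ++c) dot += x[i * k + c] * y[j * k + c];
      // ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * dot(a, b); clamp at zero so
      // rounding error cannot produce a negative value under the square root.
      float d2 = x_norms[i] + y_norms[j] - 2.0f * dot;
      out[i * n + j] = std::sqrt(std::max(d2, 0.0f));
    }
  }
  return out;
}
```

To compare against the device result, one would copy `output` back to the host and check it element-wise against `pairwise_l2_reference(x, x, n_samples, n_samples, n_features)` within a small floating-point tolerance.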