Authors:
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)
- Brad Rees (https://github.com/BradReesWork)

URL: #351
Showing 1 changed file with 83 additions and 4 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,15 +1,94 @@ | ||
# <div align="left"><img src="https://rapids.ai/assets/images/rapids_logo.png" width="90px"/> RAFT: RAPIDS Analytics Frameworks Toolset</div> | ||
# <div align="left"><img src="https://rapids.ai/assets/images/rapids_logo.png" width="90px"/> RAFT: RAPIDS Analytics Framework Toolkit</div> | ||
|
||
RAFT is a repository containining shared utilities, mathematical operations and common functions for the analytics components of RAPIDS. Both the C++ and Python components can be included in consuming libraries. | ||
RAFT is a library containing building blocks for rapid composition of RAPIDS Analytics. These building blocks include shared representations, mathematical computational primitives, and utilities that accelerate building analytics and data science algorithms in the RAPIDS ecosystem. Both the C++ and Python components can be included in consuming libraries, providing building blocks for both dense and sparse matrix formats in the following general categories: | ||
##### | ||
| Category | Description / Examples | | ||
| --- | --- | | ||
| **Data Formats** | tensor representations and conversions for both sparse and dense formats | | ||
| **Data Generation** | graph, spatial, and machine learning dataset generation | | ||
| **Dense Operations** | linear algebra, statistics | | ||
| **Spatial** | pairwise distances, nearest neighbors, neighborhood / proximity graph construction | | ||
| **Sparse/Graph Operations** | linear algebra, statistics, slicing, msf, spectral embedding/clustering, slhc, vertex degree | | ||
| **Solvers** | eigenvalue decomposition, least squares, lanczos | | ||
| **Tools** | multi-node multi-gpu communicator, utilities | | ||
|
||
By taking a primitives-based approach to algorithm development, RAFT shortens algorithm construction time and reduces | ||
the maintenance burden by maximizing reuse across projects. RAFT relies on the [RAPIDS memory manager (RMM)](https://github.com/rapidsai/rmm), which, | ||
like other projects in the RAPIDS ecosystem, eases the burden of configuring different allocation strategies globally | ||
across the libraries that use it. RMM also provides RAII wrappers around device arrays that handle allocation and cleanup. | ||
|
||
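The RAII idea mentioned above can be illustrated without a GPU. The sketch below is a host-only stand-in, not RMM's implementation: `scoped_buffer`, `live_count`, and `scoped_buffer_demo` are hypothetical names invented for this illustration. It only shows the pattern of acquiring memory in the constructor and releasing it in the destructor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side stand-in for an RAII array wrapper.
// It is NOT part of RMM or RAFT; it only demonstrates the pattern.
class scoped_buffer {
 public:
  // Acquire the memory when the object is constructed.
  explicit scoped_buffer(std::size_t n) : size_(n), data_(new float[n]) {
    ++live_count();
  }

  // Release the memory when the object goes out of scope.
  ~scoped_buffer() {
    delete[] data_;
    --live_count();
  }

  // Copying is disabled so exactly one object owns the allocation.
  scoped_buffer(const scoped_buffer&)            = delete;
  scoped_buffer& operator=(const scoped_buffer&) = delete;

  float* data() { return data_; }
  std::size_t size() const { return size_; }

  // Number of buffers currently alive (for illustration only).
  static int& live_count() {
    static int count = 0;
    return count;
  }

 private:
  std::size_t size_;
  float* data_;
};

// Usage: the buffer is released automatically at the end of the inner scope.
inline void scoped_buffer_demo() {
  {
    scoped_buffer buf(1024);
    assert(scoped_buffer::live_count() == 1);
  }
  assert(scoped_buffer::live_count() == 0);
}
```

A device-side wrapper follows the same shape, with the allocation and deallocation routed through a memory resource rather than `new[]` and `delete[]`.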
## Getting started | ||
|
||
Refer to the [Build and Development Guide](BUILD.md) for details on RAFT's design, building, testing and development guidelines. | ||
|
||
Most of the primitives in RAFT accept a `raft::handle_t` object for the management of resources which are expensive to create, such as CUDA streams, stream pools, and handles to other CUDA libraries like `cublas` and `cusolver`. | ||
|
||
|
||
### C++ Example | ||
|
||
The example below demonstrates creating a RAFT handle and using it with RMM's `device_uvector` to allocate memory on device and compute | ||
pairwise Euclidean distances: | ||
```c++ | ||
#include <raft/handle.hpp> | ||
#include <raft/distance/distance.hpp> | ||
|
||
#include <rmm/device_uvector.hpp> | ||
raft::handle_t handle; | ||
|
||
int n_samples = ...; | ||
int n_features = ...; | ||
|
||
rmm::device_uvector<float> input(n_samples * n_features, handle.get_stream()); | ||
rmm::device_uvector<float> output(n_samples * n_samples, handle.get_stream()); | ||
|
||
// ... Populate feature matrix ... | ||
|
||
auto metric = raft::distance::DistanceType::L2SqrtExpanded; | ||
rmm::device_uvector<char> workspace(0, handle.get_stream()); | ||
raft::distance::pairwise_distance(handle, input.data(), input.data(), | ||
output.data(), | ||
n_samples, n_samples, n_features, | ||
workspace.data(), metric); | ||
``` | ||
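To check results from the example above on small inputs, a CPU reference can be useful. The sketch below is not part of RAFT: `pairwise_l2_reference` is a hypothetical helper, and it assumes row-major inputs and a row-major `m x n` output, which should be verified against the RAFT version in use. It computes Euclidean distances in the expanded form and then takes the square root, which is what the metric name `L2SqrtExpanded` suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of RAFT): CPU reference for pairwise
// Euclidean distances between the rows of x (m x k) and y (n x k).
// Both inputs are assumed to be row-major; the result is m x n, row-major.
std::vector<float> pairwise_l2_reference(const std::vector<float>& x,
                                         const std::vector<float>& y,
                                         std::size_t m,
                                         std::size_t n,
                                         std::size_t k) {
  assert(x.size() == m * k && y.size() == n * k);
  std::vector<float> dist(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float xx = 0.0f, yy = 0.0f, xy = 0.0f;
      for (std::size_t f = 0; f < k; ++f) {
        const float a = x[i * k + f];
        const float b = y[j * k + f];
        xx += a * a;  // squared norm of row i of x
        yy += b * b;  // squared norm of row j of y
        xy += a * b;  // dot product of the two rows
      }
      // Expanded form: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b,
      // clamped at zero to absorb rounding error before the square root.
      dist[i * n + j] = std::sqrt(std::max(0.0f, xx + yy - 2.0f * xy));
    }
  }
  return dist;
}
```

For two points `(0, 0)` and `(3, 4)` passed as both `x` and `y`, the diagonal entries are zero and the off-diagonal entries are 5. Comparing a copy of the device output against this reference with a small tolerance is one way to sanity-check the layout arguments.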
## Folder Structure and Contents | ||
The folder structure mirrors the main RAPIDS repos (cuDF, cuML, cuGraph...), with the following folders: | ||
The folder structure mirrors other RAPIDS repos (cuDF, cuML, cuGraph...), with the following folders: | ||
- `cpp`: Source code for all C++ code. The code is header only, therefore it is in the `include` folder (with no `src`). | ||
- `cpp`: Source code for all C++ code. The code is currently header-only, therefore it is in the `include` folder (with no `src`). | ||
- `python`: Source code for all Python code. | ||
- `ci`: Scripts for running CI in PRs | ||
[comment]: <> (TODO: This needs to be updated after the public API is established) | ||
[comment]: <> (The library layout contains the following structure:) | ||
[comment]: <> (```bash) | ||
[comment]: <> (cpp/include/raft) | ||
[comment]: <> ( |------------ comms [communication abstraction layer]) | ||
[comment]: <> ( |------------ distance [dense pairwise distances]) | ||
[comment]: <> ( |------------ linalg [dense linear algebra]) | ||
[comment]: <> ( |------------ matrix [dense matrix format]) | ||
[comment]: <> ( |------------ random [random matrix generation]) | ||
[comment]: <> ( |------------ sparse [sparse matrix and graph algorithms]) | ||
[comment]: <> ( |------------ spatial [spatial algorithms]) | ||
[comment]: <> ( |------------ spectral [spectral clustering]) | ||
[comment]: <> ( |------------ stats [statistics primitives]) | ||
[comment]: <> ( |------------ handle.hpp [raft handle]) | ||
[comment]: <> (```) | ||