mdspan PoC for distance make_blobs #538

Merged
merged 14 commits into from
Mar 9, 2022
69 changes: 53 additions & 16 deletions README.md
Expand Up @@ -33,37 +33,51 @@ The Python API is being improved to wrap the algorithms and primitives from the
## Getting started

### Rapids Memory Manager (RMM)
RAFT relies heavily on [RMM](https://github.com/rapidsai/rmm) which,
like other projects in the RAPIDS ecosystem, eases the burden of configuring different allocation strategies globally
across the libraries that use it. RMM also provides [RAII](https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization) wrappers around device arrays that handle the allocation and cleanup.

RAFT relies heavily on RMM which, like other projects in the RAPIDS ecosystem, eases the burden of configuring different allocation strategies globally across the libraries that use it.

### Multi-dimensional Arrays

RAFT's APIs currently accept raw pointers to device memory. We are in the process of simplifying them with the [mdspan](https://arxiv.org/abs/2010.06474) multi-dimensional array view, which represents data in higher dimensions much like the `ndarray` in the NumPy Python library. RAFT also contains the corresponding owning `mdarray` structure, which simplifies the allocation and management of multi-dimensional data in both host and device (GPU) memory.

The `mdarray` forms a convenience layer over RMM and can be constructed in RAFT using a number of different helper functions:

```c++
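#include <raft/handle.hpp>
#include <raft/mdarray.hpp>

// The handle manages the CUDA stream used for the allocations below.
raft::handle_t handle;

int n_rows = 10;
int n_cols = 10;

// The element type is given explicitly, as in the C++ example further down.
auto scalar = raft::make_device_scalar(handle, 1.0);
auto vector = raft::make_device_vector<float>(handle, n_cols);
auto matrix = raft::make_device_matrix<float>(handle, n_rows, n_cols);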
```
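
To make the owning/non-owning split concrete, here is a minimal host-only sketch in plain C++. It is not RAFT's implementation: `host_matrix` and `matrix_view` are hypothetical names invented for illustration, the storage is a `std::vector` rather than RMM device memory, and the layout is assumed to be row-major.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning 2-D view over row-major data.
// It plays the role that an mdspan-style view plays: no allocation, no cleanup.
template <typename T>
struct matrix_view {
  T* ptr;
  std::size_t rows;
  std::size_t cols;

  // Element access with a debug-time bounds check.
  T& operator()(std::size_t i, std::size_t j) const
  {
    assert(i < rows && j < cols);
    return ptr[i * cols + j];
  }

  // Size along dimension 0 (rows) or 1 (columns).
  std::size_t extent(int dim) const { return dim == 0 ? rows : cols; }
};

// Hypothetical owning container, playing the role of an mdarray-style type:
// it allocates and frees the storage and hands out views on request.
template <typename T>
class host_matrix {
 public:
  host_matrix(std::size_t rows, std::size_t cols)
    : rows_(rows), cols_(cols), data_(rows * cols)
  {
  }

  matrix_view<T> view() { return matrix_view<T>{data_.data(), rows_, cols_}; }

 private:
  std::size_t rows_;
  std::size_t cols_;
  std::vector<T> data_;
};
```

For example, `host_matrix<float> m(2, 3); auto v = m.view(); v(1, 2) = 5.0f;` writes through the view into storage that `m` owns. The view stays valid only while `m` is alive, which is the same contract a non-owning view has with its owning array.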

### C++ Example

Most of the primitives in RAFT accept a `raft::handle_t` object for the management of resources which are expensive to create, such as CUDA streams, stream pools, and handles to other CUDA libraries like `cublas` and `cusolver`.

The example below demonstrates creating a RAFT handle and using it with RMM's `device_uvector` to allocate memory on device and compute
The example below demonstrates creating a RAFT handle and using it with `device_matrix` and `device_vector` to allocate memory, generating random clusters, and computing
pairwise Euclidean distances:
```c++
#include <raft/handle.hpp>
#include <raft/distance/distance.hpp>
#include <raft/mdarray.hpp>
#include <raft/random/make_blobs.cuh>
#include <raft/distance/distance.cuh>

#include <rmm/device_uvector.hpp>
raft::handle_t handle;

int n_samples = ...;
int n_features = ...;
int n_samples = 5000;
int n_features = 50;

rmm::device_uvector<float> input(n_samples * n_features, handle.get_stream());
rmm::device_uvector<float> output(n_samples * n_samples, handle.get_stream());
auto input = raft::make_device_matrix<float>(handle, n_samples, n_features);
auto labels = raft::make_device_vector<int>(handle, n_samples);
auto output = raft::make_device_matrix<float>(handle, n_samples, n_samples);

// ... Populate feature matrix ...
raft::random::make_blobs(handle, input, labels);

auto metric = raft::distance::DistanceType::L2SqrtExpanded;
rmm::device_uvector<char> workspace(0, handle.get_stream());
raft::distance::pairwise_distance(handle, input.data(), input.data(),
output.data(),
n_samples, n_samples, n_features,
workspace.data(), metric);
raft::distance::pairwise_distance(handle, input.view(), input.view(), output.view(), metric);
```
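
When experimenting with the example above on a small input, a CPU reference can help check the shape and ordering of the output. The sketch below is an assumption-laden stand-in, not part of RAFT: `pairwise_l2_reference` is a hypothetical helper, it assumes row-major inputs of size `m x k` and `n x k`, and it computes the ordinary Euclidean distance into a row-major `m x n` result.

```c++
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: Euclidean distance between every row of x (m x k)
// and every row of y (n x k). All matrices are row-major; the result is m x n.
std::vector<float> pairwise_l2_reference(const std::vector<float>& x,
                                         const std::vector<float>& y,
                                         std::size_t m,
                                         std::size_t n,
                                         std::size_t k)
{
  assert(x.size() == m * k && y.size() == n * k);

  std::vector<float> dist(m * n);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (std::size_t f = 0; f < k; ++f) {
        const float d = x[i * k + f] - y[j * k + f];
        acc += d * d;  // accumulate the squared difference per feature
      }
      dist[i * n + j] = std::sqrt(acc);
    }
  }
  return dist;
}
```

Calling it with the same matrix for both `x` and `y`, as the example does, gives a symmetric result with zeros on the diagonal. Results from a GPU implementation may differ from this loop by floating-point rounding.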
## Installing
Expand Down Expand Up @@ -159,3 +173,26 @@ The folder structure mirrors other RAPIDS repos (cuDF, cuML, cuGraph...), with t
## Contributing

If you are interested in contributing to the RAFT project, please read our [Contributing guidelines](CONTRIBUTING.md). Refer to the [Developer Guide](DEVELOPER_GUIDE.md) for details on the developer guidelines, workflows, and principles.

## References

When citing RAFT generally, please consider referencing this GitHub project.
```bibtex
@misc{rapidsai,
title={Rapidsai/raft: RAFT contains fundamental widely-used algorithms and primitives for data science, Graph and machine learning.},
url={https://github.com/rapidsai/raft},
journal={GitHub},
publisher={Nvidia RAPIDS},
author={Rapidsai},
year={2022}
}
```
If citing the sparse pairwise distances API, please consider using the following BibTeX entry:
```bibtex
@article{nolet2021semiring,
title={Semiring primitives for sparse neighborhood methods on the gpu},
author={Nolet, Corey J and Gala, Divye and Raff, Edward and Eaton, Joe and Rees, Brad and Zedlewski, John and Oates, Tim},
journal={arXiv preprint arXiv:2104.06357},
year={2021}
}
```
6 changes: 5 additions & 1 deletion cpp/include/raft.hpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2020, NVIDIA CORPORATION.
* Copyright (c) 2020-2022, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand All @@ -14,6 +14,10 @@
* limitations under the License.
*/

#include "raft/handle.hpp"
#include "raft/mdarray.hpp"
#include "raft/span.hpp"

#include <string>

namespace raft {
138 changes: 135 additions & 3 deletions cpp/include/raft/distance/distance.cuh
Expand Up @@ -23,6 +23,8 @@
#include <raft/handle.hpp>
#include <rmm/device_uvector.hpp>

#include <raft/mdarray.hpp>

namespace raft {
namespace distance {

Expand Down Expand Up @@ -144,6 +146,35 @@ size_t getWorkspaceSize(const InType* x, const InType* y, Index_ m, Index_ n, In
return detail::getWorkspaceSize<distanceType, InType, AccType, OutType, Index_>(x, y, m, n, k);
}

/**
* @brief Return the exact workspace size to compute the distance
* @tparam DistanceType which distance to evaluate
* @tparam InType input argument type
* @tparam AccType accumulation type
* @tparam OutType output type
* @tparam Index_ Index type
* @param x first set of points (size m*k)
* @param y second set of points (size n*k)
* @return number of bytes needed in workspace
*
* @note If the specified distanceType doesn't need the workspace at all, it
* returns 0.
*/
template <raft::distance::DistanceType distanceType,
typename InType,
typename AccType,
typename OutType,
typename Index_ = int,
typename layout>
size_t getWorkspaceSize(const raft::device_matrix_view<InType, layout> x,
const raft::device_matrix_view<InType, layout> y)
{
RAFT_EXPECTS(x.extent(1) == y.extent(1), "Number of columns must be equal.");

return getWorkspaceSize<distanceType, InType, AccType, OutType, Index_>(
x.data(), y.data(), x.extent(0), y.extent(0), x.extent(1));
}

/**
* @brief Evaluate pairwise distances for the simple use case
* @tparam DistanceType which distance to evaluate
Expand All @@ -160,9 +191,6 @@ size_t getWorkspaceSize(const InType* x, const InType* y, Index_ m, Index_ n, In
* @param stream cuda stream
* @param isRowMajor whether the matrices are row-major or col-major
* @param metric_arg metric argument (used for Minkowski distance)
*
* @note if workspace is passed as nullptr, this will return in
* worksize, the number of bytes of workspace required
*/
template <raft::distance::DistanceType distanceType,
typename InType,
Expand All @@ -186,6 +214,58 @@ void distance(const InType* x,
x, y, dist, m, n, k, workspace.data(), worksize, stream, isRowMajor, metric_arg);
}

/**
* @brief Evaluate pairwise distances for the simple use case.
*
* Note: Only contiguous row- or column-major layouts are currently supported.
*
* @tparam DistanceType which distance to evaluate
* @tparam InType input argument type
* @tparam AccType accumulation type
* @tparam OutType output type
* @tparam Index_ Index type
* @param handle raft handle for managing expensive resources
* @param x first set of points (size n*k)
* @param y second set of points (size m*k)
* @param dist output distance matrix (size n*m)
* @param metric_arg metric argument (used for Minkowski distance)
*/
template <raft::distance::DistanceType distanceType,
typename InType,
typename AccType,
typename OutType,
typename Index_ = int,
typename layout = raft::layout_c_contiguous>
void distance(raft::handle_t const& handle,
raft::device_matrix_view<InType, layout> const x,
raft::device_matrix_view<InType, layout> const y,
raft::device_matrix_view<OutType, layout> dist,
InType metric_arg = 2.0f)
{
RAFT_EXPECTS(x.extent(1) == y.extent(1), "Number of columns must be equal.");
RAFT_EXPECTS(dist.extent(0) == x.extent(0),
"Number of rows in output must be equal to "
"number of rows in X");
RAFT_EXPECTS(dist.extent(1) == y.extent(0),
"Number of columns in output must be equal to "
"number of rows in Y");

RAFT_EXPECTS(x.is_contiguous(), "Input x must be contiguous.");
RAFT_EXPECTS(y.is_contiguous(), "Input y must be contiguous.");

auto is_rowmajor = std::is_same<layout, layout_c_contiguous>::value;

distance<distanceType, InType, AccType, OutType, Index_>(x.data(),
y.data(),
dist.data(),
x.extent(0),
y.extent(0),
x.extent(1),
handle.get_stream(),
is_rowmajor,
metric_arg);
}

/**
* @defgroup pairwise_distance pairwise distance prims
* @{
Expand Down Expand Up @@ -319,6 +399,58 @@ void pairwise_distance(const raft::handle_t& handle,
handle, x, y, dist, m, n, k, workspace, metric, isRowMajor, metric_arg);
}

/**
* @defgroup pairwise_distance pairwise distance prims
* @{
* @brief Convenience wrapper around 'distance' prim to convert runtime metric
* into compile time for the purpose of dispatch
* @tparam Type input/accumulation/output data-type
* @tparam Index_ indexing type
* @tparam layout memory layout of the matrices (row- or column-major)
* @param handle raft handle for managing expensive resources
* @param x first matrix of points (size mxk)
* @param y second matrix of points (size nxk)
* @param dist output distance matrix (size mxn)
* @param metric distance metric
* @param metric_arg metric argument (used for Minkowski distance)
*/
template <typename Type, typename Index_ = int, typename layout = layout_c_contiguous>
void pairwise_distance(raft::handle_t const& handle,
device_matrix_view<Type, layout> const x,
device_matrix_view<Type, layout> const y,
device_matrix_view<Type, layout> dist,
raft::distance::DistanceType metric,
Type metric_arg = 2.0f)
{
RAFT_EXPECTS(x.extent(1) == y.extent(1), "Number of columns must be equal.");
RAFT_EXPECTS(dist.extent(0) == x.extent(0),
"Number of rows in output must be equal to "
"number of rows in X");
RAFT_EXPECTS(dist.extent(1) == y.extent(0),
"Number of columns in output must be equal to "
"number of rows in Y");

RAFT_EXPECTS(x.is_contiguous(), "Input x must be contiguous.");
RAFT_EXPECTS(y.is_contiguous(), "Input y must be contiguous.");
RAFT_EXPECTS(dist.is_contiguous(), "Output must be contiguous.");

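// A contiguous row-major matrix has stride(0) == number of columns, not 0,
// so derive the flag from the layout type, as the distance() overload does.
bool rowmajor = std::is_same<layout, layout_c_contiguous>::value;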

rmm::device_uvector<char> workspace(0, handle.get_stream());

pairwise_distance(handle,
x.data(),
y.data(),
dist.data(),
x.extent(0),
y.extent(0),
x.extent(1),
metric,
rowmajor,
metric_arg);
}

}; // namespace distance
}; // namespace raft
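
The view-based overloads in this header turn a compile-time layout parameter into the runtime row-major flag expected by the pointer-based functions. The standalone sketch below illustrates that idea and the strides each contiguous layout implies. The tag types `row_major_tag` and `col_major_tag` and both helper functions are hypothetical names chosen for illustration; they are not RAFT or mdspan types.

```c++
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical layout tags standing in for a library's layout policy types.
struct row_major_tag {};
struct col_major_tag {};

// Compile-time test of the layout tag, usable as a runtime flag.
template <typename Layout>
constexpr bool is_row_major()
{
  return std::is_same<Layout, row_major_tag>::value;
}

// Strides, in elements, of a contiguous n_rows x n_cols matrix.
// Index 0 is the step between consecutive rows, index 1 between columns.
template <typename Layout>
std::array<std::size_t, 2> contiguous_strides(std::size_t n_rows, std::size_t n_cols)
{
  assert(n_rows > 0 && n_cols > 0);
  if (is_row_major<Layout>()) {
    return std::array<std::size_t, 2>{{n_cols, 1}};  // rows are n_cols apart
  }
  return std::array<std::size_t, 2>{{1, n_rows}};  // columns are n_rows apart
}
```

For a 4 x 3 matrix this gives strides `{3, 1}` for the row-major tag and `{1, 4}` for the column-major tag. Neither stride is zero for a non-empty contiguous matrix, so testing the layout type is a more direct way to obtain the flag than inspecting a stride.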
