diff --git a/README.md b/README.md index bb268a896a..cca9d81e3c 100755 --- a/README.md +++ b/README.md @@ -37,7 +37,7 @@ While not exhaustive, the following general categories help summarize the accele All of RAFT's C++ APIs can be accessed header-only and optional pre-compiled shared libraries can 1) speed up compile times and 2) enable the APIs to be used without CUDA-enabled compilers. In addition to the C++ library, RAFT also provides 2 Python libraries: -- `pylibraft` - lightweight low-level Python wrappers around RAFT's host-accessible APIs. +- `pylibraft` - lightweight low-level Python wrappers around RAFT's host-accessible "runtime" APIs. - `raft-dask` - multi-node multi-GPU communicator infrastructure for building distributed algorithms on the GPU with Dask. ## Getting started @@ -142,7 +142,7 @@ in2 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32) output = pairwise_distance(in1, in2, metric="euclidean") ``` -The `output` array supports [__cuda_array_interface__](https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html#cuda-array-interface-version-2) so it is interoperable with other libraries like CuPy, Numba, and PyTorch that also support it. +The `output` array in the above example is of type `pylibraft.common.device_ndarray`, which supports [__cuda_array_interface__](https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html#cuda-array-interface-version-2), making it interoperable with other libraries like CuPy, Numba, and PyTorch that also support it. CuPy supports DLPack, which also enables zero-copy conversion from `pylibraft.common.device_ndarray` to JAX and TensorFlow. 
Below is an example of converting the output `pylibraft.device_ndarray` to a CuPy array: ```python diff --git a/docs/source/quick_start.md b/docs/source/quick_start.md index d8cc5ce08b..8734300131 100644 --- a/docs/source/quick_start.md +++ b/docs/source/quick_start.md @@ -8,9 +8,9 @@ RAFT relies heavily on the [RMM](https://github.com/rapidsai/rmm) library which ## Multi-dimensional Spans and Arrays -The APIs in RAFT currently accept raw pointers to device memory and we are in the process of simplifying the APIs with the [mdspan](https://arxiv.org/abs/2010.06474) multi-dimensional array view for representing data in higher dimensions similar to the `ndarray` in the Numpy Python library. RAFT also contains the corresponding owning `mdarray` structure, which simplifies the allocation and management of multi-dimensional data in both host and device (GPU) memory. +Most of the APIs in RAFT accept the [mdspan](https://arxiv.org/abs/2010.06474) multi-dimensional array view for representing data in higher dimensions, similar to the `ndarray` in the NumPy Python library. RAFT also contains the corresponding owning `mdarray` structure, which simplifies the allocation and management of multi-dimensional data in both host and device (GPU) memory. -The `mdarray` forms a convenience layer over RMM and can be constructed in RAFT using a number of different helper functions: +The `mdarray` is an owning object that forms a convenience layer over RMM and can be constructed in RAFT using a number of different helper functions: ```c++ #include @@ -118,11 +118,11 @@ auto metric = raft::distance::DistanceType::L2SqrtExpanded; raft::distance::pairwise_distance(handle, input.view(), input.view(), output.view(), metric); ``` -## Python Example +### Python Example -The `pylibraft` package contains a Python API for RAFT algorithms and primitives. 
`pylibraft` integrates nicely into other libraries by being very lightweight with minimal dependencies and accepting any object that supports the `__cuda_array_interface__`, such as [CuPy's ndarray](https://docs.cupy.dev/en/stable/user_guide/interoperability.html#rmm). The package is currently limited to pairwise distances and RMAT graph generation, but we will continue adding more in future releases. +The `pylibraft` package contains a Python API for RAFT algorithms and primitives. `pylibraft` integrates nicely into other libraries by being very lightweight with minimal dependencies and accepting any object that supports the `__cuda_array_interface__`, such as [CuPy's ndarray](https://docs.cupy.dev/en/stable/user_guide/interoperability.html#rmm). The number of RAFT algorithms exposed in this package continues to grow with each release. -The example below demonstrates computing the pairwise Euclidean distances between CuPy arrays. `pylibraft` is a low-level API that prioritizes efficiency and simplicity over being pythonic, which is shown here by pre-allocating the output memory before invoking the `pairwise_distance` function. Note that CuPy is not a required dependency for `pylibraft`. +The example below demonstrates computing the pairwise Euclidean distances between CuPy arrays. Note that CuPy is not a required dependency for `pylibraft`. ```python import cupy as cp @@ -137,3 +137,34 @@ in2 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32) output = pairwise_distance(in1, in2, metric="euclidean") ``` + +The `output` array in the above example is of type `pylibraft.common.device_ndarray`, which supports [__cuda_array_interface__](https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html#cuda-array-interface-version-2), making it interoperable with other libraries like CuPy, Numba, and PyTorch that also support it. 
CuPy supports DLPack, which also enables zero-copy conversion from `pylibraft.common.device_ndarray` to JAX and TensorFlow. + +Below is an example of converting the output `pylibraft.common.device_ndarray` to a CuPy array: +```python +cupy_array = cp.asarray(output) +``` + +And converting to a PyTorch tensor: +```python +import torch + +torch_tensor = torch.as_tensor(output, device='cuda') +``` + +`pylibraft` also supports writing to a pre-allocated output array, so any array that supports `__cuda_array_interface__` can be written to in place: + +```python +import cupy as cp + +from pylibraft.distance import pairwise_distance + +n_samples = 5000 +n_features = 50 + +in1 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32) +in2 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32) +output = cp.empty((n_samples, n_samples), dtype=cp.float32) + +pairwise_distance(in1, in2, out=output, metric="euclidean") +``` diff --git a/python/pylibraft/pylibraft/__init__.py b/python/pylibraft/pylibraft/__init__.py index 1124c64102..a09821c216 100644 --- a/python/pylibraft/pylibraft/__init__.py +++ b/python/pylibraft/pylibraft/__init__.py @@ -14,6 +14,7 @@ # from pylibraft._version import get_versions +from pylibraft.config import config __version__ = get_versions()["version"] del get_versions diff --git a/python/pylibraft/pylibraft/cluster/kmeans.pyx b/python/pylibraft/pylibraft/cluster/kmeans.pyx index 9097eccfa8..f2e010f6a5 100644 --- a/python/pylibraft/pylibraft/cluster/kmeans.pyx +++ b/python/pylibraft/pylibraft/cluster/kmeans.pyx @@ -45,8 +45,11 @@ from pylibraft.common.cpp.mdspan cimport * from pylibraft.common.cpp.optional cimport optional from pylibraft.common.handle cimport handle_t +from pylibraft.common import auto_convert_output + @auto_sync_handle +@auto_convert_output def compute_new_centroids(X, centroids, labels, @@ -197,6 +200,7 @@ def compute_new_centroids(X, @auto_sync_handle +@auto_convert_output def cluster_cost(X, centroids, handle=None): 
""" Compute cluster cost given an input matrix and existing centroids @@ -403,6 +407,7 @@ FitOutput = namedtuple("FitOutput", "centroids inertia n_iter") @auto_sync_handle +@auto_convert_output def fit( KMeansParams params, X, centroids=None, sample_weights=None, handle=None ): diff --git a/python/pylibraft/pylibraft/common/__init__.py b/python/pylibraft/pylibraft/common/__init__.py index 33c2986487..31248da999 100644 --- a/python/pylibraft/pylibraft/common/__init__.py +++ b/python/pylibraft/pylibraft/common/__init__.py @@ -17,3 +17,4 @@ from .cuda import Stream from .device_ndarray import device_ndarray from .handle import Handle +from .outputs import auto_convert_output diff --git a/python/pylibraft/pylibraft/common/outputs.py b/python/pylibraft/pylibraft/common/outputs.py new file mode 100644 index 0000000000..e5b08e1798 --- /dev/null +++ b/python/pylibraft/pylibraft/common/outputs.py @@ -0,0 +1,93 @@ +# Copyright (c) 2022, NVIDIA CORPORATION. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +import functools +import warnings + +import pylibraft.config + + +def import_warn_(lib): + warnings.warn( + "%s is not available and output cannot be converted." + "Returning original output instead." 
% lib + ) + + +def convert_to_torch(device_ndarray): + try: + import torch + + return torch.as_tensor(device_ndarray, device="cuda") + except ImportError: + import_warn_("PyTorch") + return device_ndarray + + +def convert_to_cupy(device_ndarray): + try: + import cupy + + return cupy.asarray(device_ndarray) + except ImportError: + import_warn_("CuPy") + return device_ndarray + + +def no_conversion(device_ndarray): + return device_ndarray + + +def convert_to_cai_type(device_ndarray): + output_as_ = pylibraft.config.output_as_ + if callable(output_as_): + return output_as_(device_ndarray) + elif output_as_ == "raft": + return device_ndarray + elif output_as_ == "torch": + return convert_to_torch(device_ndarray) + elif output_as_ == "cupy": + return convert_to_cupy(device_ndarray) + else: + raise ValueError("No valid type conversion found for %s" % output_as_) + + +def conv(ret): + for i in ret: + if isinstance(i, pylibraft.common.device_ndarray): + yield convert_to_cai_type(i) + else: + yield i + + +def auto_convert_output(f): + """Decorator to automatically convert an output device_ndarray + (or list or tuple of device_ndarray) into the configured + `__cuda_array_interface__` compliant type. + """ + + @functools.wraps(f) + def wrapper(*args, **kwargs): + ret_value = f(*args, **kwargs) + if isinstance(ret_value, pylibraft.common.device_ndarray): + return convert_to_cai_type(ret_value) + elif isinstance(ret_value, tuple): + return tuple(conv(ret_value)) + elif isinstance(ret_value, list): + return list(conv(ret_value)) + else: + return ret_value + + return wrapper diff --git a/python/pylibraft/pylibraft/config.py b/python/pylibraft/pylibraft/config.py new file mode 100644 index 0000000000..c34da546e0 --- /dev/null +++ b/python/pylibraft/pylibraft/config.py @@ -0,0 +1,47 @@ +# Copyright (c) 2022, NVIDIA CORPORATION. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +SUPPORTED_OUTPUT_TYPES = ["torch", "cupy", "raft"] + + +class config: + output_as_ = "raft" # By default, return device_ndarray from functions + + @classmethod + def set_output_as(cls, output): + """ + Set output format for RAFT functions. + + Calling this function will change the output type of RAFT functions. + By default RAFT returns a `pylibraft.common.device_ndarray` for arrays + in GPU memory. Calling `set_output_as` allows you to have RAFT return + arrays as CuPy arrays or PyTorch tensors instead. You can also have + RAFT convert the output to other frameworks by passing a callable to + do the conversion here. + + Notes + ----- + Returning arrays in cupy or torch format requires you to install + cupy or torch. + + Parameters + ---------- + output : { "raft", "cupy", "torch" } or callable + The output format to convert to. Can either be a str describing the + framework to convert to, or a callable that accepts a + device_ndarray and returns the converted type. 
+ """ + if output not in SUPPORTED_OUTPUT_TYPES and not callable(output): + raise ValueError("Unsupported output option %r" % (output,)) + config.output_as_ = output diff --git a/python/pylibraft/pylibraft/distance/fused_l2_nn.pyx b/python/pylibraft/pylibraft/distance/fused_l2_nn.pyx index a21fe46fa3..ce8e656822 100644 --- a/python/pylibraft/pylibraft/distance/fused_l2_nn.pyx +++ b/python/pylibraft/pylibraft/distance/fused_l2_nn.pyx @@ -26,7 +26,12 @@ from libcpp cimport bool from .distance_type cimport DistanceType -from pylibraft.common import Handle, cai_wrapper, device_ndarray +from pylibraft.common import ( + Handle, + auto_convert_output, + cai_wrapper, + device_ndarray, +) from pylibraft.common.handle import auto_sync_handle from pylibraft.common.handle cimport handle_t @@ -57,6 +62,7 @@ cdef extern from "raft_runtime/distance/fused_l2_nn.hpp" \ @auto_sync_handle +@auto_convert_output def fused_l2_nn_argmin(X, Y, out=None, sqrt=True, handle=None): """ Compute the 1-nearest neighbors between X and Y using the L2 distance diff --git a/python/pylibraft/pylibraft/distance/pairwise_distance.pyx b/python/pylibraft/pylibraft/distance/pairwise_distance.pyx index 6f7a135951..2ed2b8ed57 100644 --- a/python/pylibraft/pylibraft/distance/pairwise_distance.pyx +++ b/python/pylibraft/pylibraft/distance/pairwise_distance.pyx @@ -31,7 +31,7 @@ from pylibraft.common.handle import auto_sync_handle from pylibraft.common.handle cimport handle_t -from pylibraft.common import cai_wrapper, device_ndarray +from pylibraft.common import auto_convert_output, cai_wrapper, device_ndarray cdef extern from "raft_runtime/distance/pairwise_distance.hpp" \ @@ -89,6 +89,7 @@ SUPPORTED_DISTANCES = ["euclidean", "l1", "cityblock", "l2", "inner_product", @auto_sync_handle +@auto_convert_output def distance(X, Y, out=None, metric="euclidean", p=2.0, handle=None): """ Compute pairwise distances between X and Y diff --git a/python/pylibraft/pylibraft/neighbors/ivf_pq/ivf_pq.pyx 
b/python/pylibraft/pylibraft/neighbors/ivf_pq/ivf_pq.pyx index fdc8d1755c..6ad9b753b3 100644 --- a/python/pylibraft/pylibraft/neighbors/ivf_pq/ivf_pq.pyx +++ b/python/pylibraft/pylibraft/neighbors/ivf_pq/ivf_pq.pyx @@ -33,7 +33,12 @@ from libcpp cimport bool, nullptr from pylibraft.distance.distance_type cimport DistanceType -from pylibraft.common import Handle, cai_wrapper, device_ndarray +from pylibraft.common import ( + Handle, + auto_convert_output, + cai_wrapper, + device_ndarray, +) from pylibraft.common.interruptible import cuda_interruptible from pylibraft.common.handle cimport handle_t @@ -302,6 +307,7 @@ cdef class Index: @auto_sync_handle +@auto_convert_output def build(IndexParams index_params, dataset, handle=None): """ Builds an IVF-PQ index that can be later used for nearest neighbor search. @@ -401,6 +407,7 @@ def build(IndexParams index_params, dataset, handle=None): @auto_sync_handle +@auto_convert_output def extend(Index index, new_vectors, new_indices, handle=None): """ Extend an existing index with new vectors. 
@@ -565,6 +572,7 @@ cdef class SearchParams: @auto_sync_handle +@auto_convert_output def search(SearchParams search_params, Index index, queries, diff --git a/python/pylibraft/pylibraft/neighbors/refine.pyx b/python/pylibraft/pylibraft/neighbors/refine.pyx index ca328c1cd5..37ef69e7b5 100644 --- a/python/pylibraft/pylibraft/neighbors/refine.pyx +++ b/python/pylibraft/pylibraft/neighbors/refine.pyx @@ -33,7 +33,12 @@ from libcpp cimport bool, nullptr from pylibraft.distance.distance_type cimport DistanceType -from pylibraft.common import Handle, cai_wrapper, device_ndarray +from pylibraft.common import ( + Handle, + auto_convert_output, + cai_wrapper, + device_ndarray, +) from pylibraft.common.handle cimport handle_t @@ -208,6 +213,7 @@ cdef host_matrix_view[int8_t, uint64_t, row_major] \ @auto_sync_handle +@auto_convert_output def refine(dataset, queries, candidates, k=None, indices=None, distances=None, metric="l2_expanded", handle=None): """ diff --git a/python/pylibraft/pylibraft/test/test_config.py b/python/pylibraft/pylibraft/test/test_config.py new file mode 100644 index 0000000000..3acb7549c5 --- /dev/null +++ b/python/pylibraft/pylibraft/test/test_config.py @@ -0,0 +1,54 @@ +# Copyright (c) 2022, NVIDIA CORPORATION. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# +import pytest + +import pylibraft.config +from pylibraft.common import auto_convert_output, device_ndarray + +# Skip this module when CuPy is unavailable instead of failing at import +cupy = pytest.importorskip("cupy") + + +@auto_convert_output +def gen_cai(m, n, t=None): + if t is None: + return device_ndarray.empty((m, n)) + elif t == tuple: + return device_ndarray.empty((m, n)), device_ndarray.empty((m, n)) + elif t == list: + return [device_ndarray.empty((m, n)), device_ndarray.empty((m, n))] + + +@pytest.mark.parametrize( + "out_type", + [["cupy", cupy.ndarray], ["raft", pylibraft.common.device_ndarray]], +) +@pytest.mark.parametrize("gen_t", [None, tuple, list]) +def test_auto_convert_output(out_type, gen_t): + + conf, t = out_type + pylibraft.config.set_output_as(conf) + + output = gen_cai(1, 5, gen_t) + + if not isinstance(output, (list, tuple)): + assert isinstance(output, t) + + else: + for o in output: + assert isinstance(o, t) + + # Make sure we set the config back to default + pylibraft.config.set_output_as("raft")
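
The dispatch performed by `auto_convert_output` can be tried without a GPU. The snippet below is a minimal sketch of the same pattern, not the pylibraft implementation: `FakeDeviceArray`, `Config`, `convert`, `make_pair`, and `demo` are hypothetical names used only for illustration, and a plain callable stands in for the CuPy and PyTorch converters.

```python
import functools


class FakeDeviceArray:
    """Hypothetical stand-in for pylibraft.common.device_ndarray."""

    def __init__(self, values):
        self.values = list(values)


class Config:
    """Hypothetical stand-in for the global output setting."""

    output_as = "raft"  # "raft" means: return the array unchanged


def convert(item):
    # Anything that is not an array passes through untouched.
    if not isinstance(item, FakeDeviceArray):
        return item
    # A callable setting performs the conversion itself.
    if callable(Config.output_as):
        return Config.output_as(item)
    if Config.output_as == "raft":
        return item
    raise ValueError("No valid type conversion found for %s" % Config.output_as)


def auto_convert_output(f):
    """Convert a returned array, or each array inside a tuple or list."""

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        ret = f(*args, **kwargs)
        if isinstance(ret, tuple):
            return tuple(convert(i) for i in ret)
        if isinstance(ret, list):
            return [convert(i) for i in ret]
        return convert(ret)

    return wrapper


@auto_convert_output
def make_pair():
    # Mixed return value: one array and one plain integer.
    return FakeDeviceArray([1.0, 2.0]), 3


def demo():
    # Ask for plain Python lists, then restore the default setting.
    Config.output_as = lambda arr: list(arr.values)
    try:
        return make_pair()
    finally:
        Config.output_as = "raft"


print(demo())  # ([1.0, 2.0], 3)
```

With the real library, the corresponding switch in this change is `pylibraft.config.set_output_as("cupy")`, which the new test exercises and then resets to `"raft"`.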