Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to use cuVS for vector search #6085

Merged
merged 52 commits into from
Oct 4, 2024
Merged

Conversation

benfred
Copy link
Member

@benfred benfred commented Sep 26, 2024

This PR updates to use cuVS instead of RAFT for vector search, pairwise distances and clustering. This is required for us to deprecate the vector search functionality in RAFT, in favour of the code in cuVS.

Because some code hasn't been migrated over to cuvs yet, we will continue to use the version in RAFT - but with RAFT in header only mode. In particular this functionality will be used in RAFT header only mode:

Because sparse KNN in RAFT uses the DistanceType in RAFT, we can't fully move over to use the DistanceType code in cuVS with this PR. (Also the DistanceType code in RAFT has a Precomputed option that isn't available in cuvs - but is needed by cuml for dbscan.) This means that we have both the raft and cuvs DistanceType enum's in use with this change, with conversions between them.

@github-actions github-actions bot added Cython / Python Cython or Python issue CMake CUDA/C++ labels Sep 26, 2024
@github-actions github-actions bot added conda conda issue ci labels Sep 26, 2024
@cjnolet cjnolet added the non-breaking Non-breaking change label Sep 27, 2024
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took an early peek at the packaging side of this change, since this will probably be coming down close to freeze. Ping if you need help with any build issues / etc.

ci/release/update-version.sh Outdated Show resolved Hide resolved
dependencies.yaml Show resolved Hide resolved
We don't currently have cosine distance for ivf-pq (see rapidsai/cuvs#346)
and we also don't have correlation distance support at all. re-add the metricprocessor
code to handle this
cpp/cmake/thirdparty/get_cuvs.cmake Outdated Show resolved Hide resolved
Copy link
Member

@dantegd dantegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes and everything looks good to me, pending returning raft to branch-24.12

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great! I'm going to pre-approve just to get this in for the release, but don't forget to revert the explicit cmake pinning.

@@ -14,6 +14,7 @@ dependencies:
- cudatoolkit
- cudf==24.10.*,>=0.0.0a0
- cupy>=12.0.0
- cuvs==24.10.*,>=0.0.0a0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think cuml needs a dependency on cuvs python today. We could keep this for the release, though, just to avoid additional changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are correct here (we only need cuvs for the requirements/pyproject outputs for wheels). Let's file a follow-up PR once this one merges so we don't lose track. It can target 24.12.

Thankfully, cuvs is only ~350 kB: https://anaconda.org/rapidsai-nightly/cuvs/files

cpp/cmake/thirdparty/get_cuvs.cmake Outdated Show resolved Hide resolved
@@ -18,7 +18,7 @@

#include <cuml/common/log_levels.hpp>

#include <raft/cluster/kmeans_types.hpp>
#include <cuvs/cluster/kmeans.hpp>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@@ -24,8 +24,8 @@
#include <cuml/cluster/hdbscan.hpp>
#include <cuml/common/logger.hpp>

#include <raft/cluster/detail/agglomerative.cuh>
#include <raft/cluster/detail/mst.cuh>
#include <raft/cluster/detail/agglomerative.cuh> // build_dendrogram_host
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy- detail APIs. Well need to make sure we add this to the tracker list. The single-linkage implementation from RAFT has been migrated to cuVS so we should be able to use that. No need to do that for 24.10, though.

@@ -1,818 +0,0 @@
/*
* Copyright (c) 2020-2024, NVIDIA CORPORATION.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful to see this go. Eventually, we should start building distributed Python implementations right in cuVS for things like cuvs-dask (and cuvs-ray, etc...).

// fully serial execution.
raft::resource::sync_stream_pool(handle);

if (input.size() > 1 || translations != nullptr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the most ideal case, we should be able to formalize an "id_mapper" API at some point and have it used consistently for all index types in cuVS itself. That's a pretty trivial gather operation that could be accelerated.

cpp/cmake/thirdparty/get_cuvs.cmake Outdated Show resolved Hide resolved
cpp/bench/CMakeLists.txt Outdated Show resolved Hide resolved
cpp/test/CMakeLists.txt Outdated Show resolved Hide resolved
@@ -14,6 +14,7 @@ dependencies:
- cudatoolkit
- cudf==24.10.*,>=0.0.0a0
- cupy>=12.0.0
- cuvs==24.10.*,>=0.0.0a0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are correct here (we only need cuvs for the requirements/pyproject outputs for wheels). Let's file a follow-up PR once this one merges so we don't lose track. It can target 24.12.

Thankfully, cuvs is only ~350 kB: https://anaconda.org/rapidsai-nightly/cuvs/files

cpp/cmake/thirdparty/get_cuvs.cmake Outdated Show resolved Hide resolved
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I previously reviewed all the build/packaging changes, and am approving the new diff since my last look. I did not review the C++ code. Thanks for the hard work on this!

We need to do some planning on all the follow-up work that is needed to simplify cuVS and cuML. The build/packaging side of things is getting messy in both repositories with all the short term fixes.

@cjnolet
Copy link
Member

cjnolet commented Oct 4, 2024

/merge

@rapids-bot rapids-bot bot merged commit 70fe526 into rapidsai:branch-24.10 Oct 4, 2024
62 of 63 checks passed
@benfred benfred deleted the cuvs branch October 4, 2024 03:05
vyasr added a commit to rapidsai/devcontainers that referenced this pull request Dec 2, 2024
This PR fixes a number of issues in the manifest:
- cudf's dependency on rmm was accidentally removed in #221. Since rmm
is header-only I guess nobody noticed the issue since it doesn't
increase build times and only shows up if you attempt to make local
changes to rmm and test them in cudf.
- UCXX_ENABLE_PYTHON was removed in
rapidsai/ucxx#257
- cuml depends on cuvs as of rapidsai/cuml#6085
- cugraph does not depend on cugraph-ops as of
rapidsai/cugraph#4744
- Many packages were relying on transitively getting rmm or raft as
dependencies, which causes problems when the intermediate dependency is
removed. Extra `-D<package_name>_ROOT` variables in the CMake configure
don't cause any problems so I made everything explicit.

---------

Co-authored-by: Paul Taylor <[email protected]>
Co-authored-by: Bradley Dice <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci CMake conda conda issue CUDA/C++ Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
Development

Successfully merging this pull request may close these issues.

5 participants