Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Refactor spectral clustering and transfer the backend to RAFT #868

Closed
8 tasks
afender opened this issue May 13, 2020 · 14 comments
Closed
8 tasks

[ENH] Refactor spectral clustering and transfer the backend to RAFT #868

afender opened this issue May 13, 2020 · 14 comments
Assignees
Milestone

Comments

@afender
Copy link
Member

afender commented May 13, 2020

Motivation
Spectral clustering is used by cuML and should be in RAFT to avoid a circular dependency. A lot of the backend building blocks can be used independently ( like lanczos solver and kmeans).
However, this is a legacy code from nvgraph that can't be transferred as is.

The following items should be taken care of :

  • Decouple from nvgraph structures, errors etc. These should not be transferred to RAFT.
  • Refactor to remove deprecated cusparse calls (next CUDA version support)
  • Use RAFT's cusparse wrapper
  • Drop the sparse matrix class.
  • Transfer code to RAFT
  • Make it deterministic.
  • adjust cugraph code to get Spectral Clustering building blocks from RAFT (cython and C++ API remain in cuGraph).
  • adjust cuML code to get Spectral Clustering from RAFT in UMAP.
@afender afender added the ? - Needs Triage Need team to review and classify label May 13, 2020
@afender afender added this to the 0.15 milestone May 13, 2020
@BradReesWork BradReesWork added Refactor and removed ? - Needs Triage Need team to review and classify labels May 20, 2020
@aschaffer
Copy link
Collaborator

Can I get a bit more details about requirements, specifically:

  1. "Refactor to remove deprecated cusparse calls (next CUDA version support)" I assume that's all the cublas*() trampoline functions from src/nvgraph/include/spectral_matrix.hxx, correct?
  2. "Use RAFT's cusparse wrapper". I assume that's raft/cpp/include/raft/sparse/cusparse_wrappers.h, correct?
  3. "Drop the sparse matrix class". I assume that's class Matrix and its derived classes, DenseMatrix, CsrMatrix, LaplacianMatrix, ModularityMatrix from src/nvgraph/include/spectral_matrix.hxx, correct?
  4. Also, the code transfered to RAFT should be header only (templates), correct?

@aschaffer
Copy link
Collaborator

aschaffer commented May 28, 2020

There is a strong dependency of functions in cugraph/cpp/src/nvgraph/partition.cu on various matrix classes (CsrMatrix, LaplacianMatrix). If these are to be removed (?) in RAFT, what should they be replaced with?

@afender
Copy link
Member Author

afender commented May 29, 2020

Can I get a bit more details about requirements, specifically:

  1. "Refactor to remove deprecated cusparse calls (next CUDA version support)" I assume that's all the cublas*() trampoline functions from src/nvgraph/include/spectral_matrix.hxx, correct?
  2. "Use RAFT's cusparse wrapper". I assume that's raft/cpp/include/raft/sparse/cusparse_wrappers.h, correct?
  3. "Drop the sparse matrix class". I assume that's class Matrix and its derived classes, DenseMatrix, CsrMatrix, LaplacianMatrix, ModularityMatrix from src/nvgraph/include/spectral_matrix.hxx, correct?
  4. Also, the code transfered to RAFT should be header only (templates), correct?

Yes, your understanding is correct.
For (1), this is not limited to the cublas tampolines but all functions from the toolkit that have this message: "[[DEPRECATED]] use cusparseSpMV() instead. The routine will be removed in the next major release" See for instance https://docs.nvidia.com/cuda/cusparse/index.html#csrmv.

There is a strong dependency of functions in cugraph/cpp/src/nvgraph/partition.cu on various matrix classes (CsrMatrix, LaplacianMatrix). If these are to be removed (?) in RAFT, what should they be replaced with?

They are tightly connected to this code indeed. The dilemma is, from what I remember from these matrix classes, that they are not in good enough shape to be the standard sparse matrix classes for the rest of RAFT to build on top it. So it seems two paths can be taken:

  • upgrade so that we have a solid sparse matrix class in RAFT that also fits nicely in cuGraph and cuML. The goal would be to efficiently build other sparse linear algebra features on top of it in RAFT in the long run.
    or
  • the minimum clean up to encapsulate that as part of spectral clustering only as a specific dependency.

@aschaffer
Copy link
Collaborator

How about cugraph::experimental::GraphCSRView<vertex_t, edge_t, weight_t>? Do we also need to create a separate (similar) abstraction for it in RAFT?

@aschaffer
Copy link
Collaborator

Also, @cjnolet , could you please advise on the specific needs for a sparse matrix abstraction in cuML? Thanks in advance.

@cjnolet
Copy link
Member

cjnolet commented May 29, 2020

It would be nice if we had an RAII abstraction that could use device buffers under the hood to manage the device memory for the underlying arrays. We have something very simple in cuML that has cleaned up our code tremendously by replacing the need to manage and pass 3 arrays everywhere.

Here's an example of some of the constructors we use- the port from cuml to raft should be straightforward but I'm also very much open to an abstraction that serves cugraph's needs as well.

https://github.com/rapidsai/cuml/blob/branch-0.15/cpp/src_prims/sparse/csr.cuh#L117

@aschaffer
Copy link
Collaborator

Here's an example of some of the constructors we use- the port from cuml to raft should be straightforward but I'm also very much open to an abstraction that serves cugraph's needs as well.

https://github.com/rapidsai/cuml/blob/branch-0.15/cpp/src_prims/sparse/csr.cuh#L117

Looks like CSR class in cuml is very similar to CsrMatrix in nvgraph.

So, we have:

  1. CsrMatrix, and siblings (DenseMatrix, LaplacianMatrix) are to be dropped from cugraph;
  2. CSR matrix does a good job in cuml;
  3. GraphCSRView, and siblings of that class (GraphCSCView, etc.) do a good job in cugraph (at least that's my understanding);

Then, what could be done in RAFT to satisfy requirements above in both cuML and cugraph, is:
(a) expose GraphCSRView, and siblings, in RAFT;
(b) add a make_CSR() factory to GraphCSRView, and siblings (perhaps ?), that could efficiently create a CSR for both cuML and the spectral clustering algorithm;
(c) similarly, add a make_Laplacian() factory to GraphCSRView to satisfy the LaplacianMatrix needs in spectral clustering;
(d) keep LaplacianMatrix in RAFT (but remove from cugraph), but only for the spectral clustering needs of RAFT (could be hidden under namespace spectral or similar);

Can we do that?

@kaatish
Copy link
Collaborator

kaatish commented May 29, 2020

It would be nice if we had an RAII abstraction that could use device buffers under the hood to manage the device memory for the underlying arrays.

@cjnolet This is provided in the GraphCSR/COO classes here

The GraphCSR classes own device buffers internally and can be used at the python level by releasing the ownership of the buffer to the python object. In cudf terms this is the column.

The GraphCSRView class can be created from a GraphCSR class that just takes the pointers of the device buffers to give a non owning abstraction for clean code. In cudf terms this is the column_view.

Whenever we allocate a graph on the C++ side and want to pass it to the python side we return a unique_ptr to GraphCSR.

@seunghwak
Copy link
Contributor

seunghwak commented May 29, 2020

Regrading the RAII class to store sparse matrices,

I'm inclined to creating a bare minimum class to store CSR/CSC like data structures. cuGraph can build a Graph class that provides more abstraction/graph specific features on top of this bare minimum class. We may later move this higher level graph class to raft if it turns out to be valuable, but I think we are too early to make that decision yet.

template <typename vertex_t, typename edge_t, typename value_t = void>
class compressed_sparse_matrix_t {
public:
  // access functions to private member variables
private:
  rmm::device_uvector<edge_t> offsets{};
  rmm::device_uvector<vertex_t> indices{};
  rmm::device_uvector<weight_t> values{};  // relevant only if std::is_same<value_t, void>::value is false
};

Then a higher level cuGraph graph class can have this class object as a member variable.

template <typename vertex_t, typename edge_t, typename weight_t = void>
class graph_t {
public:
  // provide abstraction and more graph specific features.
private:
  compressed_sparse_matrix_t<vertex_t, edge_t, weight_t> csr_data{};
  // this class may have compressed_sparse_matrix_t csc_data{} as well
  // or split vertex ranges into two, and store first range in compressed_sparse_matrix_t csr_data and the second range in doubly_compressed_sparse_matrix_t dcsr_data, at this level better hide how data are actually stored
  ...
};

@seunghwak
Copy link
Contributor

And the device_uvector PR gets merged (rapidsai/rmm#364).

We can replace rmm::device_buffer with rmm::device_uvector if we know type in compile time.

@aschaffer
Copy link
Collaborator

aschaffer commented May 29, 2020

Why not just use cugraph's GraphCSR (and alike) to own the graph objects; and the corresponding view(s), GraphCSRView, etc. GraphCSR seems very similar to cu cuML's CSR.

Actually, on 2nd thought @seunghwak 's suggestion of a bare minimum class in RAFT, which can be built on on both cuML and cugraph, might work better...

But that's probably beyond the scope of this project (spectral clustering exposure in RAFT and cleanup in cugraph of some nvgraph artifacts)

@seunghwak
Copy link
Contributor

and if we want to make our data structure ready for OPG, we may add two additional member variables vertex_t start_row_offset and std::tuple<vertex_t, vertex_t> column_range; so this class can represent any rectangular subsection of the entire matrix.

cuGraph's graph class may not be complete/stable enough to go to raft yet. Once we move this to raft and cuML gets dependent on our graph class, future updates become much more difficult. And we may add very graph specific features to this class and this may make the class unnecessarily bulky for cuML use.

@aschaffer
Copy link
Collaborator

Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants