[ENH] Refactor spectral clustering and transfer the backend to RAFT #868

afender · 2020-05-13T21:47:57Z

Motivation
Spectral clustering is used by cuML and should be in RAFT to avoid a circular dependency. A lot of the backend building blocks can be used independently ( like lanczos solver and kmeans).
However, this is a legacy code from nvgraph that can't be transferred as is.

The following items should be taken care of :

Decouple from nvgraph structures, errors etc. These should not be transferred to RAFT.
Refactor to remove deprecated cusparse calls (next CUDA version support)
Use RAFT's cusparse wrapper
Drop the sparse matrix class.
Transfer code to RAFT
Make it deterministic.
adjust cugraph code to get Spectral Clustering building blocks from RAFT (cython and C++ API remain in cuGraph).
adjust cuML code to get Spectral Clustering from RAFT in UMAP.

The text was updated successfully, but these errors were encountered:

aschaffer · 2020-05-28T16:14:23Z

Can I get a bit more details about requirements, specifically:

"Refactor to remove deprecated cusparse calls (next CUDA version support)" I assume that's all the cublas*() trampoline functions from src/nvgraph/include/spectral_matrix.hxx, correct?
"Use RAFT's cusparse wrapper". I assume that's raft/cpp/include/raft/sparse/cusparse_wrappers.h, correct?
"Drop the sparse matrix class". I assume that's class Matrix and its derived classes, DenseMatrix, CsrMatrix, LaplacianMatrix, ModularityMatrix from src/nvgraph/include/spectral_matrix.hxx, correct?
Also, the code transfered to RAFT should be header only (templates), correct?

aschaffer · 2020-05-28T19:42:44Z

There is a strong dependency of functions in cugraph/cpp/src/nvgraph/partition.cu on various matrix classes (CsrMatrix, LaplacianMatrix). If these are to be removed (?) in RAFT, what should they be replaced with?

afender · 2020-05-29T15:23:55Z

Can I get a bit more details about requirements, specifically:

"Refactor to remove deprecated cusparse calls (next CUDA version support)" I assume that's all the cublas*() trampoline functions from src/nvgraph/include/spectral_matrix.hxx, correct?

"Use RAFT's cusparse wrapper". I assume that's raft/cpp/include/raft/sparse/cusparse_wrappers.h, correct?

"Drop the sparse matrix class". I assume that's class Matrix and its derived classes, DenseMatrix, CsrMatrix, LaplacianMatrix, ModularityMatrix from src/nvgraph/include/spectral_matrix.hxx, correct?

Also, the code transfered to RAFT should be header only (templates), correct?

Yes, your understanding is correct.
For (1), this is not limited to the cublas tampolines but all functions from the toolkit that have this message: "[[DEPRECATED]] use cusparseSpMV() instead. The routine will be removed in the next major release" See for instance https://docs.nvidia.com/cuda/cusparse/index.html#csrmv.

There is a strong dependency of functions in cugraph/cpp/src/nvgraph/partition.cu on various matrix classes (CsrMatrix, LaplacianMatrix). If these are to be removed (?) in RAFT, what should they be replaced with?

They are tightly connected to this code indeed. The dilemma is, from what I remember from these matrix classes, that they are not in good enough shape to be the standard sparse matrix classes for the rest of RAFT to build on top it. So it seems two paths can be taken:

upgrade so that we have a solid sparse matrix class in RAFT that also fits nicely in cuGraph and cuML. The goal would be to efficiently build other sparse linear algebra features on top of it in RAFT in the long run.
or
the minimum clean up to encapsulate that as part of spectral clustering only as a specific dependency.

aschaffer · 2020-05-29T15:36:13Z

How about cugraph::experimental::GraphCSRView<vertex_t, edge_t, weight_t>? Do we also need to create a separate (similar) abstraction for it in RAFT?

aschaffer · 2020-05-29T15:39:51Z

Also, @cjnolet , could you please advise on the specific needs for a sparse matrix abstraction in cuML? Thanks in advance.

cjnolet · 2020-05-29T15:44:08Z

It would be nice if we had an RAII abstraction that could use device buffers under the hood to manage the device memory for the underlying arrays. We have something very simple in cuML that has cleaned up our code tremendously by replacing the need to manage and pass 3 arrays everywhere.

Here's an example of some of the constructors we use- the port from cuml to raft should be straightforward but I'm also very much open to an abstraction that serves cugraph's needs as well.

https://github.com/rapidsai/cuml/blob/branch-0.15/cpp/src_prims/sparse/csr.cuh#L117

aschaffer · 2020-05-29T16:57:18Z

Here's an example of some of the constructors we use- the port from cuml to raft should be straightforward but I'm also very much open to an abstraction that serves cugraph's needs as well.

https://github.com/rapidsai/cuml/blob/branch-0.15/cpp/src_prims/sparse/csr.cuh#L117

Looks like CSR class in cuml is very similar to CsrMatrix in nvgraph.

So, we have:

CsrMatrix, and siblings (DenseMatrix, LaplacianMatrix) are to be dropped from cugraph;
CSR matrix does a good job in cuml;
GraphCSRView, and siblings of that class (GraphCSCView, etc.) do a good job in cugraph (at least that's my understanding);

Then, what could be done in RAFT to satisfy requirements above in both cuML and cugraph, is:
(a) expose GraphCSRView, and siblings, in RAFT;
(b) add a make_CSR() factory to GraphCSRView, and siblings (perhaps ?), that could efficiently create a CSR for both cuML and the spectral clustering algorithm;
(c) similarly, add a make_Laplacian() factory to GraphCSRView to satisfy the LaplacianMatrix needs in spectral clustering;
(d) keep LaplacianMatrix in RAFT (but remove from cugraph), but only for the spectral clustering needs of RAFT (could be hidden under namespace spectral or similar);

Can we do that?

kaatish · 2020-05-29T17:43:23Z

It would be nice if we had an RAII abstraction that could use device buffers under the hood to manage the device memory for the underlying arrays.

@cjnolet This is provided in the GraphCSR/COO classes here

The GraphCSR classes own device buffers internally and can be used at the python level by releasing the ownership of the buffer to the python object. In cudf terms this is the column.

The GraphCSRView class can be created from a GraphCSR class that just takes the pointers of the device buffers to give a non owning abstraction for clean code. In cudf terms this is the column_view.

Whenever we allocate a graph on the C++ side and want to pass it to the python side we return a unique_ptr to GraphCSR.

seunghwak · 2020-05-29T17:56:28Z

Regrading the RAII class to store sparse matrices,

I'm inclined to creating a bare minimum class to store CSR/CSC like data structures. cuGraph can build a Graph class that provides more abstraction/graph specific features on top of this bare minimum class. We may later move this higher level graph class to raft if it turns out to be valuable, but I think we are too early to make that decision yet.

template <typename vertex_t, typename edge_t, typename value_t = void>
class compressed_sparse_matrix_t {
public:
  // access functions to private member variables
private:
  rmm::device_uvector<edge_t> offsets{};
  rmm::device_uvector<vertex_t> indices{};
  rmm::device_uvector<weight_t> values{};  // relevant only if std::is_same<value_t, void>::value is false
};

Then a higher level cuGraph graph class can have this class object as a member variable.

template <typename vertex_t, typename edge_t, typename weight_t = void>
class graph_t {
public:
  // provide abstraction and more graph specific features.
private:
  compressed_sparse_matrix_t<vertex_t, edge_t, weight_t> csr_data{};
  // this class may have compressed_sparse_matrix_t csc_data{} as well
  // or split vertex ranges into two, and store first range in compressed_sparse_matrix_t csr_data and the second range in doubly_compressed_sparse_matrix_t dcsr_data, at this level better hide how data are actually stored
  ...
};

seunghwak · 2020-05-29T17:58:17Z

And the device_uvector PR gets merged (rapidsai/rmm#364).

We can replace rmm::device_buffer with rmm::device_uvector if we know type in compile time.

aschaffer · 2020-05-29T18:05:44Z

Why not just use cugraph's GraphCSR (and alike) to own the graph objects; and the corresponding view(s), GraphCSRView, etc. GraphCSR seems very similar to cu cuML's CSR.

Actually, on 2nd thought @seunghwak 's suggestion of a bare minimum class in RAFT, which can be built on on both cuML and cugraph, might work better...

But that's probably beyond the scope of this project (spectral clustering exposure in RAFT and cleanup in cugraph of some nvgraph artifacts)

seunghwak · 2020-05-29T18:17:00Z

and if we want to make our data structure ready for OPG, we may add two additional member variables vertex_t start_row_offset and std::tuple<vertex_t, vertex_t> column_range; so this class can represent any rectangular subsection of the entire matrix.

cuGraph's graph class may not be complete/stable enough to go to raft yet. Once we move this to raft and cuML gets dependent on our graph class, future updates become much more difficult. And we may add very graph specific features to this class and this may make the class unnecessarily bulky for cuML use.

aschaffer · 2020-07-27T19:22:47Z

This was resolved. Relevant PRs that were merged:

aschaffer · 2020-07-27T19:22:56Z

Closing.

afender added the ? - Needs Triage Need team to review and classify label May 13, 2020

afender added this to the 0.15 milestone May 13, 2020

afender mentioned this issue May 13, 2020

[ENH] Get rid of deprecated CUDA library calls #869

Closed

BradReesWork added Refactor and removed ? - Needs Triage Need team to review and classify labels May 20, 2020

BradReesWork assigned aschaffer May 20, 2020

aschaffer mentioned this issue May 28, 2020

[REVIEW] Spectral clustering functionality in RAFT. rapidsai/raft#12

Merged

aschaffer closed this as completed Jul 27, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Refactor spectral clustering and transfer the backend to RAFT #868

[ENH] Refactor spectral clustering and transfer the backend to RAFT #868

afender commented May 13, 2020 •

edited

Loading

aschaffer commented May 28, 2020

aschaffer commented May 28, 2020 •

edited

Loading

afender commented May 29, 2020

aschaffer commented May 29, 2020

aschaffer commented May 29, 2020

cjnolet commented May 29, 2020

aschaffer commented May 29, 2020

kaatish commented May 29, 2020 •

edited

Loading

seunghwak commented May 29, 2020 •

edited

Loading

seunghwak commented May 29, 2020

aschaffer commented May 29, 2020 •

edited

Loading

seunghwak commented May 29, 2020

aschaffer commented Jul 27, 2020

aschaffer commented Jul 27, 2020

[ENH] Refactor spectral clustering and transfer the backend to RAFT #868

[ENH] Refactor spectral clustering and transfer the backend to RAFT #868

Comments

afender commented May 13, 2020 • edited Loading

aschaffer commented May 28, 2020

aschaffer commented May 28, 2020 • edited Loading

afender commented May 29, 2020

aschaffer commented May 29, 2020

aschaffer commented May 29, 2020

cjnolet commented May 29, 2020

aschaffer commented May 29, 2020

kaatish commented May 29, 2020 • edited Loading

seunghwak commented May 29, 2020 • edited Loading

seunghwak commented May 29, 2020

aschaffer commented May 29, 2020 • edited Loading

seunghwak commented May 29, 2020

aschaffer commented Jul 27, 2020

aschaffer commented Jul 27, 2020

afender commented May 13, 2020 •

edited

Loading

aschaffer commented May 28, 2020 •

edited

Loading

kaatish commented May 29, 2020 •

edited

Loading

seunghwak commented May 29, 2020 •

edited

Loading

aschaffer commented May 29, 2020 •

edited

Loading