Preparing sparse primitives for movement to RAFT #3157

Merged
152 commits merged into rapidsai:branch-0.18 on Jan 16, 2021

Conversation

@cjnolet (Member) commented Nov 18, 2020

This PR contains the initial steps to move much of the sparse prims API over to RAFT, including:

  • Adjusting MLCommon::Sparse namespaces to raft::sparse (illustrated in the sketch below)
  • Breaking the csr/coo prims into multiple files (e.g. linalg, components, matrix, etc.)
  • Using RAFT namespaces for the RAFT componentry used within the sparse prims, such as device_buffer and deviceAllocator
  • Using the RAFT handle in the public API

Closes [FEA] Move sparse prims to RAFT #3106
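
For illustration, here is a minimal sketch of what the first and third bullets look like from inside a prim. The prim name and body are placeholders rather than an exact file from this PR, and the device_buffer/allocator accessors are the ones RAFT exposed around this release:

```cpp
#include <raft/handle.hpp>
#include <raft/mr/device/buffer.hpp>

namespace raft {
namespace sparse {  // previously MLCommon::Sparse

// Placeholder prim: scratch space comes from RAFT's device_buffer and the
// handle's deviceAllocator instead of the MLCommon equivalents.
template <typename T>
void example_prim(const raft::handle_t &handle, const T *in, T *out, int n) {
  raft::mr::device::buffer<T> tmp(handle.get_device_allocator(),
                                  handle.get_stream(), n);
  // ... kernel launches / cusparse calls writing through tmp.data() ...
}

}  // namespace sparse
}  // namespace raft
```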

cjnolet added 30 commits July 24, 2020 15:05
@cjnolet (Member, Author) commented Dec 17, 2020

@divyegala This is ready for re-review when you get a moment. No rush!

I started going through the prims to make them all use raft::handle_t, but I think this could be done more on a case-by-case basis as necessary. I personally wouldn't mind seeing more consistency across the prims in RAFT, but many of the current dense prims don't take a handle_t yet either.
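
For reference, the kind of signature change I mean looks roughly like the following. This is a hypothetical prim rather than an exact signature from the PR, with the allocator spelled out as RAFT's deviceAllocator type of that era:

```cpp
#include <cuda_runtime.h>

#include <memory>

#include <raft/handle.hpp>
#include <raft/mr/device/allocator.hpp>

// Current style of many prims: the caller threads an allocator and a stream
// through every call.
template <typename T>
void csr_row_sum(const int *indptr, const T *vals, int n_rows, T *sums,
                 std::shared_ptr<raft::mr::device::allocator> alloc,
                 cudaStream_t stream);

// Style under discussion: take a raft::handle_t in the public API and derive
// the allocator, stream, and library handles (e.g. cusparse) from it.
template <typename T>
void csr_row_sum(const raft::handle_t &handle, const int *indptr,
                 const T *vals, int n_rows, T *sums);
```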

@cjnolet added the "4 - Waiting on Reviewer" label and removed the "2 - In Progress" label on Dec 17, 2020
@divyegala (Member) left a comment

LGTM! I like the breaking up of the sparse tests.

@divyegala (Member)

@cjnolet dense prims in RAFT are following a need-to-know approach for the handle: if the prim needs a handle, it takes one; otherwise it does not.

@cjnolet (Member, Author) commented Dec 22, 2020

@divyegala Thanks! Case-by-case sounds fine to me, so long as the arguments aren't creating redundancy that will lead to unexpected side-effects (e.g. by taking both a handle and a separate stream). I'll submit a PR to clean those up at some point.

@divyegala (Member)

@cjnolet @teju85 suggested that we still keep a separate stream parameter, so that the developer can decide which stream to run the prim on.

@cjnolet (Member, Author) commented Dec 22, 2020

@teju85, @divyegala In that case, we should continue the discussion about this publicly (and after we return from the holidays, please). I'd like to avoid the cases where the streams are used inconsistently.

@cjnolet added the "5 - Ready to Merge" label and removed the "4 - Waiting on Reviewer" label on Dec 22, 2020
@teju85 (Member) commented Dec 22, 2020

At the primitives level, we should certainly have every one of them accept a separate stream parameter in its argument list, for maximum composability, because it's possible that the caller wants to run the prim on a stream other than the one found in handle_t::get_stream(). Sure, I'm happy to discuss this after the holidays.
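
Concretely, the pattern being argued for here is roughly the sketch below. It uses a toy prim with illustrative names and assumes the handle was built with internal side streams, which raft::handle_t exposed via get_internal_stream() at the time:

```cpp
#include <cuda_runtime.h>

#include <raft/handle.hpp>

__global__ void scale_kernel(float *x, int n, float alpha) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) x[i] *= alpha;
}

// Toy prim: the explicit `stream` argument lets the caller schedule work on a
// stream other than handle.get_stream().
void scale(const raft::handle_t &handle, float *x, int n, float alpha,
           cudaStream_t stream) {
  (void)handle;  // allocator, cusparse handle, etc. would come from here
  scale_kernel<<<(n + 255) / 256, 256, 0, stream>>>(x, n, alpha);
}

void composed_example(const raft::handle_t &handle, float *a, float *b, int n) {
  // Default choice: pass the handle's main stream, e.g.
  //   scale(handle, a, n, 2.0f, handle.get_stream());

  // Composability: launch two independent prims on internal side streams so
  // they can overlap, then synchronize before the results are consumed.
  scale(handle, a, n, 2.0f, handle.get_internal_stream(0));
  scale(handle, b, n, 0.5f, handle.get_internal_stream(1));
  cudaStreamSynchronize(handle.get_internal_stream(0));
  cudaStreamSynchronize(handle.get_internal_stream(1));
}
```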

@JohnZed added the "6 - Okay to Auto-Merge" label and removed the "5 - Ready to Merge" label on Jan 14, 2021
@divyegala (Member) left a comment

Just one comment, which I'm okay with tracking in a follow-up issue once this PR goes in. Otherwise, LGTM from my previous review.

Review comment on cpp/src_prims/sparse/linalg/symmetrize.cuh (outdated, resolved)
@cjnolet (Member, Author) commented Jan 15, 2021

rerun tests

@rapids-bot (bot) merged commit d72c54a into rapidsai:branch-0.18 on Jan 16, 2021
@Nyrio mentioned this pull request on Jan 18, 2021
rapids-bot pushed a commit that referenced this pull request on Feb 2, 2021:
This Pull Request adds initial support for multi-node multi-GPU DBSCAN, and fixes the bugs identified in #3094.

It works by copying the dataset on all the workers and giving ownership of a subset of points to each one. The workers compute a partial clustering with the knowledge of the relationships between their points and the rest of the dataset, and the partial clusterings are merged to form the final labeling. This merging algorithm is also used to accumulate the results in case a batch-wise approach is used on a worker to limit the memory consumption.

The multi-GPU implementation gives great speedups for large datasets, while for small datasets the performance is dominated by the Dask launch overhead, as shown in the figure below:

![mnmg_dbscan_perf](https://user-images.githubusercontent.com/17441062/104958437-55a6da80-59d0-11eb-8a18-fcca0d69c41b.png)

Notes:

- I have renamed variables in the DBSCAN implementation to match our style conventions (snake case). Sorry for the noise that it adds to this PR.
- I refactored some CSR tests to accept multiple test cases instead of hardcoded ones, in order to add corner cases to weak CC. PR #3157 by @cjnolet changed the location of these tests, so I moved the ones I had already refactored accordingly. At the moment, only the tests that were previously in `cpp/test/prims/csr.cu` have been refactored. The others can be refactored later; I'd like @cjnolet's opinion on this refactoring.
- Regarding testing, the MNMG tests are mostly a copy of the single-GPU ones, though I removed a few tests with very small datasets to avoid problems with MNMG (it doesn't really support the edge case where a worker owns 0 samples, and I think it's a fair assumption that MNMG DBSCAN won't be used on such a tiny dataset).
- Also regarding tests, I changed the comparison function to account for the fact that border points are ambiguous. It assumes that the labeling of core points is minimal in both our implementation and the reference, so if this assumption changes we will need to update the tests accordingly.

If you want to access a pseudo-code description and proof of the new algorithm, feel free to contact me.

Tagging people to whom this PR is relevant: @teju85 @tfeher @MatthiasKohl @canonizer

Authors:
  - Louis Sugy (@Nyrio)

Approvers:
  - Tamas Bela Feher (@tfeher)
  - Corey J. Nolet (@cjnolet)

URL: #3382
Labels: CMake, CUDA/C++, improvement, libcuml, non-breaking, RAFT
Merging this pull request may close: [FEA] Move sparse prims to RAFT (#3106)

7 participants