Add device_send, device_recv, device_sendrecv, device_multicast_sendrecv #144
Conversation
seunghwak commented on Feb 9, 2021
- Undo temporarily exposing a RAFT communication object's private NCCL communicator.
- Add device_send/device_recv (if only sending or only receiving), device_sendrecv (if sending and receiving), and device_multicast_sendrecv (if sending and receiving multiple messages); a usage sketch follows this list.
- Add test suites for newly added raft::comms_t routines.
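To make the new API concrete, here is a minimal, hedged usage sketch. The exact member signatures, the header path, and the ring/even-odd peer choices are assumptions based on the PR description and the partial diff shown later in this thread, not a verbatim copy of the merged code.

```cpp
#include <cuda_runtime.h>
#include <raft/comms/comms.hpp>  // assumed header path; may differ across RAFT versions

// Hedged sketch of the point-to-point routines added in this PR.
void ring_exchange(raft::comms::comms_t const& comm, cudaStream_t stream,
                   float const* d_send, float* d_recv, size_t n /* element count */)
{
  int const rank  = comm.get_rank();
  int const size  = comm.get_size();
  int const right = (rank + 1) % size;         // hypothetical ring neighbors
  int const left  = (rank + size - 1) % size;

  // Send to the right neighbor and receive from the left one in a single call;
  // pairing every send with a matching receive avoids deadlock.
  comm.device_sendrecv(d_send, n, right, d_recv, n, left, stream);

  // One-directional variant: even ranks send, odd ranks receive
  // (assumes an even number of ranks so every send has a matching receive).
  if (rank % 2 == 0) {
    comm.device_send(d_send, n, right, stream);
  } else {
    comm.device_recv(d_recv, n, left, stream);
  }
}
```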
}

/**
 * Performs a multicast send/receive
Just curious, is there an MPI equivalent for that?
MPI has all-to-all, but AFAIK there is no multicast that receives data from a subset of the ranks and sends data to another subset.
This multicast is actually just ncclSend/ncclRecv operations placed inside ncclGroupStart() & ncclGroupEnd().
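For illustration, a minimal raw-NCCL sketch of that pattern, assuming one rank per GPU and NCCL >= 2.7 (which introduced ncclSend/ncclRecv). Each rank sends to an arbitrary subset of peers (dests) and receives from another subset (sources); the parameter names mirror those in the diff context below, but the function itself is illustrative rather than the PR's actual implementation.

```cpp
#include <cuda_runtime.h>
#include <nccl.h>
#include <vector>

// Sketch: multicast-style exchange with raw NCCL (names are illustrative).
void multicast_sendrecv(ncclComm_t comm, cudaStream_t stream,
                        float const* d_sendbuf, std::vector<size_t> const& sendsizes,
                        std::vector<size_t> const& sendoffsets, std::vector<int> const& dests,
                        float* d_recvbuf, std::vector<size_t> const& recvsizes,
                        std::vector<size_t> const& recvoffsets, std::vector<int> const& sources)
{
  // All sends/receives posted between ncclGroupStart/ncclGroupEnd are matched
  // and issued as one fused operation on `stream`, avoiding pairwise deadlock.
  ncclGroupStart();
  for (size_t i = 0; i < sendsizes.size(); ++i) {
    ncclSend(d_sendbuf + sendoffsets[i], sendsizes[i], ncclFloat32, dests[i], comm, stream);
  }
  for (size_t i = 0; i < recvsizes.size(); ++i) {
    ncclRecv(d_recvbuf + recvoffsets[i], recvsizes[i], ncclFloat32, sources[i], comm, stream);
  }
  ncclGroupEnd();
}
```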
  std::vector<size_t> const &recvoffsets,
  std::vector<int> const &sources,
  cudaStream_t stream) const {
  // ncclSend/ncclRecv pair needs to be inside ncclGroupStart/ncclGroupEnd to avoid deadlock
In practice, do these transfers get serialized on the same stream? The API doesn't seem to allow the backend to run on multiple streams.
I'm thinking about the case where sendsizes.size() is large but the individual sendsizes[i] are relatively small; it would be interesting to see whether this benefits from concurrency.
All the send/receive operations are placed inside ncclGroupStart() and ncclGroupEnd(), so AFAIK they all execute concurrently after ncclGroupEnd() (at least logically; NCCL may restrict parallelism internally to avoid congestion, depending on the interconnect).
If you are worried about the time to queue the ncclSend/ncclRecv operations (the cost of the for loops), that only becomes problematic if sendsizes.size() or recvsizes.size() gets very large, e.g. millions. I am assuming sendsizes.size() <= # of GPUs, and the # of GPUs is unlikely to grow that large (it would be great if our code scaled to millions of GPUs everywhere else and this became the bottleneck, but I don't expect that in the foreseeable future).
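If one wanted to check empirically whether the grouped path benefits from concurrency, a hedged sketch of a timing harness follows: CUDA events around the call, with the device_multicast_sendrecv signature assumed from the partial diff above and the buffers/peer lists supplied by the caller.

```cpp
#include <cuda_runtime.h>
#include <raft/comms/comms.hpp>  // assumed header path; may differ across RAFT versions
#include <vector>

// Hypothetical harness: time a many-small-message multicast issued on one stream.
float time_multicast(raft::comms::comms_t const& comm, cudaStream_t stream,
                     float const* d_sendbuf, std::vector<size_t> const& sendsizes,
                     std::vector<size_t> const& sendoffsets, std::vector<int> const& dests,
                     float* d_recvbuf, std::vector<size_t> const& recvsizes,
                     std::vector<size_t> const& recvoffsets, std::vector<int> const& sources)
{
  cudaEvent_t start, stop;
  cudaEventCreate(&start);
  cudaEventCreate(&stop);

  cudaEventRecord(start, stream);
  comm.device_multicast_sendrecv(d_sendbuf, sendsizes, sendoffsets, dests,
                                 d_recvbuf, recvsizes, recvoffsets, sources, stream);
  cudaEventRecord(stop, stream);
  cudaEventSynchronize(stop);  // the grouped transfers are stream-ordered, so this waits for them

  float ms = 0.f;
  cudaEventElapsedTime(&ms, start, stop);
  cudaEventDestroy(start);
  cudaEventDestroy(stop);
  return ms;
}
```

Comparing this number against issuing the same messages one pair at a time (e.g. a loop of individual device_sendrecv calls) would show how much the grouping helps when sendsizes.size() is large but the individual messages are small.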
@gpucibot merge
…icast_sendrecv, gather, gatherv) (#1391)
- [x] Update cuGraph to use RAFT::comms_t's newly added device_sendrecv & device_multicast_sendrecv
- [x] Update cuGraph to use RAFT::comms_t's newly added gather & gatherv
- [x] Update RAFT git tag once rapidsai/raft#114 (currently merged in 0.18 but is not merged to 0.19) and rapidsai/raft#144 are merged to 0.19

Ready for review but cannot be merged till RAFT PR 114 and 144 are merged to RAFT branch-0.19.

Authors:
- Seunghwa Kang (@seunghwak)

Approvers:
- Alex Fender (@afender)

URL: #1391