
UCT/CUDA: Make cuda_ipc cache global #5815

Merged (1 commit) on Feb 12, 2021

Conversation

Akshay-Venkatesh
Contributor

What/Why

  • Move IPC cache from being a per-endpoint entity to per-iface entity
  • This allows multiple endpoints in the same process to reuse peer-mapped memory and avoid reopening already-opened memory handles
    • Needed for CUDA < 11.1 to avoid a fatal error when trying to re-open an already-opened memory handle

How?

  • By using a cache per remote process
  • By using a hashtable of remote caches identified by remote pid

@swx-jenkins3
Collaborator

Can one of the admins verify this patch?

@Akshay-Venkatesh
Contributor Author

cc @bureddy @yosefe Can you provide a review when possible?

cc @petro-rudenko @pentschev Can you try this PR and see if your issues with multiple UCP endpoints get resolved?

@Akshay-Venkatesh
Contributor Author

Errors don't look related.

khash_t(cuda_ipc_rem_cache)
hash;
ucs_recursive_spinlock_t
lock;
Contributor

Maybe the name can be on the same line?

Contributor Author

I did this to align members in the same column. If I move everything to one line, I would need to move all the other members, which is probably fine, but I wanted to keep the changes small. For now, I'll go ahead and make the change.

*cache = kh_val(&iface->remote_cache.hash, khiter);
} else {
/* if UCS_KH_PUT_BUCKET_EMPTY or UCS_KH_PUT_BUCKET_CLEAR */

Contributor

empty line

Contributor

pls remove empty line

status = uct_cuda_ipc_create_cache(cache, target_name);
if (status != UCS_OK) {
ucs_error("could not create cuda ipc cache: %s for pid %d",
ucs_status_string(status), pid);
Contributor

No unlock before return. Maybe you can have an unlock: label at the end and use goto.

ucs_error("unable to use cuda_ipc remote_cache hash");
ucs_recursive_spin_unlock(&iface->remote_cache.lock);
return UCS_ERR_NO_RESOURCE;
} else if (khret == UCS_KH_PUT_KEY_PRESENT) {
Contributor

Move this most frequent "if" condition above.


ucs_recursive_spin_lock(&iface->remote_cache.lock);

khiter = kh_put(cuda_ipc_rem_cache, &iface->remote_cache.hash, pid,
Contributor

is kh_put better than kh_get to check if the key is present or not?

Contributor Author

Using kh_put simplifies the flow here, as we don't have to explicitly handle different cases. kh_put can involve a resize operation, but that's a cost we have to incur and handle at some point even if we are just using kh_get. kh_put additionally checks whether an element was deleted, which is not particularly useful for us beyond knowing that the key isn't present. With these tradeoffs I don't know if there is a big difference between the two. @yosefe any suggestions?

Contributor

kh_get is better, but not by much; since we intend to create the key if it doesn't exist, we could use kh_put for simplicity.

@pentschev
Contributor

I can confirm this resolves the issue on the case we have with UCX-Py+Dask, tested with CUDA 10.2. Thanks a lot for the fix @Akshay-Venkatesh !

cc @quasiben for awareness.

@brminich
Contributor

ok to test

@brminich
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

if (status != UCS_OK) {
ucs_error("could not create cuda ipc cache: %s for pid %d",
ucs_status_string(status), pid);
goto err;
Contributor

The error handling seems to be incorrect: if you print the error and return here, the following call to uct_cuda_ipc_get_rem_cache with the same pid will find the uninitialized value in the hash.
You probably need to delete the element here, or use kh_get instead.

Contributor Author

Thanks @brminich

@@ -46,6 +46,7 @@ typedef struct uct_cuda_ipc_md_config {
*/
typedef struct uct_cuda_ipc_key {
CUipcMemHandle ph; /* Memory handle of GPU memory */
pid_t pid; /* PID as key to resolve peer_map hash*/
Contributor

please add space after hash

@petro-rudenko
Member

Still got [swx-dgx02:76439:0:78030] cuda_ipc_cache.c:263 Fatal: dest:76435: failed to open ipc mem handle. addr:0x7f41ca000000 len:30680756736 (Element already exists):

#0  0x00007ff26726eefd in pause () from /lib64/libpthread.so.0
#1  0x00007fe500c9b94d in ucs_debug_freeze () at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucs/debug/debug.c:820
#2  0x00007fe500c9bd34 in ucs_error_freeze (message=0x7fe501bf8040 "Fatal: dest:76435: failed to open ipc mem handle. addr:0x7f41ca000000 len:30680756736 (Element already exists)") at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucs/debug/debug.c:915
#3  0x00007fe500c9c34e in ucs_handle_error (message=0x7fe501bf8040 "Fatal: dest:76435: failed to open ipc mem handle. addr:0x7f41ca000000 len:30680756736 (Element already exists)") at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucs/debug/debug.c:1078
#4  0x00007fe500c98b5c in ucs_fatal_error_message (file=0x7fe50008aad0 "/hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_cache.c", line=263, function=0x7fe50008b0e0 <__FUNCTION__.15748> "uct_cuda_ipc_map_memhandle_inner", message_buf=0x7fe501bf8040 "Fatal: dest:76435: failed to open ipc mem handle. addr:0x7f41ca000000 len:30680756736 (Element already exists)") at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucs/debug/assert.c:37
#5  0x00007fe500c98cdc in ucs_fatal_error_format (file=0x7fe50008aad0 "/hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_cache.c", line=263, function=0x7fe50008b0e0 <__FUNCTION__.15748> "uct_cuda_ipc_map_memhandle_inner", format=0x7fe50008adf8 "Fatal: %s: failed to open ipc mem handle. addr:%p len:%lu (%s)") at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucs/debug/assert.c:53
#6  0x00007fe50008817c in uct_cuda_ipc_map_memhandle_inner (mapped_addr=0x7fe501bf86c0, key=0x7fd9cc6066e0, arg=0x7fd9cc742af0) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_cache.c:260
#7  uct_cuda_ipc_map_memhandle (arg=0x7fd9cc742af0, key=0x7fd9cc6066e0, mapped_addr=0x7fe501bf86c0) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_cache.c:193
#8  0x00007fe500085169 in uct_cuda_ipc_post_cuda_async_copy (direction=1, comp=0x7fd9cc2b1528, rkey=140573413500640, iov=0x7fe501bf8850, remote_addr=139926485898496, tl_ep=0x7fd9cd011ed0) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_ep.c:68
#9  uct_cuda_ipc_ep_get_zcopy_inner (comp=0x7fd9cc2b1528, rkey=140573413500640, remote_addr=139926485898496, iovcnt=1, iov=0x7fe501bf8850, tl_ep=0x7fd9cd011ed0) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_ep.c:135
#10 uct_cuda_ipc_ep_get_zcopy (tl_ep=0x7fd9cd011ed0, iov=0x7fe501bf8850, iovcnt=1, remote_addr=139926485898496, rkey=140573413500640, comp=0x7fd9cc2b1528) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_ep.c:127
#11 0x00007fe5005156ac in uct_ep_get_zcopy (comp=0x7fd9cc2b1528, rkey=140573413500640, remote_addr=139926485898496, iovcnt=1, iov=0x7fe501bf8850, ep=0x7fd9cd011ed0) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/api/uct.h:2598
#12 ucp_rndv_progress_rma_get_zcopy_inner (self=0x7fd9cc2b1540) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/rndv/rndv.c:537
#13 ucp_rndv_progress_rma_get_zcopy (self=0x7fd9cc2b1540) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/rndv/rndv.c:452
#14 0x00007fe5005169df in ucp_request_try_send (pending_flags=0, req=0x7fd9cc2b1440) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/core/ucp_request.inl:239
#15 ucp_request_send (pending_flags=0, req=0x7fd9cc2b1440) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/core/ucp_request.inl:264
#16 ucp_rndv_req_send_rma_get (rndv_req=0x7fd9cc2b1440, rreq=0x7fbc795d09c0, rndv_rts_hdr=0x7fe4d133b490, rkey_buf=0x7fe4d133b4ba) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/rndv/rndv.c:714
#17 0x00007fe50051b472 in ucp_rndv_receive_inner (rkey_buf=0x7fe4d133b4ba, rndv_rts_hdr=0x7fe4d133b490, rreq=0x7fbc795d09c0, worker=0x7fd9cc735410) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/rndv/rndv.c:1227
#18 ucp_rndv_receive (worker=0x7fd9cc735410, rreq=0x7fbc795d09c0, rndv_rts_hdr=0x7fe4d133b490, rkey_buf=0x7fe4d133b4ba) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/rndv/rndv.c:1163
#19 0x00007fe50055c05d in ucp_tag_rndv_matched (worker=0x7fd9cc735410, rreq=0x7fbc795d09c0, rts_hdr=0x7fe4d133b490) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/tag/tag_rndv.c:24
#20 0x00007fe50055c4b4 in ucp_tag_rndv_process_rts (worker=0x7fd9cc735410, common_rts_hdr=0x7fe4d133b490, length=205, tl_flags=1) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/tag/tag_rndv.c:42
#21 0x00007fe50051b88d in ucp_rndv_rts_handler_inner (tl_flags=1, length=205, data=0x7fe4d133b490, arg=0x7fd9cc735410) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/rndv/rndv.c:1293
#22 ucp_rndv_rts_handler (arg=0x7fd9cc735410, data=0x7fe4d133b490, length=205, tl_flags=1) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/rndv/rndv.c:1285
#23 0x00007fe5007f8062 in uct_iface_invoke_am (iface=0x7fd9cc740340, id=9 '\t', data=0x7fe4d133b490, length=205, flags=1) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/base/uct_iface.h:654
#24 0x00007fe5007f8afe in uct_mm_iface_invoke_am (flags=1, length=205, data=0x7fe4d133b490, am_id=9 '\t', iface=0x7fd9cc740340) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/sm/mm/base/mm_iface.h:245
#25 uct_mm_iface_process_recv (elem=0x7fe4e816ed40, iface=0x7fd9cc740340) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/sm/mm/base/mm_iface.c:250
#26 uct_mm_iface_poll_fifo (iface=0x7fd9cc740340) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/sm/mm/base/mm_iface.c:282
#27 uct_mm_iface_progress (tl_iface=0x7fd9cc740340) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/sm/mm/base/mm_iface.c:335
#28 0x00007fe5004df027 in ucs_callbackq_dispatch (cbq=0x7fd9cceb1dd0) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucs/datastruct/callbackq.h:211
#29 0x00007fe5004e8f18 in uct_worker_progress (worker=0x7fd9cceb1dd0) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/uct/api/uct.h:2406
#30 ucp_worker_progress (worker=0x7fd9cc735410) at /hpc/mtr_scrap/users/peterr/rapids/ucx-cuda-ipc/contrib/../src/ucp/core/ucp_worker.c:2308

@Akshay-Venkatesh
Contributor Author

Akshay-Venkatesh commented Oct 26, 2020 via email

@petro-rudenko
Member

petro-rudenko commented Oct 26, 2020

The app is SparkUCX, but you can reproduce it with the multithreaded perftest:
CUDA_VISIBLE_DEVICES=0 UCX_NET_DEVICES=mlx5_0:1 ./ucx-cuda-ipc/build/bin/ucx_perftest -T 10 -m cuda -s 2000000 -t tag_bw swx-dgx01

@Akshay-Venkatesh
Contributor Author

@petro-rudenko Can you try again?

@pentschev Can you confirm that your case still works with the new changes?

@petro-rudenko
Member

@Akshay-Venkatesh thanks, working now

@Akshay-Venkatesh changed the title from "UCT/CUDA: move cuda_ipc cache from ep to iface" to "UCT/CUDA: Make cuda_ipc cache global" on Oct 27, 2020
@pentschev
Contributor

@Akshay-Venkatesh thanks for the ping, I confirm it still works for us on 9dbfb20 .

@hoopoepg
Contributor

hoopoepg commented Nov 5, 2020

This PR resolves the issue:

[1604562797.006386] [prm-dgx-35:26200:0] cuda_ipc_cache.c:57   UCX  ERROR cuIpcCloseMemHandle((CUdeviceptr)region->mapped_addr)() failed: context is destroyed
[1604562797.007714] [prm-dgx-35:26199:0] cuda_ipc_cache.c:57   UCX  ERROR cuIpcCloseMemHandle((CUdeviceptr)region->mapped_addr)() failed: context is destroyed

src/uct/cuda/cuda_ipc/cuda_ipc_cache.c: 2 resolved review threads (outdated)
Comment on lines 84 to 85
status = UCS_ERR_ALREADY_EXISTS;
goto out;
Contributor

Need to flatten these ifs:

if (cuerr == SUCCESS || cuerr == ALREADY_EXISTS) {
    status = UCS_OK;
} else {
    ucs_error();
    status = INVALID_PARAM;
}

Contributor

unhandled

Contributor Author

@yosefe status is UCS_OK by default in this function; we cannot return UCS_OK when cuerr == ALREADY_EXISTS, because functions which call uct_cuda_ipc_open_memhandle handle UCS_ERR_ALREADY_EXISTS differently from UCS_INVALID_PARAM (fatal error), which in turn is different from handling UCS_OK. So I'm not sure how we can flatten it to:

if (cuerr== SUCCESS || cuerr == ALREADY_EXISTS) {
    status == UCS_OK;
} else {
    status == UCS_INVALID_PARAM;
}

Contributor

can we flatten in a different way?

src/uct/cuda/cuda_ipc/cuda_ipc_cache.c: 3 resolved review threads (outdated)
src/uct/cuda/cuda_ipc/cuda_ipc_cache.h: 3 resolved review threads (outdated)
ucs_pgt_region_t *pgt_region;
uct_cuda_ipc_cache_region_t *region;

status = uct_cuda_ipc_get_rem_cache(pid, &cache);
Contributor

why do we remove remote cache entry?

Contributor Author

Sorry. rem = remote here. I've changed all instances to remove ambiguity.

} else if (khret == UCS_KH_PUT_KEY_PRESENT) {
*cache = kh_val(&uct_cuda_ipc_remote_cache.hash, khiter);
} else {
/* if (khret == UCS_KH_PUT_FAILED) */
Contributor

Should we remove or modify this comment, or make it an assert, instead of the "if" condition?

Contributor Author

Asserts get compiled out if it's not a debug build, right? If KH_PUT_FAILED occurs, one option is to raise a ucs_fatal error, but maybe we don't want to handle the failure here but at a higher layer. If we don't fail here, then ep_put/get_zcopy, which requests the memory mapping, would fail, and maybe UCP is in a better position to handle the failure (say, by switching to another protocol).

What happens today if ep_put/get_zcopy fails during rndv? Do we still try the am_bcopy protocol as a fallback?

Contributor

I meant to just change the comment here to plain wording instead of a code-like comment.

Contributor Author

Missed that. I'll remove the comment now.

@Akshay-Venkatesh
Contributor Author

@yosefe Can you lmk if your comments have been addressed when you get a chance?

src/uct/cuda/cuda_ipc/cuda_ipc_cache.c: 2 resolved review threads (one outdated)
ucs_recursive_spinlock_init(&uct_cuda_ipc_map_lock, 0);
ucs_recursive_spinlock_init(&remote_cache.lock, 0);
kh_init_inplace(cuda_ipc_rem_cache, &remote_cache.hash);
ucs_recursive_spinlock_init(&uct_cuda_ipc_remote_cache.lock, 0);
Contributor

what is the difference between uct_cuda_ipc_map_lock and uct_cuda_ipc_remote_cache.lock ?

Contributor Author

uct_cuda_ipc_map_lock provides mutual exclusion for threads trying to open an IPC memory handle, whereas uct_cuda_ipc_remote_cache.lock ensures that no two threads access the hashtable of cuda_ipc cache structures (created per remote process per local cuda device) at the same time.

Contributor

why do we need uct_cuda_ipc_map_lock ? isn't cuda API already thread safe?

Contributor

@Akshay-Venkatesh we don't need uct_cuda_ipc_map_lock, right? The driver seems to already take a cuda context lock.

Contributor Author

@bureddy @yosefe It's safe for multiple threads to call cuIpcOpenMemHandle. We still want to avoid the case where two threads end up checking for reachability and cause a "memory handle already opened" error. For this we need mutual exclusion on the segment of code that 1. opens the remote memory handle, 2. updates the peer_accessibility table, and 3. closes the memory handle, so we need to use a lock around that segment. But the same lock does not seem to be needed here, because by that time the thread is guaranteed to have checked for reachability. I've removed the lock use in uct_cuda_ipc_open_memhandle for this reason.

src/uct/cuda/cuda_ipc/cuda_ipc_cache.c: resolved review thread (outdated)
src/uct/cuda/cuda_ipc/cuda_ipc_cache.h: resolved review thread (outdated)
@Akshay-Venkatesh
Contributor Author

@yosefe Have I addressed all your comments?


ucs_recursive_spin_lock(&uct_cuda_ipc_remote_cache.lock);

key.pid = pid;

Contributor

remove space line



err:
ucs_recursive_spin_unlock(&uct_cuda_ipc_remote_cache.lock);

Contributor

remove space line

Contributor

pls remove space line

src/uct/cuda/cuda_ipc/cuda_ipc_cache.h: resolved review thread (outdated)
@@ -37,15 +67,21 @@ struct uct_cuda_ipc_cache {
};


typedef struct uct_cuda_ipc_remote_cache {
khash_t(cuda_ipc_rem_cache) hash;
ucs_recursive_spinlock_t lock;
Contributor

Do we need a lock both in "struct uct_cuda_ipc_cache" and also here?
Maybe we can remove the lock from uct_cuda_ipc_cache?

Contributor Author

The attempt is to ensure that

  1. no two threads try to update a single rcache instance, using uct_cuda_ipc_cache
  2. no two threads try to update the hashtable storing the different rcache instances (one per GPU per remote process), using remote_cache.lock.

These two operations are independent, so grabbing the locks for each should be independent too. I.e., if one thread grabs rcache x for peer gpu x, another thread should be allowed to grab rcache y for peer gpu y, but two threads shouldn't be allowed to access rcache x simultaneously (at least for now we don't distinguish between read-only and read/write access to an rcache).

Contributor Author

@yosefe when possible, can we revisit this? I've addressed the concern around ipc_map_lock here: #5815 (comment)

@@ -153,7 +158,13 @@ static ucs_status_t uct_cuda_ipc_is_peer_accessible(uct_cuda_ipc_component_t *md
}
}

pthread_mutex_unlock(&uct_cuda_ipc_map_lock);

Contributor

remove space line

@Akshay-Venkatesh
Contributor Author

@yosefe For this #5815 (comment): the cuda calls are, but I'm using the ipc_map_lock to avoid the case where one thread that does a reachability check accidentally closes a memory handle that was opened by another thread for the purposes of a transfer.

@yosefe
Contributor

yosefe commented Dec 15, 2020

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

@yosefe left a comment

src/uct/cuda/cuda_ipc/cuda_ipc_cache.c: resolved review thread


@@ -127,18 +128,22 @@ static ucs_status_t uct_cuda_ipc_is_peer_accessible(uct_cuda_ipc_component_t *md
char* accessible;
CUdeviceptr d_mapped;

pthread_mutex_lock(&uct_cuda_ipc_map_lock);
Contributor

looks like this lock is protecting mdc->md->peer_accessible_cache

  1. it does not seem related to this PR
  2. it should be on the MD object

Contributor Author

That's correct. It's not related to this PR but it is still needed. Hope it's ok to have this as part of this PR.

I've moved the lock into the MD object.

Contributor

Maybe make it a spinlock and reduce the lock scope to only updating the array (move the cuda API calls outside the lock section)?

Contributor Author

Switched to a spinlock and reduced the scope. If the array element is in the init state (0xFF), we want a thread to query the CUDA API and update it; this write needs to be inside the lock section. Also, we want at most one thread to call cuIpcOpenMemHandle and update the array element from init to reachable or unreachable. We can't have multiple threads calling cuIpcOpenMemHandle because that can lead to an "already opened" error. With these two constraints, I couldn't think of a way to move the cuda API out of the lock section.

Contributor

since we probably don't care if more than one thread would update "accessible" at the same time, we don't even need a lock.
since a single-value memory read/write is atomic (i.e. it will return one of the values previously written, not garbage), it can be done like this:

ipc_md {
    ucs_ternary_value_t accessible[];
};
...
uct_cuda_ipc_is_peer_accessible()
{
    ucs_ternary_value_t is_accessible = *accessible;
    if (is_accessible == UCS_TRY) {
        cuIpc(...);
        is_accessible = (result == ...) ? UCS_YES : UCS_NO;
        *accessible = is_accessible;
    }
    return is_accessible == UCS_YES;
}

Contributor Author

since we probably don't care if more than one thread would update "accessible" at the same time, we don't even need a lock.

Agreed. We don't need a lock on the accessible array.

We still need to ensure that no more than one thread calls cuIpcOpenMemHandle. The following code doesn't prevent that:

uct_cuda_ipc_is_peer_accessible()
{
    ...
    ucs_ternary_value_t is_accessible = *accessible;
    if (is_accessible == UCS_TRY) {
        ...
    }
}

Two threads could end up calling OpenMemHandle on the same memory handle if they both see is_accessible as UCS_TRY, right? We may need all of the following executed atomically to disallow that:

if (is_accessible == UCS_TRY) {
    cuIpc(...);
    is_accessible = (result == ...) ? UCS_YES : UCS_NO;
    *accessible = is_accessible;
}

Don't we need a lock for this? Lmk if I've misunderstood something here.

Contributor

Why is 2 threads opening the same handle a problem?
It seems we treat the CUDA_ERROR_ALREADY_MAPPED status as success.

Contributor Author

@yosefe you're right, I ignored that aspect. This reminded me of another motivation for adding the lock, which was to avoid each thread opening the memory handle multiple times, since opening can take on the order of ms. Even though opening the memory handle is expensive, it's probably still ok not to have a lock, because the accessible element would be set to UCS_YES soon enough that not many threads will end up opening the handle anyway. I'll make changes based on your code suggestion above.


ucs_error("unable to use cuda_ipc remote_cache hash");
status = UCS_ERR_NO_RESOURCE;
}
err:
Contributor

err -> err_unlock

src/uct/cuda/cuda_ipc/cuda_ipc_md.c: resolved review thread (outdated)
Comment on lines 339 to 341
uct_cuda_ipc_memset(md->peer_accessible_cache, UCS_TRY,
num_devices * md->uuid_map_capacity,
sizeof(ucs_ternary_value_t));
Contributor

Pass a ucs_ternary_value_t* and a length to this function, and rename it to uct_cuda_ipc_accessible_cache_init.
BTW, maybe this cache can be initialized to NULL/capacity=0, and let the uct_cuda_ipc_get_unique_index_for_uuid function allocate it on first use?

Contributor Author

@yosefe I've addressed the above two suggestions. I also ended up calling the cuda_ipc cache insertion as part of the reachability check. This saves one OpenMemHandle call and also prevents a multi-threaded issue that I've documented. Lmk if the changes look ok.

Contributor Author

Also, I'm not sure why I'm seeing these errors on CI but not locally.

In file included from /scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c:11:0:
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.h:23:5: error: unknown type name 'ucs_ternary_value_t'
     ucs_ternary_value_t *peer_accessible_cache;
     ^
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c: In function 'uct_cuda_ipc_accessible_cache_init':
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c:77:28: error: 'ucs_ternary_value_t' undeclared (first use in this function)
     size_t offset = sizeof(ucs_ternary_value_t);
                            ^
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c:77:28: note: each undeclared identifier is reported only once for each function it appears in
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c:78:26: error: 'p' undeclared (first use in this function)
     ucs_ternary_value_t *p;
                          ^
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c:79:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
     size_t i;
     ^
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c: In function 'uct_cuda_ipc_get_unique_index_for_uuid':
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c:110:40: error: 'ucs_ternary_value_t' undeclared (first use in this function)
                                 sizeof(ucs_ternary_value_t);
                                        ^
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c: In function 'uct_cuda_ipc_is_peer_accessible':
/scrap/azure/agent-04/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.c:146:5: error: unknown type name 'ucs_ternary_value_t'
     ucs_ternary_value_t *accessible;
     ^
cc1: all warnings being treated as errors

Contributor Author

@yosefe Remaining errors seem unrelated. Can you review the new changes when possible?

@yosefe
Contributor

yosefe commented Feb 2, 2021

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@Akshay-Venkatesh
Contributor Author

@yosefe I see that failing tests are complaining about ucs_ternary_value_t again. Any suggestions on this? Also, do the latest commits look good and address #5815 (comment)?

@yosefe
Contributor

yosefe commented Feb 8, 2021

/scrap/azure/agent-05/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.h:22:5: error: unknown type name 'ucs_ternary_value_t'

Looks like some include is missing.

Comment on lines 126 to 127
uct_cuda_ipc_accessible_cache_init(md->peer_accessible_cache + original_cache_size,
new_capacity - original_capacity);
Contributor

I think we don't need this as a separate function; we can just expand the code here.

@Akshay-Venkatesh
Contributor Author

/scrap/azure/agent-05/AZP_WORKSPACE/2/s/contrib/../src/uct/cuda/cuda_ipc/cuda_ipc_md.h:22:5: error: unknown type name 'ucs_ternary_value_t'
looks like some include is missing

@yosefe I had config/types.h included earlier but it didn't matter so I removed the header. Now I added it back but I still see the same error. I don't see any local build failures. Maybe there's something else missing.

@yosefe
Contributor

yosefe commented Feb 8, 2021

@yosefe I had config/types.h included earlier but it didn't matter so I removed the header. Now I added it back but I still see the same error. I don't see any local build failures. Maybe there's something else missing.

pls use ucs_ternary_auto_value_t

Comment on lines 119 to 113
buffer = md->peer_accessible_cache + original_cache_size;
num_elems = new_capacity - original_capacity;

for (elem = 0; elem < num_elems; elem++) {
p = UCS_PTR_BYTE_OFFSET(buffer, elem * offset);
*p = UCS_TRY;
}
Contributor

for (i = original_cache_size; i < new_capacity; ++i) {
    md->peer_accessible_cache[i] = UCS_TRY;
}

Contributor

@yosefe left a comment

pls squash

8 participants