Commit
[Bugfix] Add stream synchronization before the scatter operation (#73)
This addresses the issue from PR rapidsai/wholegraph#229. The change applies only to the last scatter operation before the Python interface, not to all internal `scatter_func` calls.

Because the output of the scatter operation can reside in host memory (e.g., when emb_device = 'cpu'), the stream must be synchronized internally. This ensures users do not need to explicitly synchronize the compute stream before accessing the host memory. The gather operation differs: its output is always in device memory, so host-side synchronization is unnecessary there.

Authors:
- Chang Liu (https://github.com/chang-l)
- Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
- https://github.com/linhu-nv

URL: #73
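
The reasoning behind the internal synchronization can be illustrated with a toy model. The sketch below is a pure-Python analogy, not the WholeGraph implementation: `ToyStream` and `scatter_to_host` are hypothetical names, and a worker thread stands in for an asynchronous compute stream. It assumes only that enqueued work returns immediately, and that the host buffer is safe to read once the stream has been synchronized.

```python
import queue
import threading


class ToyStream:
    """Hypothetical stand-in for a compute stream: runs enqueued work in order on a worker thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                task()
            finally:
                self._tasks.task_done()

    def enqueue(self, task):
        # Returns immediately, like an asynchronous kernel launch.
        self._tasks.put(task)

    def synchronize(self):
        # Blocks until every enqueued task has finished.
        self._tasks.join()


def scatter_to_host(values, indices, host_buffer, stream):
    """Hypothetical final scatter: writes values[i] into host_buffer[indices[i]]."""

    def work():
        for idx, val in zip(indices, values):
            host_buffer[idx] = val

    stream.enqueue(work)
    # Synchronize internally so the caller can read host_buffer right away,
    # without having to synchronize the stream explicitly.
    stream.synchronize()


stream = ToyStream()
host_buffer = [0.0] * 6
scatter_to_host([1.5, 2.5, 3.5], [4, 0, 2], host_buffer, stream)
print(host_buffer)  # [2.5, 0.0, 3.5, 0.0, 1.5, 0.0]
```

Without the `stream.synchronize()` call inside `scatter_to_host`, the caller could read `host_buffer` before the worker had written to it. That is the hazard the commit removes for host-resident output.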