Optimize GPU boundary exchanges via NVSHMEM #1

Open
bentsherman opened this issue Sep 14, 2021 · 1 comment

Comments

@bentsherman

NVSHMEM is an implementation of OpenSHMEM for Nvidia GPUs:

https://developer.nvidia.com/nvshmem
https://docs.nvidia.com/hpc-sdk/nvshmem/api/docs/index.html

It is essentially an alternative to MPI that lets the GPUs initiate communication directly over the interconnect, instead of staging every transfer through the CPU as with traditional MPI. The API is very similar to MPI but with slightly different terminology (init, finalize, PEs, teams, put/get, collective ops). The memory model is also different: communication is one-sided, going through a "symmetric heap" of remotely accessible buffers rather than matched send/receive pairs.
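
For reference, here is a minimal sketch of what an NVSHMEM program looks like, loosely modeled on the ring-shift example in the NVSHMEM docs (not code from this repo): each PE allocates a symmetric buffer, and a device kernel puts its PE id into the right-hand neighbor's buffer.

```cuda
/* Minimal NVSHMEM sketch. Typically compiled with nvcc -rdc=true and linked
 * against the NVSHMEM library, then launched with nvshmrun or an MPI launcher. */
#include <stdio.h>
#include <cuda_runtime.h>
#include <nvshmem.h>
#include <nvshmemx.h>

/* Device-initiated one-sided put: each PE writes its id into the
 * symmetric buffer of the next PE in the ring. */
__global__ void ring_shift(int *destination) {
    int mype = nvshmem_my_pe();
    int npes = nvshmem_n_pes();
    int peer = (mype + 1) % npes;
    nvshmem_int_p(destination, mype, peer);
}

int main(void) {
    nvshmem_init();                                         /* like MPI_Init     */
    int mype_node = nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE);
    cudaSetDevice(mype_node);                               /* one GPU per PE    */

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    /* Symmetric allocation: same size on every PE, remotely accessible. */
    int *destination = (int *)nvshmem_malloc(sizeof(int));

    ring_shift<<<1, 1, 0, stream>>>(destination);
    nvshmemx_barrier_all_on_stream(stream);                 /* complete all puts */

    int msg;
    cudaMemcpyAsync(&msg, destination, sizeof(int), cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);
    printf("PE %d received %d\n", nvshmem_my_pe(), msg);

    nvshmem_free(destination);
    nvshmem_finalize();                                     /* like MPI_Finalize */
    return 0;
}
```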

This would be a great way to optimize the boundary exchanges, which currently represent the majority of communication overhead in the multi-GPU scenario. A big downside is that you probably can't have MPI and NVSHMEM in the same program. You might be able to have a wrapper library that defers to either MPI or NVSHMEM based on whether or not GPUs are enabled (see the sketch below), but more likely you will need separate binaries for CPU and GPU.
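
The wrapper idea could look something like this sketch. It is entirely hypothetical (`exchange_halo`, `USE_NVSHMEM`, and the buffer names are made up, not from this codebase): the same halo-exchange entry point compiles down to either a one-sided NVSHMEM put or a paired MPI send/receive depending on a build flag.

```cuda
#ifdef USE_NVSHMEM
#include <nvshmem.h>
#else
#include <mpi.h>
#endif

/* Send our boundary strip to a neighbor and receive theirs.
 * halo_len: number of floats in one boundary strip
 * send_buf: local boundary values (device memory under NVSHMEM)
 * recv_buf: where the neighbor's boundary lands (on the symmetric heap
 *           under NVSHMEM, so the peer's copy can be written remotely) */
void exchange_halo(float *send_buf, float *recv_buf, size_t halo_len, int peer) {
#ifdef USE_NVSHMEM
    /* One-sided: write directly into the peer's symmetric recv_buf. */
    nvshmem_float_put(recv_buf, send_buf, halo_len, peer);
    nvshmem_barrier_all();              /* or a finer-grained quiet/sync */
#else
    /* Two-sided: classic paired send/recv through the host MPI stack. */
    MPI_Sendrecv(send_buf, (int)halo_len, MPI_FLOAT, peer, 0,
                 recv_buf, (int)halo_len, MPI_FLOAT, peer, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
#endif
}
```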

@bentsherman
Author

This repo has code examples for all the different ways to implement a multi-GPU Jacobi solver:

https://github.com/NVIDIA/multi-gpu-programming-models
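
The NVSHMEM variants there illustrate the pattern that matters for this issue: boundary rows are pushed to the neighboring PE from inside the kernel with device-initiated puts, so the halo exchange never leaves the GPU. A rough sketch of that pattern (names and indexing are illustrative, not copied from the repo):

```cuda
#include <nvshmem.h>
#include <nvshmemx.h>

/* a_new must live on the symmetric heap (nvshmem_malloc) so the neighbor's
 * copy is remotely writable. Each thread pushes one boundary element
 * directly into the corresponding halo row on the neighboring PE. */
__global__ void push_boundaries(float *a_new, int nx,
                                int iy_start, int iy_end,
                                int top_pe, int top_halo_iy,
                                int bottom_pe, int bottom_halo_iy) {
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    if (ix >= nx) return;

    /* First interior row -> top neighbor's bottom halo row. */
    nvshmem_float_p(a_new + top_halo_iy * nx + ix,
                    a_new[iy_start * nx + ix], top_pe);

    /* Last interior row -> bottom neighbor's top halo row. */
    nvshmem_float_p(a_new + bottom_halo_iy * nx + ix,
                    a_new[(iy_end - 1) * nx + ix], bottom_pe);
}
```

A barrier between iterations (e.g. nvshmem_barrier_all or nvshmemx_barrier_all_on_stream) is still needed so a PE does not read halo values before its neighbors have finished writing them.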
