
Added CUDA-aware MPI support detection for MVAPICH, MPICH and ParaStationMPI #522

Merged
merged 3 commits from features/438-cuda-aware-mpi into master on Apr 3, 2020

Conversation

Markus-Goetz
Member

Description

Added CUDA-aware MPI support detection for the other three supporting MPI platforms: MVAPICH, MPICH and ParaStationMPI.

NOTE: even if your MPI installation is compiled with CUDA-aware MPI support, it may still not use it by default. For most of these stacks a specific environment variable needs to be set, which our code checks for.
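For illustration, a minimal sketch of what such an environment-variable check can look like. The variable names below (MV2_USE_CUDA for MVAPICH, PSP_CUDA for ParaStationMPI) are assumptions taken from the respective MPI documentation, and the MPICH switch is omitted; the exact variables this PR checks are in the diff of heat/core/communication.py.

import os

# start from "not CUDA-aware" and flip the flag if a stack-specific switch is set
CUDA_AWARE_MPI = False
# MVAPICH: CUDA support is typically enabled at runtime with MV2_USE_CUDA=1
CUDA_AWARE_MPI = CUDA_AWARE_MPI or os.environ.get("MV2_USE_CUDA") == "1"
# ParaStationMPI: CUDA support is typically enabled with PSP_CUDA=1
CUDA_AWARE_MPI = CUDA_AWARE_MPI or os.environ.get("PSP_CUDA") == "1"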

Issue/s resolved: #438

Type of change

Remove irrelevant options:

  • New feature (non-breaking change which adds functionality)

Due Diligence

  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

no

@codecov

codecov bot commented Apr 2, 2020

Codecov Report

Merging #522 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #522   +/-   ##
=======================================
  Coverage   96.59%   96.60%           
=======================================
  Files          68       68           
  Lines       14089    14092    +3     
=======================================
+ Hits        13609    13613    +4     
+ Misses        480      479    -1     
Impacted Files               Coverage Δ
heat/core/types.py           94.73% <ø> (ø)
heat/core/communication.py   89.29% <100.00%> (+0.25%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@krajsek krajsek left a comment


Have you tested the different options?

@Markus-Goetz
Member Author

Markus-Goetz commented Apr 2, 2020

I tested MVAPICH, which seems to work fine on our cluster. ParaStation has been tested on HDFML; it results in a permission denied error that Alex Strube is currently working on, but it definitely changes the behaviour of the MPI stack. I was unable to test MPICH as there is none available. However, it is the canonical way according to their docs.

# check whether OpenMPI supports CUDA-aware MPI
if "openmpi" in os.environ.get("MPI_SUFFIX", "").lower():
    buffer = subprocess.check_output(["ompi_info", "--parsable", "--all"])
    CUDA_AWARE_MPI = b"mpi_built_with_cuda_support:value:true" in buffer
else:
    CUDA_AWARE_MPI = False
# MVAPICH
Member


Is there any way to automatically get this? Or even better, to automatically turn on CUDA-aware MPI?

Member Author


Well, the modules could set the respective environment variable on load. Programmatically, from Heat there is no way (that I know of) of telling whether the binaries are actually compiled with CUDA support. There is the hacky possibility of checking the ldd output and looking at whether the MPI libs attempt to dynamically load the CUDA shared objects.
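For illustration, a rough sketch of that ldd-based heuristic; the library path is a hypothetical example and would differ per installation:

import subprocess

def mpi_links_against_cuda(libmpi_path="/usr/lib/libmpi.so"):
    # Inspect the dynamic dependencies of the MPI shared library and look for
    # CUDA shared objects; libmpi_path is an assumed example location.
    try:
        output = subprocess.check_output(["ldd", libmpi_path], text=True)
    except (OSError, subprocess.CalledProcessError):
        return False
    return "libcuda" in output or "libcudart" in output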

@coquelin77 coquelin77 merged commit 2cec7a2 into master Apr 3, 2020
@coquelin77 coquelin77 deleted the features/438-cuda-aware-mpi branch April 3, 2020 12:03
Development

Successfully merging this pull request may close these issues.

Add CUDA-aware MPI support for other MPI stacks
3 participants