enable gpu-aware MPI by default #121

Merged: 14 commits into development from use-cuda-ipc-default on Oct 1, 2022

Conversation

@BenWibking (Collaborator) commented Sep 28, 2022

It turns out GPU-aware MPI works great, as long as managed memory is disabled and all CUDA devices are visible from each MPI process. With these conditions met, OpenMPI uses its cuda_ipc transport for on-node GPU-to-GPU communication.

On-node scaling is now ~99% efficient on Delta and Gadi.

Note: this will need to be re-tested for Setonix and Frontier.

For AMD devices, see also: https://docs.amd.com/bundle/AMD-Instinct-MI250-High-Performance-Computing-and-Tuning-Guide-v5.3/page/GPU-Enabled_MPI.html
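
For reference, one way to wire in these defaults is to set the corresponding AMReX runtime parameters before initialization completes. The sketch below is illustrative rather than the exact change in this PR; it assumes the `amrex.the_arena_is_managed` and `amrex.use_gpu_aware_mpi` ParmParse parameters:

```cpp
// Minimal sketch (not necessarily the exact change in this PR):
// disable managed memory and enable GPU-aware MPI via AMReX runtime parameters.
#include <mpi.h>
#include <AMReX.H>
#include <AMReX_ParmParse.H>

int main(int argc, char *argv[])
{
    amrex::Initialize(argc, argv, true, MPI_COMM_WORLD, []() {
        amrex::ParmParse pp("amrex");
        // only set defaults if the user has not overridden them in the inputs file
        if (!pp.contains("the_arena_is_managed")) {
            pp.add("the_arena_is_managed", 0); // do not use managed (unified) memory
        }
        if (!pp.contains("use_gpu_aware_mpi")) {
            pp.add("use_gpu_aware_mpi", 1);    // pass device buffers directly to MPI
        }
    });

    // ... application code ...

    amrex::Finalize();
    return 0;
}
```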

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author) commented Sep 28, 2022

For the HydroBlast3D problem (with max_grid_size set to $256^3$), we get:

* 1 A100: 217.6 Mupdates/s ($256^3$ problem)
* 4 A100s / 1 node: 864.4 Mupdates/s ($512^3$ problem)

for a scaling efficiency of >99%.
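
As a quick sanity check of that figure, comparing aggregate throughput per GPU against the single-GPU run:

$$\text{efficiency} = \frac{864.4 / 4}{217.6} = \frac{216.1}{217.6} \approx 0.993,$$

i.e. about 99.3% on-node parallel efficiency going from 1 to 4 GPUs.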

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author) commented Sep 29, 2022

GPU tests are failing because fextract does not work if MultiFabs are not accessible from the host.
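
For context: with managed memory disabled, MultiFab data lives only in device memory, so any host-side loop over it (e.g. via LoopConcurrentOnCpu) dereferences pointers the host cannot read. A minimal sketch of the device-side pattern (illustrative only, not the actual fextract code), assuming a MultiFab `mf`:

```cpp
// Illustrative sketch: operate on MultiFab data inside amrex::ParallelFor,
// which runs on the device, instead of touching the arrays from the host.
#include <AMReX_Gpu.H>
#include <AMReX_MultiFab.H>

void scale_component_zero(amrex::MultiFab &mf, amrex::Real factor)
{
    for (amrex::MFIter mfi(mf); mfi.isValid(); ++mfi) {
        const amrex::Box &bx = mfi.validbox();
        auto const &arr = mf.array(mfi);
        amrex::ParallelFor(bx, [=] AMREX_GPU_DEVICE(int i, int j, int k) noexcept {
            arr(i, j, k, 0) *= factor; // executed on the GPU, where the data lives
        });
    }
}
```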

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking (Collaborator, Author): /azp run

@azure-pipelines: Azure Pipelines successfully started running 4 pipeline(s).

@BenWibking merged commit 851cbdd into development on Oct 1, 2022
@BenWibking deleted the use-cuda-ipc-default branch on October 1, 2022 at 16:37
BenWibking added a commit that referenced this pull request Oct 3, 2022
* enable gpu-aware MPI by default

* updated README for GPU-aware MPI

* updated batch scripts

* change fextract to use device-allocated data

* copy reference solution to device when needed (see the sketch after this list)

* interpolate.h -> interpolate.hpp

* initialize on device

* avoid LoopConcurrentOnCpu
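
The "copy reference solution to device when needed" item corresponds to a host-to-device copy before the comparison kernel runs. A hedged sketch (illustrative names, assuming `amrex::Gpu::DeviceVector` and `amrex::Gpu::copyAsync`; not the PR's exact code):

```cpp
// Illustrative sketch: stage a host-side reference solution in device memory
// so GPU kernels can read it (not the PR's exact code).
#include <AMReX_Gpu.H>
#include <AMReX_REAL.H>
#include <AMReX_Vector.H>

void copy_reference_to_device(amrex::Vector<amrex::Real> const &ref_host,
                              amrex::Gpu::DeviceVector<amrex::Real> &ref_dev)
{
    ref_dev.resize(ref_host.size());
    amrex::Gpu::copyAsync(amrex::Gpu::hostToDevice,
                          ref_host.begin(), ref_host.end(), ref_dev.begin());
    amrex::Gpu::streamSynchronize(); // ensure the copy completes before kernels read ref_dev
}
```
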
BenWibking added a commit that referenced this pull request Oct 3, 2022