Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ch4/ofi: Lazily register FI_MULTI_RECV buffers #6422

Merged
merged 4 commits into from
Mar 2, 2023

Conversation

raffenet
Copy link
Contributor

@raffenet raffenet commented Feb 27, 2023

Pull Request Description

Avoid consuming GPU resources during initialization by using regular malloc for FI_MULTI_RECV buffers. Do lazy registration when the first GPU communication is detected.

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your companies PR approval manager.

@raffenet
Copy link
Contributor Author

test:mpich/ch4/ofi

@raffenet
Copy link
Contributor Author

test:mpich/ch4/ofi
test:mpich/ch4/gpu/ofi

@raffenet
Copy link
Contributor Author

test:mpich/ch4/ofi
test:mpich/ch4/gpu/ofi

@raffenet
Copy link
Contributor Author

TODO: move CVAR for tmp buf registration into mpir_gpu.h utils. Will add patch and then this PR should be good.

@raffenet
Copy link
Contributor Author

test:mpich/ch4/ofi
test:mpich/ch4/gpu/ofi

@raffenet raffenet changed the title ch4/ofi: Use regular malloc of FI_MULTI_RECV buffers ch4/ofi: Lazily register FI_MULTI_RECV buffers Feb 28, 2023
@raffenet
Copy link
Contributor Author

test:mpich/ch4/ofi
test:mpich/ch4/gpu

@raffenet raffenet removed the WIP label Mar 1, 2023
@raffenet raffenet requested a review from hzhou March 1, 2023 16:57
@hzhou
Copy link
Contributor

hzhou commented Mar 1, 2023

@raffenet Can we do a gpu test with the CVAR disabling the host registration?

@raffenet
Copy link
Contributor Author

raffenet commented Mar 1, 2023

@raffenet Can we do a gpu test with the CVAR disabling the host registration?

Actually the CVAR disables registration by default. I can add a dummy commit to re-enable registration and re-run, if desired.

src/include/mpir_gpu.h Outdated Show resolved Hide resolved
@hzhou
Copy link
Contributor

hzhou commented Mar 1, 2023

@raffenet Can we do a gpu test with the CVAR disabling the host registration?

Actually the CVAR disables registration by default. I can add a dummy commit to re-enable registration and re-run, if desired.

I see. I was hoping some of the GPU testing failures can be addressed by not registering the buffer. But a bummer.

Can you confirm that we fixed the GPU memory issue? Since we do that lazy register, I think we can leave the CVAR default on, right?

@raffenet
Copy link
Contributor Author

raffenet commented Mar 2, 2023

@raffenet Can we do a gpu test with the CVAR disabling the host registration?

Actually the CVAR disables registration by default. I can add a dummy commit to re-enable registration and re-run, if desired.

I see. I was hoping some of the GPU testing failures can be addressed by not registering the buffer. But a bummer.

Can you confirm that we fixed the GPU memory issue? Since we do that lazy register, I think we can leave the CVAR default on, right?

Yes, I think we can default it to on. I'll double check a hello world program and confirm we don't take up any resources.

raffenet added 4 commits March 2, 2023 08:59
Avoid consuming GPU resources during initialization by using regular
malloc for FI_MULTI_RECV buffers. It may be possible to register the
buffers later if we detect they are being used to copy data to the GPU.
Rather than allocate a bunch of buffers, just use one big one with
offsets.
MPIR_gpu_register_host is used to register buffers on the host with the
GPU. Use a single CVAR to control buffer registration instead of
scattering in various parts of the code.
@raffenet
Copy link
Contributor Author

raffenet commented Mar 2, 2023

test:mpich/ch4/ofi
test:mpich/ch4/gpu

@raffenet
Copy link
Contributor Author

raffenet commented Mar 2, 2023

nvidia-smi output from a program held in wait loop after MPI_Init. Looks like we are no longer creating any resources.

[raffenet@pmrs-gpu-240-01]~% nvidia-smi
Thu Mar  2 09:09:08 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 4000     Off  | 00000000:5E:00.0 Off |                  N/A |
| 30%   32C    P8    10W / 125W |      3MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Quadro RTX 4000     Off  | 00000000:D8:00.0 Off |                  N/A |
| 30%   37C    P8     8W / 125W |      3MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

@raffenet
Copy link
Contributor Author

raffenet commented Mar 2, 2023

Test results are consistent with registration turned back on. Outstanding question is, do we want to have an additional switch in ch4/ofi to disable registration of FI_MULTI_RECV buffers in case, say, the provider is already handling it?

@hzhou
Copy link
Contributor

hzhou commented Mar 2, 2023

Test results are consistent with registration turned back on. Outstanding question is, do we want to have an additional switch in ch4/ofi to disable registration of FI_MULTI_RECV buffers in case, say, the provider is already handling it?

I see. You mean when provider will register the multi-recv buffer? I tend to believe that CUDA or any GPU runtime will cache the address and additional registration should be no-op. In any case, I would suggest that let's not worry about such case until they become a fact, and make decision then.

Copy link
Contributor

@hzhou hzhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@raffenet
Copy link
Contributor Author

raffenet commented Mar 2, 2023

Test results are consistent with registration turned back on. Outstanding question is, do we want to have an additional switch in ch4/ofi to disable registration of FI_MULTI_RECV buffers in case, say, the provider is already handling it?

I see. You mean when provider will register the multi-recv buffer? I tend to believe that CUDA or any GPU runtime will cache the address and additional registration should be no-op. In any case, I would suggest that let's not worry about such case until they become a fact, and make decision then.

Works for me.

@raffenet raffenet merged commit 86660ae into pmodels:main Mar 2, 2023
@raffenet raffenet deleted the ofi-am-bufs branch March 2, 2023 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants