Co-locale support for AMD GPUs #25846

Merged 2 commits on Sep 3, 2024

jhh67 (Contributor) commented on Aug 29, 2024

Add support for partitioning GPUs among co-locales.

Resolves https://github.com/Cray/chapel-private/issues/6640.

Add support for partitioning AMD GPUs among co-locales.

Signed-off-by: John H. Hartman <[email protected]>
jhh67 requested a review from jabraham17 on August 29, 2024 at 20:30
jhh67 marked this pull request as ready for review on August 29, 2024 at 20:30
Review comment on runtime/src/gpu/amd/gpu-amd.c (outdated, resolved)
Signed-off-by: John H. Hartman <[email protected]>
jhh67 merged commit d6c8dc3 into chapel-lang:main on Sep 3, 2024
7 checks passed
jhh67 deleted the amdgpu branch on September 3, 2024 at 17:00
jhh67 restored the amdgpu branch on September 3, 2024 at 20:21
jhh67 added a commit that referenced this pull request on Oct 9, 2024:
The PRs that added GPU support to co-locales (#25734 and #25846) broke
oversubscription: when oversubscribed, no locale had any GPUs. This PR
fixes that problem and cleans up resource allocation with co-locales in
general. Oversubscription is handled more cleanly, as is the "remainder"
node that occurs when the number of locales is not evenly divisible by
the number of nodes.

[Reviewed by @jabraham17, thank you.]
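
To make the "remainder" node and the idea of partitioning GPUs among co-locales concrete, here is a minimal Python sketch. It is not the Chapel runtime's code (that lives in C, for example in runtime/src/gpu/amd/gpu-amd.c) and the runtime's actual policy may differ. The function names `colocales_on_node` and `partition_gpus`, the ceiling-based placement of locales on nodes, and the contiguous split of GPU IDs are all assumptions made for illustration.

```python
def colocales_on_node(num_locales, num_nodes, node):
    """Return how many locales land on `node` (0-based).

    Assumption for illustration: nodes are filled with
    ceil(num_locales / num_nodes) locales each, so the last occupied
    node (the "remainder" node) may hold fewer.
    """
    if num_locales < 0 or num_nodes <= 0 or not 0 <= node < num_nodes:
        raise ValueError("invalid locale/node configuration")
    per_node = -(-num_locales // num_nodes)  # ceiling division
    first = node * per_node                  # first locale index on this node
    return max(0, min(per_node, num_locales - first))


def partition_gpus(gpu_ids, num_colocales):
    """Split `gpu_ids` into `num_colocales` disjoint, contiguous groups.

    Hypothetical policy: group sizes differ by at most one, and the
    earlier co-locales receive the extra devices.
    """
    if num_colocales <= 0:
        raise ValueError("num_colocales must be positive")
    base, extra = divmod(len(gpu_ids), num_colocales)
    groups = []
    start = 0
    for i in range(num_colocales):
        size = base + (1 if i < extra else 0)
        groups.append(list(gpu_ids[start:start + size]))
        start += size
    return groups


if __name__ == "__main__":
    # 7 locales on 2 nodes with 8 GPUs each: node 0 holds 4 co-locales,
    # node 1 (the remainder node) holds 3.
    for node in range(2):
        n = colocales_on_node(7, 2, node)
        print(node, partition_gpus(range(8), n))
```

In this sketch each node partitions its GPUs by the number of co-locales it actually hosts, so the remainder node splits the same devices among fewer co-locales instead of leaving some unassigned.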
2 participants