comm: beef up use of PMIx_Group_construct #12960

Merged
merged 1 commit into from
Dec 9, 2024

Conversation

hppritcha
Member

This change avoids potential race conditions between successive calls to MPI_Comm_create_from_group and MPI_Intercomm_create_from_groups that use the same tag argument value.

The grp string argument of the PMIx group constructor has different semantics from the tag requirements of these MPI constructors, so discriminators are used to avoid potential race conditions when using PMIx group ops.

Related to #10895
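
As a rough illustration of the discriminator idea described above (not the code in this PR), the sketch below builds a distinct group ID string from an MPI tag string plus a numeric discriminator, so that two constructor calls reusing the same tag do not hand the same string to PMIx_Group_construct. The helper name `make_group_id`, the `tag-N` format, and the way the discriminator is obtained are all assumptions made for this example; every participant would have to agree on the discriminator value.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Open MPI or PMIx): combine the
 * user-supplied tag with a discriminator to form a group ID string.
 * Returns 0 on success, -1 if the result does not fit in buf.
 * Assumption: all participants pass the same discriminator for a
 * given constructor call; how they agree on it is not shown here.
 */
static int make_group_id(char *buf, size_t len,
                         const char *tag, uint32_t discriminator)
{
    int n = snprintf(buf, len, "%s-%u", tag, (unsigned) discriminator);
    if (n < 0 || (size_t) n >= len) {
        return -1; /* encoding error or truncated output */
    }
    return 0;
}
```

With a helper like this, two successive calls that both pass the tag "mytag" but use discriminators 0 and 1 would produce different strings, which could then be passed as the grp argument of the group constructor.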

Signed-off-by: Howard Pritchard <[email protected]>
@rhc54
Contributor

rhc54 commented Dec 4, 2024

Hmmm...I must be missing something. I applied your patch to the head of OMPI main and ran Jeff's program from #12906, but I'm not seeing this revised PMIx group ID being constructed anywhere, and the program still hangs.

@hppritcha
Member Author

sorry, that PR is not directly related to this issue. we have two problems: one is the race case in mpi4py when using pmix groups (that's what PR #12960 is for), and then there's this one. so it was somewhat of a mistake to say this issue is related to that PR. i'm looking at this one as well, but i suspect the fix will be in a totally different place, as the spawn operation uses totally different pmix functionality.

@hppritcha
Member Author

well, i thought i was updating the issue and not my pr, so the text above is really meant for issue #12906

@rhc54
Contributor

rhc54 commented Dec 4, 2024

😄 Got it - thanks for the explanation!

@hppritcha hppritcha requested review from bwbarrett and edgargabriel and removed request for bwbarrett December 4, 2024 16:08
@hppritcha hppritcha removed the request for review from edgargabriel December 9, 2024 20:11
@hppritcha hppritcha merged commit 99bec5a into open-mpi:main Dec 9, 2024
15 checks passed
4 participants