Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collectives: fix inconsistency in calling preconditions #25

Conversation

davidozog
Copy link

@davidozog davidozog commented Sep 30, 2024

Summary of changes

@stewartl318 noticed a likely typo/inconsistency throughout the collectives conditions before a collective is called.

Here is the problem sentence (emphasis mine):

"Before any PE calls [a collective having src/dest buffers] the following conditions must be ensured:
• The dest array on all PEs in the team is ready to accept the result of the operation.
• The source array at the local PE is ready to be read by any PE in the team."

We think this is technically wrong, and should say:

"Before the local PE calls [a collective having src/dest buffers] the following conditions must be ensured..."

The reason "Before any PE..." is not quite right anymore is because it suggests the application must assure the source buffer is ready everywhere before the collective is entered anywhere, which requires user synchronization - the exact opposite of what we intended with the subsequent sentence:

"The application does not need to synchronize to ensure that the source array is ready across all PEs prior to calling this routine"

Proposal Checklist

  • Link to issue(s)
  • Changelog entry
  • Reviewed for changes to front matter
  • Reviewed for changes to back matter

@jdinan
Copy link
Owner

jdinan commented Sep 30, 2024

@davidozog Please create a new branch for RC2 changes from the collectives section committee and integrate these changes there. Please also discuss at the next WG meeting. The bar for integrating changes into RC2 should be kept high, as any issues that we find at the RC2 meeting will delay ratification.

@jdinan
Copy link
Owner

jdinan commented Sep 30, 2024

My $0.02 is that the meaning is roughly the same with both wordings, but I prefer any since it prompts the reader to think about these as properties that must hold globally on the team.

@davidozog
Copy link
Author

Ah ok. It sounds like we we agree on the intended semantics thankfully. I will close the ticket and double check with the WG that "any PE" makes sense here.

My concern is that using only the word "any", one could misconstrue the semantic like, "Before any one PE can call the collective" which is not intended here. Perhaps a topic for the WG, thanks.

@davidozog davidozog closed this Sep 30, 2024
@jdinan
Copy link
Owner

jdinan commented Sep 30, 2024

I think the intended semantic is "any one" unless I'm misunderstanding the concern.

@davidozog
Copy link
Author

davidozog commented Sep 30, 2024

I think the intended semantic is "any one" unless I'm misunderstanding the concern.

My apologies - I did not provide any context above... "one" doesn't matter at all. Let me try to provide more context...

I think the confused line-of-thinking we are trying to correct goes something like this: if "before any PE calls a collective.... the source array at the local PE [must be] ready to be read by any [other] PE in the team", then I as a user must assure all PEs have exited every synchronization up to this point, because otherwise I cannot know if the source array is ready to be read by any other PE. So I must provide a sync around the source buffer before calling the collective... but this conflicts with

The application does not need to synchronize to ensure that the source array is ready across all PEs prior to calling this routine

I think the problem may center around the double-use of the word "any" here, specifically "any [other]" works for the 2nd instance but not the 1st instance. But @stewartl318 should please clarify if I'm misrepresenting the concern.

@lstewart
Copy link

lstewart commented Oct 9, 2024

My confusion with the current wording is that "any" is used twice but with different meaning.
The "before any PE calls" is intended to mean "it doesn't matter which one you are considering, but focus on one"
The "ready to be read by any PE" is intended to mean "all PEs, consider each one in turn"
In addition, the word "local" means "whichever PE you decided to think about in the opening sentence"

When you interpret with these variant meanings, I agree it means that other PEs are not allowed to read the source buffer of the local PE until the local PE calls the function.

The clarity would be much improved by "A PE must not read the source buffer of a particular remote PE xx before PE xx has called the function" Destination buffers must be ready everywhere before the first PE calls the function. Source buffers need only be ready as their individual PEs call the function.

In yet another wording, there must be a synchronizing event between the make-ready of the destination buffers and the call, but there doesn't have to be a synchronizing event between the make-ready of the source buffer and the call.

@lstewart
Copy link

lstewart commented Oct 9, 2024

I do think Dave's proposed wording is better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants