Memory model umbrella ticket #229
Comments
Definition of concurrency used in #204 to be resolved in this ticket.
Have renamed (can be improved) to distinguish from #172.
Is the related statement a mistake (get/g/iget -> put/p/iput)?
Do you want to order the delivery of data returning to the local buffer, or order the read access of the remote memory? I think only the latter is useful. E.g., a user may want to do something like the sketch below.
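For illustration, a minimal sketch in OpenSHMEM 1.4 C (the names, sizes, and target PE are assumptions, and whether shmem_fence actually provides any ordering for gets is the open question being discussed):

#include <shmem.h>

static long flag;       /* symmetric; the copy on PE 1 is the one being read */
static long data[8];    /* symmetric; the copy on PE 1 is the one being read */

long pull_in_order(void)
{
    long flag_copy;     /* private destination buffers on this PE */
    long data_copy[8];

    shmem_long_get_nbi(&flag_copy, &flag, 1, 1);  /* read flag from PE 1 first      */
    shmem_fence();                                /* desired: order the remote reads */
    shmem_long_get_nbi(data_copy, data, 8, 1);    /* then read data from PE 1       */
    shmem_quiet();      /* the private copies may only be examined after quiet      */

    return (flag_copy == 1) ? data_copy[0] : -1;
}

The property that matters for this pattern is the order in which PE 1's memory is read; the order in which flag_copy and data_copy are filled in locally is immaterial once shmem_quiet has completed both transfers.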
I am not sure if I understand this topic correctly. I have two questions: (1) Does wait_until check the return buffer of a nonblocking fetch AMO on the source PE? (2) What is the read-modify-write-fetch operation and when is it needed?
- fence also orders non-blocking get/g/iget
Sorry, I admit to not following this discussion closely enough,
...'orders', not 'completes' ...
get_nbi()
fence()
get_nbi()
so I'm guaranteed that the second get will be ordered after the first get?
PE 0:
  put(data, pe1)
  fence()
  put(signal=1, pe1)

PE 1:
  (target of the puts and gets; no calls)

PE n:
  get_nbi(signal, pe1)
  fence()
  get_nbi(data, pe1)
If <n> gets signal=1 then data is valid (from PE 0) since it was ‘signalled’
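For concreteness, a compilable rendering of this example in OpenSHMEM 1.4 C (the array size, variable names, and the choice of PE n = npes-1 are assumptions; whether shmem_fence orders the two get_nbi calls is exactly the point under discussion):

#include <shmem.h>
#include <stdio.h>

#define N 8

static long data[N];          /* symmetric */
static long signal_flag = 0;  /* symmetric */

int main(void)
{
    shmem_init();
    int me = shmem_my_pe();
    int npes = shmem_n_pes();

    if (npes >= 3) {
        if (me == 0) {
            long payload[N];
            for (int i = 0; i < N; i++) payload[i] = i + 1;
            shmem_long_put(data, payload, N, 1);    /* put(data, pe1)         */
            shmem_fence();                          /* order the puts to PE 1 */
            shmem_long_p(&signal_flag, 1, 1);       /* put(signal=1, pe1)     */
        } else if (me == npes - 1) {                /* "PE n"                 */
            long sig_copy = 0;
            long data_copy[N];
            shmem_long_get_nbi(&sig_copy, &signal_flag, 1, 1); /* get_nbi(signal, pe1) */
            shmem_fence();                          /* debated: does this order the gets? */
            shmem_long_get_nbi(data_copy, data, N, 1);         /* get_nbi(data, pe1)   */
            shmem_quiet();                          /* complete both gets     */
            if (sig_copy == 1)
                printf("signalled: data_copy[0] = %ld\n", data_copy[0]);
            else
                printf("no signal seen; nothing can be concluded\n");
        }
    }

    shmem_finalize();
    return 0;
}

Note that nothing in this sketch synchronizes PE n with PE 0, so PE n may well read signal before it becomes 1; the claim is only conditional on observing signal=1.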
From: Anshuman Goswami (Tuesday, July 24, 2018 2:27 PM)
@minsii
Is the related statement a mistake (get/g/iget -> put/p/iput)?
No, the proposal is to make g/get/iget unordered, since keeping them ordered requires memory fences on some relaxed architectures (if the data from a get is actually used, ordering is enforced by the compiler/architecture anyway).
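A sketch of the distinction being drawn, assuming an implementation where a blocking g is just a load from a mapped symmetric address (the function and variable names are illustrative, not from the spec):

#include <shmem.h>

static long x, y, head, buf[16];   /* symmetric */

void independent_vs_dependent(int pe)
{
    /* Two independent blocking gets: if each compiles down to a plain load,
       a relaxed out-of-order core may perform them in either order unless
       the implementation inserts a memory fence. */
    long a = shmem_long_g(&x, pe);
    long b = shmem_long_g(&y, pe);

    /* Dependent gets: the address of the second load is computed from the
       value returned by the first, so on mainstream architectures the
       dependency itself preserves the order; this is the "ordering enforced
       by the compiler/architecture" case. */
    long idx = shmem_long_g(&head, pe);
    long val = shmem_long_g(&buf[idx % 16], pe);

    (void)a; (void)b; (void)val;
}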
Do you want to order the delivery of data returning to the local buffer, or order the read access of the remote memory?
The original context for this was a comment from @nspark on the draft that fence ordering blocking and non-blocking put, but only blocking get, may be non-intuitive. A follow-up question: why is the ordering of the local buffer update (for non-blocking get) not useful? Is it not a requirement for message passing to work?
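A sketch of the message-passing pattern the local-buffer-update question seems to have in mind (illustrative names; note the caveat in the comments, which is part of why the usefulness of this ordering is debated):

#include <shmem.h>

static long msg[8];        /* symmetric source objects on PE 1 */
static long msg_ready;     /* symmetric source objects on PE 1 */

void pull_message(long payload_copy[8], long *ready_copy)
{
    shmem_long_get_nbi(payload_copy, msg, 8, 1);      /* pull the payload first      */
    shmem_fence();                                    /* would order local delivery? */
    shmem_long_get_nbi(ready_copy, &msg_ready, 1, 1); /* then pull the flag          */

    /* If local delivery were ordered, *ready_copy turning nonzero would imply
       payload_copy is already filled in. However, the spec only guarantees the
       private buffers are filled after shmem_quiet(), so the quiet already
       provides the completion one would otherwise poll for. */
    shmem_quiet();
}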
I have two questions: (1) Does wait_until check the return buffer of a nonblocking fetch AMO on the source PE? (2) What is the read-modify-write-fetch operation and when is it needed?
(1) Yes, this was suggested by @jdinan in the discussion on the mailing list.
(2) This is related to your comment in the same thread: "Will it increase the overhead of fetching AMOs if we need the atomicity guarantee? E.g., is there any network that supports atomicity for the returning data transfer of an AMO?" Is my understanding correct?
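A hedged sketch of the pattern behind question (1), assuming a nonblocking fetching AMO of the kind proposed on the mailing list (shown under the assumed name shmem_long_atomic_fetch_nbi) and a sentinel value that the remote counter never holds:

#include <shmem.h>

static long counter;         /* symmetric counter on the target PE             */
static long fetch_buf = -1;  /* symmetric, so wait_until is allowed to poll it */

void fetch_and_wait(int target_pe)
{
    /* Nonblocking fetch: the fetched value lands in fetch_buf at some point. */
    shmem_long_atomic_fetch_nbi(&fetch_buf, &counter, target_pe);

    /* Question (1) asks whether the source PE may use wait_until, rather
       than shmem_quiet(), to detect delivery of the fetched value. */
    shmem_long_wait_until(&fetch_buf, SHMEM_CMP_NE, -1);
}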
Thanks @bcernohous for the example. If I could use it to clarify my earlier comment: "fence orders get_nbi" implies that the local update is ordered. @minsii, comments?
@bcernohous: The example seems a little problematic to me. How do you guarantee that get_nbi(signal, pe1) reads signal on PE 1 after the update of put(signal=1, pe1)? Do you have to add another synchronization between PE 0 and PE n? E.g., PE n must issue the get_nbi operations after completion of PE 0's put(signal=1, pe1).
My email was an example with questions 😊
so I'm guaranteed that the second get will be ordered after the first get?
If <n> gets signal=1 then data is valid (from PE 0) since it was ‘signalled’
And I guess the answer is yes to both?
I was questioning if that was how my example was supposed to work.
How do you guarantee that get_nbi(signal, pe1) reads signal on PE 1 after the update of put(signal=1, pe1)?
I don't. It was an ordering question. *If* PE <n> gets signal=1, then the data is from PE 0. PE <n> could get signal = ? (in my poor example) and there would be nothing else you could assert about the ordering.
As I said, I haven’t followed this discussion closely enough and was surprised that fence orders get_nbi, and I’m trying to understand it too.
@minsii From 9.5.4 in spec v1.4, description of shmem_get -
@bcernohous
@anshumang: The blocking
The proposal is to relax this requirement.
@minsii An out-of-order core may execute independent loads (a.k.a. shmem "blocking" get) out of order.
@anshumang What is really surprising is that the Cray T3D ("father" of OpenSHMEM) was using the Alpha, which is an out-of-order core. I just cannot imagine ordered loads on this platform. The original spec also had explicit ops for cache management, so my guess is the ordering was happening through the cache invalidation routines. Otherwise you can complete the load from local memory regardless of what the other side "put" there. Looking at the original manual I only see barrier, no shmem_fence or shmem_quiet operations. My guess is these two were introduced post-1994.
Thanks for the comments, @shamisp. Can you please add a pointer to the original spec?
@shamisp I am still confused about how out-of-order cores can reorder blocking gets in a way that is visible to user programs.

For network-offloaded get: should the mechanism of (3) ensure that (2) has already been performed and completed?

For active-message based get: I could imagine out-of-order execution of (3) and (4) in the AM-based case, but (3) must be done when the program loads the fetched data.

Reading again the slides @anshumang used in the WG calls, I understood that the proposal is to require ...

Now thinking about the threaded program, where ...
Slides discussed in OpenSHMEM 2018 F2F
Keynote by Will Deacon from OpenSHMEM Workshop 2018
Summary: ordering (inside a PE) + reads from (between two PEs) = happens before (across all PEs)
The following items have been discussed in the RMA WG on 6/21, 7/5, and 7/19 and are still open (except the one marked by ^). They are grouped below under 1) ordering, 2) reads from, and 3) happens before.
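As a concrete illustration of the summary (a sketch in OpenSHMEM 1.4 C; names and values are illustrative): the fence on PE 0 supplies the intra-PE ordering, PE 1's wait_until reading the flag written by PE 0 supplies the reads-from edge, and their combination is the happens-before relation that lets PE 1 read data safely.

#include <shmem.h>
#include <stdio.h>

static long data = 0;   /* symmetric */
static long flag = 0;   /* symmetric */

int main(void)
{
    shmem_init();
    int me = shmem_my_pe();

    if (shmem_n_pes() >= 2) {
        if (me == 0) {
            shmem_long_p(&data, 42, 1);      /* (a) write data on PE 1        */
            shmem_fence();                   /* (b) ordering inside PE 0      */
            shmem_long_p(&flag, 1, 1);       /* (c) write flag on PE 1        */
        } else if (me == 1) {
            shmem_long_wait_until(&flag, SHMEM_CMP_EQ, 1);  /* reads from (c) */
            printf("data = %ld\n", data);    /* happens-before: must print 42 */
        }
    }

    shmem_finalize();
    return 0;
}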