Signal operation type for put with signal #929
Signed-off-by: Md <[email protected]>
src/shr_transport.h4
Outdated
shmem_transport_xpmem_get(&old_signal_val, sig_addr, sizeof(uint64_t), pe,
                          shmem_internal_get_shr_rank(pe));
signal += old_signal_val;
}
This needs to be an atomic operation; use shmem_shr_transport_atomic.
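To illustrate the concern, here is a minimal sketch in plain C11, not the project's code. The helper names `signal_add_racy` and `signal_add_atomic` are hypothetical, and `<stdatomic.h>` stands in for whatever atomic primitive the shared memory layer actually provides. The first helper mirrors the get / add / put pattern in the snippet under review; the second performs the same update as one indivisible read-modify-write.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Non-atomic update (hypothetical helper): read the old value, add, write
 * back. Two senders can both read the same old value, in which case one of
 * the additions is lost. */
static inline void signal_add_racy(uint64_t *sig_addr, uint64_t signal)
{
    uint64_t old_signal_val = *sig_addr;   /* "get" */
    *sig_addr = old_signal_val + signal;   /* "put" */
}

/* Atomic update (hypothetical helper): the read-modify-write is a single
 * indivisible step. Returns the value held before the addition. */
static inline uint64_t signal_add_atomic(_Atomic uint64_t *sig_addr,
                                         uint64_t signal)
{
    return atomic_fetch_add_explicit(sig_addr, signal, memory_order_release);
}
```

With a single sender both helpers produce the same result; they differ only when several senders update the same signal word concurrently.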
@jdinan shmem_shr_transport_atomic is enabled through USE_SHR_ATOMICS. Unless we use the --enable-shr-atomics flag, we will not be able to use this API, right? Please let me know if I am missing something.
If the target is waiting for several messages using the same signal variable, this will not work, right?
Right, you would need to write code to support builds with and without shared memory atomics.
Thanks @jdinan. One question: from the current code, there seems to be no relation between the flags USE_SHR_ATOMICS and USE_XPMEM/USE_CMA. Was there any particular reason behind this? I am thinking, shouldn't USE_SHR_ATOMICS be defined when either of these on-node transports is defined?
For nearly all builds, we can't enable shared memory atomics because the networking layer is not coherent with processor atomics. By default, we only enable shared memory copy to implement put/get. This is why shared memory atomics are a separate option, and disabled by default. CMA is essentially memory copy performed by the Linux kernel and it cannot support atomics.
For CMA, we should directly use the transport layer atomics then, as you also suggested later. But for XPMEM, can we say that enabling USE_XPMEM should enable USE_SHR_ATOMICS as well? We can use the shared memory atomics if shmem_internal_get_shr_rank returns something other than -1, right? But currently, shmem_shr_transport_use_atomic also seems to be guarded by USE_SHR_ATOMICS.
Correct. It is always safe to use shared memory put/get because OpenSHMEM says that overlapping operations (e.g. between local shared memory and remote network transport accesses) lead to undefined behavior. Atomics do allow this overlapping. Therefore, it is only safe to use shared memory atomics when you know the transport layer is coherent with processor atomics. Because of this, enabling XPMEM only enables shared memory put/get, and there is an additional flag to enable shared memory atomics.
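As a rough sketch of what supporting builds with and without shared memory atomics could look like, the snippet below selects the path at compile time on USE_SHR_ATOMICS. Everything except that macro name is an assumption for illustration: `signal_add` and `transport_atomic_add` are hypothetical, and the transport stand-in is only a placeholder so the example compiles on its own.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the network transport's atomic add. A real
 * build would route this through the transport layer; this placeholder
 * exists only so the sketch is self-contained. */
static inline void transport_atomic_add(_Atomic uint64_t *target,
                                        uint64_t value)
{
    atomic_fetch_add(target, value);
}

/* Hypothetical helper: add `value` to a signal word on an on-node peer. */
static inline void signal_add(_Atomic uint64_t *sig_addr, uint64_t value)
{
#ifdef USE_SHR_ATOMICS
    /* Configure decided that processor atomics are coherent with the
     * transport, so the shared memory path may perform the atomic. */
    atomic_fetch_add_explicit(sig_addr, value, memory_order_acq_rel);
#else
    /* Shared memory is used for put/get only; atomics go to the transport. */
    transport_atomic_add(sig_addr, value);
#endif
}
```

The point of keying on a single macro is that every code path makes the same choice, which matters because mixing processor atomics and transport atomics on the same location is only safe when the two are coherent.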
src/shr_transport.h4
Outdated
shmem_transport_cma_get(&old_signal_val, sig_addr, sizeof(uint64_t), pe,
                        shmem_internal_get_shr_rank(pe));
signal += old_signal_val;
}
This needs to be an atomic. CMA does not support atomics; use shmem_transport_atomic.
Signed-off-by: Md <[email protected]>
@jdinan Can you please review the changes in shr_transport? There is one test failing in Travis that I am looking into, but it looks like it is related to the unit test change.
src/shr_transport.h4
Outdated
# define ONLY_XPMEM_TRANSPORT 1
#else
# define ONLY_XPMEM_TRANSPORT 0
#endif
Oh dear, no. This will cause shared memory atomics to be enabled in some paths and not others. You should only need to check USE_SHR_ATOMICS. This flag is set by the configure script. If you are concerned about it being set incorrectly, please resolve those concerns at configure time, not here.
For example, shmem_wait will not be using the correct atomics if USE_SHR_ATOMICS is not set.
OK. I was initially thinking of using USE_SHR_ATOMICS by defining it in the configure script when only XPMEM is enabled. But for some reason, I misunderstood that we wanted a separate flag.
src/shr_transport.h4
Outdated
shmem_internal_get_shr_rank(pe));
shmem_internal_membar_acq_rel(); /* Memory fence to ensure target PE observes
                                    stores in the correct order */
if (ONLY_XPMEM_TRANSPORT) {
This should be replaced by #if USE_SHR_ATOMICS, but other than that change this looks OK.
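The ordering that the fence in the snippet above is meant to provide can be sketched with C11 atomics. This is an illustrative model only: `put_signal_add` and `signal_reached` are hypothetical names, `memcpy` stands in for the shared memory put, and `atomic_thread_fence` stands in for shmem_internal_membar_acq_rel.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-node put-with-signal: copy the payload, fence, then
 * update the signal word so the target never sees the signal before the
 * payload it announces. */
static inline void put_signal_add(void *dest, const void *src, size_t len,
                                  _Atomic uint64_t *sig_addr, uint64_t signal)
{
    memcpy(dest, src, len);                       /* payload store */
    atomic_thread_fence(memory_order_acq_rel);    /* payload before signal */
    atomic_fetch_add_explicit(sig_addr, signal, memory_order_relaxed);
}

/* Target side (hypothetical): nonzero once the signal has reached
 * `expected`. The acquire load pairs with the fence on the sender. */
static inline int signal_reached(_Atomic uint64_t *sig_addr,
                                 uint64_t expected)
{
    return atomic_load_explicit(sig_addr, memory_order_acquire) >= expected;
}
```

Once `signal_reached` returns nonzero on the target, the payload written before the fence is guaranteed to be visible there.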
…nabled Signed-off-by: Md <[email protected]>
The
Yes, I am trying to reproduce the issue locally. The test is modified by this PR to include a check for the ADD signal atomic. Earlier, with no signal op argument, this test was not failing.
@@ -414,6 +414,10 @@ AS_IF([test "$transport_xpmem" = "yes" -o "$transport_cma" = "yes"],
    AC_DEFINE([ENABLE_HARD_POLLING], [1], [Enable hard polling])
])

if test "$enable_shr_atomics" = "no" -a "$transport_xpmem" = "yes" -a "$transport_ofi" = "no" -a "$transport_portals4" = "no" ; then
    AC_DEFINE([USE_SHR_ATOMICS], [1], [If defined, the shared memory layer will perform processor atomics.])
fi
We have always treated the above as an invalid build; the configure script should emit a warning any time there isn't a transport selected. That having been said, this seems like a good change. You might want to update the description of --enable-shr-atomics to say "default: auto" rather than "default: disabled".
A minor fix here: the check for "$enable_shr_atomics" = "no" is not quite right (see Example 1.1 here). It's counterintuitive, but I think this check should be "$enable_shr_atomics" != "no", i.e. "not disabled".
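The three-state logic behind "not disabled" can be modeled in a few lines of C. This is only a model of the decision, not the configure script: the function name is hypothetical, the empty string represents the case where the user passed neither --enable-shr-atomics nor --disable-shr-atomics, and honoring an explicit "yes" unconditionally is an assumption about how the rest of the script behaves.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the configure decision. `enable_shr_atomics` is
 * "yes", "no", or "" when the option was not given on the command line. */
static inline bool define_use_shr_atomics(const char *enable_shr_atomics,
                                          bool xpmem, bool ofi, bool portals4)
{
    if (strcmp(enable_shr_atomics, "no") == 0)
        return false;   /* explicit --disable-shr-atomics always wins */
    if (strcmp(enable_shr_atomics, "yes") == 0)
        return true;    /* assumption: explicit --enable is honored as-is */
    /* Not specified ("auto"): enable only when XPMEM is the sole transport,
     * since no network layer can then be incoherent with the processor. */
    return xpmem && !ofi && !portals4;
}
```

Testing for equality with "no" would instead enable the macro only when the user explicitly asked for it to be disabled, which is the inversion the comment points out.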
Signed-off-by: Md <[email protected]>
@wrrobin Based on earlier discussion, these changes look good to me. I unfortunately won't have time to review the unit test; perhaps @davidozog can help there? One last suggestion: try configuring with "--disable-shr-atomics" and "--enable-shr-atomics" and look in src/config.h under your build directory to ensure that the preprocessor macro is being defined correctly (I never trust myself to get the inverted logic of configure parameters correct).
Our CI testing is pretty thorough, apart from Portals, and this is an extension, so I wouldn't be opposed if you feel that it's ready. Since this PR includes an API change to bring the signaling API into compliance with the latest OpenSHMEM 1.5 draft, it might be nice to include.
@jdinan I have tried with different combinations of
IIUC, the bold italic ones (4,5,7) above are producing incorrect configurations. Please let me know if you think so as well. For 7, I think it is producing the config that the user wants. But, I am not sure whether we can label the config as "auto" rather than "disabled". If I keep the configure script condition as before
then only the following result changes: which is the behavior we want to see. The 4th and 7th use-cases are still problematic.
With your current code:
For 4, I see:
For 5:
That's the right behavior for 7. User got what they asked for. Can you please re-check? The above look like the correct behavior. Another possibility. There is a portability issue with string comparison in the test utility, which is why you often see
I am not sure why I saw those behaviors before. I am now seeing the correct configuration, similar to yours. I am testing these on Cori and am no longer sure whether the behavior is random.
@davidozog Can you please review the PR? If we do not see any other issues, we can merge this for the release.
Looks good! Just a few comments/nit-picks.
Signed-off-by: Md <[email protected]>
@wrrobin and @jdinan - I've just run through the release checklist with this branch included and had no problems. I haven't yet tested the GNI provider with this patch, though, because Cori is under maintenance, but I don't expect to see any issues. Are we OK with merging this and including it in a v1.4.5 release today?
@davidozog Sounds good to me. I tested with GNI (Cori) last time and all the unit and shmemx tests passed.