test all env query functions for compliance with MPI-3.1 thread safety compliance #493
No, the mt_env.c test bails out if the MPI_THREAD_MULTIPLE mode is not supported. |
GET_VERSION and GET_LIBRARY_VERSION are thread safe. For INITIALIZED, FINALIZED, QUERY_THREAD, and IS_THREAD_MAIN, I'm thinking that all we need is to make the following globals volatile:
OMPI_DECLSPEC extern bool ompi_mpi_initialized;
OMPI_DECLSPEC extern bool ompi_mpi_finalized;
OMPI_DECLSPEC extern int ompi_mpi_thread_provided;
OMPI_DECLSPEC extern struct opal_thread_t *ompi_mpi_main_thread; |
I don't think they should be volatile. However, I do think they should be accessed via atomic operations. Moreover, we need to redesign our init/finalize process in order to remove all possible race conditions between the init and finalized states (such as once finalize is started, init does not return success). |
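As a rough illustration of the "atomic operations rather than volatile" point, here is a minimal C11 sketch in which a flag is published with a release store and read with an acquire load, so that a reader who sees the flag set also sees the data written before it. The names `lib_initialized`, `lib_thread_level`, `publish_initialized`, and `query_initialized` are hypothetical and are not Open MPI symbols.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for library state; not Open MPI's actual globals. */
static atomic_bool lib_initialized = false;
static int lib_thread_level = 0;   /* plain data, published via the flag */

/* Called once by the initializing thread after all setup is done. */
void publish_initialized(int thread_level)
{
    lib_thread_level = thread_level;
    /* Release: everything written above becomes visible to acquire readers. */
    atomic_store_explicit(&lib_initialized, true, memory_order_release);
}

/* Safe to call from any thread; fills *thread_level only when initialized. */
bool query_initialized(int *thread_level)
{
    if (!atomic_load_explicit(&lib_initialized, memory_order_acquire))
        return false;
    *thread_level = lib_thread_level;
    return true;
}
```

A volatile flag would stop the compiler from caching the value, but it would not give the ordering guarantee between the flag and `lib_thread_level` that the release/acquire pair provides.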
Forgive my ignorance: why access via atomics vs. volatile? As for the behavior of MPI_INITIALIZED, MPI-3.1 (and 3.0) is quite clear:
So I think our current implementation of that aspect is correct. |
Thanks for pointing out the text about MPI_FINALIZE, I was not aware that it has no impact on the behavior of MPI_INITIALIZED. Using volatile will prevent the compiler from caching the variables, so this might look like a good idea. However, such behavior is only maintained in the context of a single function. As long as MPI_INITIALIZED is not a macro used in a busy-loop, there will be no benefit in using volatile. Imagine a pretty standard usage of MPI_INITIALIZED and MPI_INIT
If we call this code from multiple threads at once, the current Open MPI will not behave correctly, as there might be a case where both threads call ompi_mpi_init and start initializing the internals. Thus, we need to protect the different stages of the initialization/finalization process. |
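One possible way to keep two threads from both starting the initialization is an atomic compare-and-swap on a "started" flag. The sketch below assumes pthreads and C11 atomics; `init_started`, `try_begin_init`, and `race_to_init` are hypothetical names for illustration, and the counter stands in for the real initialization body.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 64

/* Hypothetical state; these are not Open MPI's actual symbols. */
static atomic_bool init_started = false;
static atomic_int init_body_runs = 0;  /* how many times the init body ran */

/* Returns true only for the single thread that wins the right to initialize. */
static bool try_begin_init(void)
{
    bool expected = false;
    /* Atomically flip false -> true; exactly one caller can succeed. */
    return atomic_compare_exchange_strong(&init_started, &expected, true);
}

static void *worker(void *arg)
{
    (void)arg;
    if (try_begin_init()) {
        /* Placeholder for the real initialization work. */
        atomic_fetch_add(&init_body_runs, 1);
    }
    return NULL;
}

/* Starts up to n threads that race to initialize; returns how many ran the body. */
int race_to_init(int n)
{
    pthread_t tids[MAX_THREADS];
    int created = 0;
    for (int i = 0; i < n && i < MAX_THREADS; i++) {
        if (pthread_create(&tids[created], NULL, worker, NULL) == 0)
            created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    return atomic_load(&init_body_runs);
}
```

This only decides who initializes. The losing threads return immediately, so on its own it says nothing about when the library becomes usable for them.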
With the MPI API specified as it is, I don't think there's a way to guarantee that the code snippet you proposed will always be safe when executed by multiple threads. I.e., there's no atomic "test for INITIALIZED, and if not INITIALIZED, then INIT." There was much argument about this in the Forum (i.e., make a new INIT function that does stuff like this), but after much round-and-round debate, the minimum distance compromise was made to just guarantee that INITIALIZED (and the others) are always thread safe -- which at least solved a few problems. That was (literally) the smallest step that could be made. Making a new INIT was punted further down the road... Are you proposing to lock the internals if |
I am not sure about returning a non-fatal MPI error; I don't think the standard supports that type of exception. What I was proposing is to protect the internals of the MPI initialization and force all subsequent MPI_Init calls to wait until the first MPI_Init completes. Upon completion of the first MPI_Init, all MPI_Init calls return with the same error code (this goes together with MPI_INITIALIZED setting the flag to true only after initialization has completed successfully). This approach seems to be valid from the MPI standard perspective, while proposing a logical and safe approach for our users. That being said, I remember you were involved in the new MPI standard design regarding the thread-safety of some of the init/fini functions. So please enlighten me on how the Forum regards the thread safety of the init/fini tuple. So far, every time I get more information about this I just get more confused about the lack of coherent logic regarding this topic (that's nothing new, just unsettling). It reads to me like "I cast on you thy few thread-safe functions that cannot be used together in a thread-coherent way". Or are we assuming a sequential initialization of any MPI-based application (despite Amdahl's law)? |
@bosilca You're right -- there is no definitive solution for this issue in an MPI-3.1-compliant way (see my comment above). What is really needed is an MPI_INIT_IF_NOT_ALREADY_INIT() kind of function. ...or something entirely different (e.g., ideas like I presented in Bordeaux at EuroMPI 2015). @hppritcha and I looked at this today. We agree that all we can do is narrow the race condition window if multiple threads call MPI_INITIALIZED simultaneously. And we should do that (e.g., strengthen the checking in MPI_INIT and MPI_FINALIZE to check for a variable that is set immediately upon starting, and possibly protect that with a lock). But this ticket is about ensuring that it is safe to call all these functions from multiple threads. Right? |
I am not really sure that we agree on the problem. Let's take a simple example where two threads are calling the sequence of code from above:
Now, let's assume that we fix all the issues about the use of our internal variables holding the current state of the library. What should MPI_INITIALIZED return, and starting from when? We update the internals upon entry into MPI_INIT, so technically MPI_INITIALIZED would return true as soon as another thread has entered MPI_INIT, well before the MPI stack is in a usable form. Now, if instead we move the update of the internal variables to the end of MPI_INIT, MPI_INITIALIZED will return false, and the second thread will also call MPI_INIT. So now we will have to block this thread until the first one has completed MPI_INIT ... Thus we need a condition. |
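The "we need a condition" idea could look roughly like the following pthread sketch: the first caller performs the initialization, later callers block on a condition variable and then return the same result, and the query reports true only after completion. All names here (`guarded_init`, `guarded_initialized`, `do_real_init`, `concurrent_init_runs`) are hypothetical; this is not how Open MPI implements it.

```c
#include <pthread.h>
#include <stdbool.h>

#define MAX_CALLERS 64

/* Hypothetical state machine; the names are illustrative only. */
enum init_state { NOT_STARTED, IN_PROGRESS, COMPLETED };

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_cond = PTHREAD_COND_INITIALIZER;
static enum init_state state = NOT_STARTED;
static int init_result = 0;
static int init_body_runs = 0;  /* only ever touched by the initializing thread */

/* Placeholder for the real initialization work; 0 means success. */
static int do_real_init(void) { init_body_runs++; return 0; }

/* First caller initializes; later callers block, then get the same result. */
int guarded_init(void)
{
    pthread_mutex_lock(&state_lock);
    if (state == NOT_STARTED) {
        state = IN_PROGRESS;
        pthread_mutex_unlock(&state_lock);
        int r = do_real_init();            /* long-running work, outside the lock */
        pthread_mutex_lock(&state_lock);
        init_result = r;
        state = COMPLETED;
        pthread_cond_broadcast(&state_cond);
    } else {
        while (state != COMPLETED)
            pthread_cond_wait(&state_cond, &state_lock);
    }
    int rc = init_result;
    pthread_mutex_unlock(&state_lock);
    return rc;
}

/* Reports true only after initialization has fully completed. */
bool guarded_initialized(void)
{
    pthread_mutex_lock(&state_lock);
    bool done = (state == COMPLETED);
    pthread_mutex_unlock(&state_lock);
    return done;
}

static void *caller(void *arg) { (void)arg; guarded_init(); return NULL; }

/* Runs up to n concurrent callers; returns how often the init body executed. */
int concurrent_init_runs(int n)
{
    pthread_t tids[MAX_CALLERS];
    int created = 0;
    for (int i = 0; i < n && i < MAX_CALLERS; i++)
        if (pthread_create(&tids[created], NULL, caller, NULL) == 0)
            created++;
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    return init_body_runs;
}
```

Note that this sketch changes the behavior of a second init call (it waits and succeeds rather than being an error), which is a design choice under discussion in this thread rather than something the standard prescribes.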
Mmmm... true, if we move the "allow MPI_INITIALIZED to return TRUE" point up earlier in MPI_INIT, that's probably dangerous for a different reason -- e.g., two threads calling this:

    MPI_Initialized(&flag);
    if (!flag) {
        MPI_Init(NULL, NULL);
    }
    MPI_Send(...);

If two threads concurrently execute this code, two kinds of race conditions can occur:
Are you advocating that MPI_INITIALIZED check to see if MPI_INIT has started, and if so, block until MPI_INIT completes, and then return flag=true? If so, that narrows the race condition window, but it doesn't eliminate it. I'm not sure there is a way to eliminate the race condition in MPI-3.1. |
I was advocating that MPI_INITIALIZED return false until MPI_INIT completes, to prevent exactly what you described. But your proposal might be simpler to implement: block all initialization functions (including the accessors that check the state) as soon as one initialization is in progress. |
see related discussion in open-mpi/ompi-release#636 |
@bosilca @hppritcha and I discussed this on the phone today. We decided:
Then we have to wait for MPI-4 to fix this issue for real. |
Minor correction to the i) case. The end of the sentence should read "as many times as you want without an error before the first call to MPI_FINALIZE." |
To close the loop on this issue: after I made the comment above (#493 (comment)), I decided not to support the "let it be ok if MPI is initialized more than once" functionality. See #1007 for details. @hppritcha still wants to do some testing, though -- so this issue is still open. |
I'm removing blocker label, and changing to future. This issue was really just serving as a placeholder for adding some thread safety tests. |
I think we've added enough tests over time that this issue can be closed. |
The MPI-3.1 standard states that
are callable from threads without restrictions irrespective of the actual level of thread support provided.
It appears that the sun/threads/mt_env.c test covers most, but not all, of these. This test could either be enhanced, or a similar test could be added to the ibm tests in ompi-tests.