-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cancelling request fails for serialized communication #110
Comments
checking the source in class request::probe_handler
...
void cancel() { m_source = MPI_PROC_NULL; } does not call Before the refactoring (1.70), I had the following patch: diff --git a/boost/mpi/communicator.hpp b/boost/mpi/communicator.hpp
index 6e55b16..5bcd985 100644
--- a/boost/mpi/communicator.hpp
+++ b/boost/mpi/communicator.hpp
@@ -1760,6 +1760,8 @@ request::handle_serialized_irecv(request* self, request_action action)
// Wait for the count message to complete
BOOST_MPI_CHECK_RESULT(MPI_Wait,
(self->m_requests, &stat.m_status));
+ if (stat.cancelled())
+ return stat;
// Resize our buffer and get ready to receive its data
data->ia.resize(data->count);
BOOST_MPI_CHECK_RESULT(MPI_Irecv,
@@ -1772,6 +1774,8 @@ request::handle_serialized_irecv(request* self, request_action action)
BOOST_MPI_CHECK_RESULT(MPI_Wait,
(self->m_requests + 1, &stat.m_status));
+ if (stat.cancelled())
+ return stat;
data->deserialize(stat);
return stat;
} else if (action == ra_test) {
@@ -1783,6 +1787,9 @@ request::handle_serialized_irecv(request* self, request_action action)
BOOST_MPI_CHECK_RESULT(MPI_Test,
(self->m_requests, &flag, &stat.m_status));
if (flag) {
+ if (stat.cancelled())
+ return stat;
+
// Resize our buffer and get ready to receive its data
data->ia.resize(data->count);
BOOST_MPI_CHECK_RESULT(MPI_Irecv,
@@ -1797,11 +1804,22 @@ request::handle_serialized_irecv(request* self, request_action action)
BOOST_MPI_CHECK_RESULT(MPI_Test,
(self->m_requests + 1, &flag, &stat.m_status));
if (flag) {
+ if (stat.cancelled())
+ return stat;
data->deserialize(stat);
return stat;
} else
return optional<status>();
- } else {
+ }
+ else if (action == ra_cancel) {
+ status stat;
+ if (self->m_requests[0] != MPI_REQUEST_NULL)
+ BOOST_MPI_CHECK_RESULT(MPI_Cancel, (&self->m_requests[0]));
+ if (self->m_requests[1] != MPI_REQUEST_NULL)
+ BOOST_MPI_CHECK_RESULT(MPI_Cancel, (&self->m_requests[0]));
+ return optional<status>();
+ }
+ else {
return optional<status>();
}
} Something similar might be required. |
Is there a way to cancel a blocking |
I guess not, probably need to move to MPI_IMprobe. But I need o read 3.8 more carefully (I hate that section...) |
FYI, I asked the MPI community on SO. |
I'm going to check why it hang. I'd like to point out that none of Boost.MPI function is supposed to map the MPI standard regarding serialized communication as there is none in C MPI. We just try to comply to our documentation which is expected not to be surprising for an MPI user. Right now it's hanging on:
I need to check if a recv with no matching send is defined. We probably need to change that probe for it's async version and check if they were introduced at the same time. Thank for you report and test case. |
There was a proposal in #72 to make the switch between the 2 implementation available at runtime, we still have that option if it comes down to that. |
For my use case, switching at compile time would be good enough. That means that using Probe should be driven not only by the MPI supported standard but also by a user defined macro // If BOOST_MPI_USE_IMPROBE was not user defined
#if !defined(BOOST_MPI_USE_IMPROBE)
#if BOOST_MPI_VERSION >= 3
// MPI_Probe an friends should work
# if defined(I_MPI_NUMVERSION)
// Excepted for some Intel versions.
// Note that I_MPI_NUMVERSION is not always defined with Intel.
# if I_MPI_NUMVERSION > 20190004000
# define BOOST_MPI_HAVE_IMPROBE 1
# endif
# else
# define BOOST_MPI_HAVE_IMPROBE 1
# endif
#endif
#endif //!defined(BOOST_MPI_USE_IMPROBE) and use |
Would a
|
So basically, what you want is a wait those behavior is:
? |
So it should be possible to implement something sensible. The trick is to handle the status; |
I believe this is a regression from 1.70.
In the following code, an asynchronous recv is initiated then later an other thread marks the operation for cancellation (and that should guarantee
wait()
to return according to the spec).This works for fundamental type (such as
int
) but not for serialized types.The expected result is:
The text was updated successfully, but these errors were encountered: