-
Notifications
You must be signed in to change notification settings - Fork 868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ADAPT: Event-driven collective implementation #7716
Conversation
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good to me, some (probably minor) issues in the comments.
ompi_request_set_callback(send_req, send_cb, send_context); | ||
} | ||
|
||
OPAL_THREAD_LOCK(context->con->mutex_num_sent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that mutex really needed? It seems like an atomic operation would do just fine here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that technically you are correct, we could have done it with an atomic operation instead. We will revisit this code in the future and clean such cases, but I don't think it has any impact neither on correctness nor performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All points I had have been addressed, thanks @bosilca 👍
Add comments in the ADAPT module Signed-off-by: Xi Luo <[email protected]> Signed-off-by: George Bosilca <[email protected]>
* piggybacking Bull functionalities * coll/adapt: Fix naming conventions and C11 atomic use This commit fixes some naming convention issues, such as function names which should follow the naming ompi_coll_adapt instead of mca_coll_adapt, reserved for component and module naming (cf. tuned collective component); It also fixes the use of _Atomic construct, which is only valid in C11. OPAL constructs have already been adapted to that use, so use opal_atomic_* types instead. * coll/adapt: Remove unused component field in module This commit removes an unneeded field referencing the component in the module of adapt, as it is already available through the mca_coll_adapt_component global variable. Signed-off-by: Marc Sergent <[email protected]> Co-authored-by: Lemarinier, Pierre <[email protected]> Co-authored-by: pierrele <[email protected]>
API consistent with other collective modules Add comments Other minor cleanups. Signed-off-by: George Bosilca <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
As it is possible to have multiple outstanding non-blocking collectives provided by different collective modules, we need a consistent mechanism to allow them to select unique tags for each instance of a collective. Signed-off-by: George Bosilca <[email protected]>
Set request in ibcast.c to empty when the count is 0. Signed-off-by: Xi Luo <[email protected]> Signed-off-by: George Bosilca <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
The designs of MPI collectives rely on implicit synchronizations between processes due to message ordering; these designs magnify noise across the participating processes, resulting in significant performance slowdown. ADAPT presents a new perspective , using event-driven techniques to match the collective needs of data movement with the available bandwidth on the platform. The core concept of ADAPT is to relax synchronizations, while maintaining the minimal data dependencies of MPI collectives.
For more details and a performance assessment: DOI 10.1145/3208040.3208054.