Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: sicm refresh #3

Open
wants to merge 46 commits into
base: main
Choose a base branch
from
Open

WIP: sicm refresh #3

wants to merge 46 commits into from

Conversation

naughtont3
Copy link

No description provided.

EFA incorrectly implements FI_DELIVERY_COMPLETE in earlier libfabric
versions. While FI_DELIVERY_COMPLETE would be advertised by the
provider, completions would return too early by not accounting for
bounce buffers on the receive side. This would cause the BTL
to receive early completions that lead to correctness issues.

This is not an issue in the mtl/ofi as it does not require
FI_DELIVERY_COMPLETE.

Signed-off-by: William Zhang <[email protected]>
@naughtont3
Copy link
Author

naughtont3 commented Aug 11, 2020

This is a rebased version of gv/sicm to work with latest OMPI upstream master and more recent SICM. This has only been tested for basic compile test.

@naughtont3
Copy link
Author

build-note.md.txt

wckzhang and others added 27 commits August 11, 2020 16:47
The ofi_rxm provider is dependent upon the underlying hardware for its
implementation of FI_DELIVERY_COMPLETE. Since this can lead to early
completions, we disable the provider to avoid correctness issues.

This is not an issue in the mtl/ofi as it does not require
FI_DELIVERY_COMPLETE.

Signed-off-by: William Zhang <[email protected]>
btl/ofi: Disable EFA provider in versions earlier than libfabric 1.12.0
The C++ bindings were removed a while ago;
MPI::ERRORS_THROW_EXCEPTIONS and MPI_ERRORS_THROW_EXCEPTIONS no longer
exist.

Signed-off-by: Jeff Squyres <[email protected]>
MPI-4 is finally cleaning up its language: an MPI "exception" does not
actually exist.  The only thing that exists is an MPI "error" (and
associated handlers).  This commit replaces all relevant uses of the
word "exception" with "error".  Note that this is still applicable in
versions of the MPI standard less than MPI-4.0 (indeed, nearly all the
cases fixed in this commit are just changes to comments, anyway).

One exception to this is the Java bindings, where there's an
MPIException class.  In hindsight, it probably should have been named
MPIError, but changing it now would break anyone who is using the Java
bindings.

Signed-off-by: Jeff Squyres <[email protected]>
…rs-and-exceptions

Cleanup of MPI errors and exceptions
the ofi mtl mrecv was not properly setting the message in/out
arg to MPI_MRECV to MPI_MESSAGE_NULL.

Signed-off-by: Howard Pritchard <[email protected]>
improve configury to check whether icc is handling no long double.
This prevents seeing 100s of messages like this:

icc: command line warning open-mpi#10148: option '-Wno-long-double' not supported

A similar patch will be needed for pmix.

Signed-off-by: Howard Pritchard <[email protected]>
Add comments in the ADAPT module

Signed-off-by: Xi Luo <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
* piggybacking Bull functionalities

* coll/adapt: Fix naming conventions and C11 atomic use

This commit fixes some naming convention issues, such as function names
which should follow the naming ompi_coll_adapt instead of
mca_coll_adapt, reserved for component and module naming (cf. tuned
collective component);

It also fixes the use of _Atomic construct, which is only valid in C11.
OPAL constructs have already been adapted to that use, so use
opal_atomic_* types instead.

* coll/adapt: Remove unused component field in module

This commit removes an unneeded field referencing the component in the
module of adapt, as it is already available through the
mca_coll_adapt_component global variable.

Signed-off-by: Marc Sergent <[email protected]>
Co-authored-by: Lemarinier, Pierre <[email protected]>
Co-authored-by: pierrele <[email protected]>
API consistent with other collective modules
Add comments
Other minor cleanups.

Signed-off-by: George Bosilca <[email protected]>
As it is possible to have multiple outstanding non-blocking collectives
provided by different collective modules, we need a consistent
mechanism to allow them to select unique tags for each instance of a
collective.

Signed-off-by: George Bosilca <[email protected]>
Set request in ibcast.c to empty when the count is 0.

Signed-off-by: Xi Luo <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
Reduce scatter block and reduce scatter algorithms were hitting
correctness issues for non commutative strided tests. We will revert to
the original default algorithms for those two collectives (basic linear
and non overlapping respectively) in the non commutative op case.

See open-mpi#8010

Signed-off-by: William Zhang <[email protected]>
coll/tuned: Revert RSB and RS default algorithms
Uncovered a problem using the GNI provider with the OFI MTL.
See ofiwg/libfabric#6194.

Related to open-mpi#8001

Signed-off-by: Howard Pritchard <[email protected]>
…og_warning

suppress icc long double message
…tch_for_mtl

OFI: patch OFI MTL for GNI provider
errors_are_fatal_comm_handler takes a pointer to the error constant
Thanks to Andy Riebs for reporting that on the Open MPI user mailing list (https://www.mail-archive.com/[email protected]/msg34103.html)

Signed-off-by: Joseph Schuchart <[email protected]>
This PR removes the MCA_BTL_DES_FLAGS_PUT and MCA_BTL_DES_FLAGS_GET
descriptor flags. At some point these had some meaning but they were
replaced by the rcache access flags.

Signed-off-by: Nathan Hjelm <[email protected]>
ADAPT: Event-driven collective implementation
devreal and others added 15 commits September 4, 2020 09:31
Flushing or freeing a newly created dynamic window causes NULL to be passed.

Signed-off-by: Joseph Schuchart <[email protected]>
…-segfault

UCX: do not dereference NULL pointer in wpmem_[free|flush]
Make the C MPI_F08_status type definition match the updated
mpi_f08 type(MPI_Status) definition.

This fix the inconsistency introduced in open-mpi/ompi@98bc7af

Signed-off-by: Gilles Gouaillardet <[email protected]>
manually cleanup the generated .mod file in OMPI_FORTRAN_CHECK_BIND_C_TYPE

Signed-off-by: Gilles Gouaillardet <[email protected]>
Only in C bindings:
 - MPI_Status_c2f08()
 - MPI_Status_f082c()

In all bindings but mpif.h
 - MPI_Status_f082f()
 - MPI_Status_f2f08()

and the PMPI_* related subroutines

As initially inteded by the MPI forum, the Fortran to/from Fortran 2008
conversion subtoutines are *not* implemented in the mpif.h bindings.
See the discussion at mpi-forum/mpi-issues#298

Refs. open-mpi#1475

Signed-off-by: Gilles Gouaillardet <[email protected]>
Convert the MPI_Status_f082f, MPI_Status_f082c, and MPI_Status_f2c man
pages to Markdown.  Fix some typos and improve the text a bit along
the way.

Left the raw NROFF redirect pages MPI_Status_f2f08, MPI_Status_c2f08,
and MPI_Status_c2f files as they were -- they're 1-line redirects, and
it seems simpler to leave those (vs. duplicating the Markdown).

Signed-off-by: Jeff Squyres <[email protected]>
…f08_c

Add missing MPI_Status conversion subroutines
Signed-off-by: Gilles Gouaillardet <[email protected]>
Use correct conditional variable initializer in opal/mca/pmix/base
Fix a copy/paste in the RDMA emulation.
 - Add arg for newer SICM API in sicm_arena_create()
 - Fix header path for OMPI master include 'mutex_unix.h'

Signed-off-by: Thomas Naughton <[email protected]>
@naughtont3
Copy link
Author

naughtont3 commented Sep 16, 2020

I rebased against master again to pull in fixes for RMA problems I was encountering. I also needed to move up to ucx-1.9.0rc6 to avoid hard failure. It still shows WARN msgs on finalize about inuse region (lkey/rkey), but not fatal. Tested using IMB-v2019.6/src_c/IMB-RMA Put_local across two nodes of Summit.

@naughtont3
Copy link
Author

Note - there is still some bug in here (outside of mpool changes I want to test) b/c a debug build will fail an assert in opal_common_ucx_wpool_finalize. But that's something to be fixed in master, so ignoring for now. The reason my other test passed was b/c it was a non-debug build so OMPI asserts were disabled.

 - Fixes issues encountered while testing on Summit w/ HBM
 - Avoids multiple calls to sicm_init() to set device list
 - Adds ability to have more than one device type (module list)
 - Adds call to sicm_arena_destroy() during free
 - Adds opal outputs, verbose/debug (dump device info, etc.)

Signed-off-by: Thomas Naughton <[email protected]>
Set envvar `OMPIX_MPOOL_NO_FALLBACK` to avoid fallback to malloc().
This should probably be an MCA but quick fix for mpool testing.

Signed-off-by: Thomas Naughton <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.