developers: Doc how to build against external PMIx/PRTE #12946

Open
wants to merge 1 commit into
base: main

Member

@jsquyres jsquyres commented Nov 30, 2024

This has been on my to-do list for a while.

Reviewers: you can read the rendered version of this PR here: https://ompi--12946.org.readthedocs.build/en/12946/developers/building-open-mpi.html#building-against-external-openpmix-prrte

Contributor

@rhc54 rhc54 left a comment

I think there is some confusion about PRRTE - see comments.

MPI this way, you may need to install the package manager's
"developer" Hwloc, Libevent, OpenPMIx, and/or PRRTE packages.

1. Open MPI and PRRTE must be built against the **same** installation
Contributor

Actually, not true: PMIx is designed to handle cross-version messaging. The only issue would be ensuring that the PMIx being used by PRRTE is new enough to support any OMPI-used features. Given the minimums we set, that shouldn't be a problem.
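
The "new enough" condition in this comment can be pictured as a plain version comparison. The following is only a minimal sketch: the minimum version is a hypothetical placeholder (not the minimum Open MPI actually enforces), and `parse_version` and `is_new_enough` are names invented for illustration. It handles purely numeric dotted versions only.

```python
def parse_version(text):
    """Turn a numeric dotted version such as '5.0.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_new_enough(found, minimum):
    """Return True if `found` is at least `minimum` (both dotted strings)."""
    return parse_version(found) >= parse_version(minimum)


# Hypothetical placeholder: the real minimum is whatever Open MPI's
# configure script enforces for the release being built.
HYPOTHETICAL_MIN_PMIX = "4.2.0"

if __name__ == "__main__":
    for candidate in ("5.0.3", "4.1.2"):
        ok = is_new_enough(candidate, HYPOTHETICAL_MIN_PMIX)
        print(candidate, "ok" if ok else "too old")
```

Comparing tuples of integers rather than raw strings avoids ordering "4.10.0" before "4.9.9".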

Member Author

Ok, fair point here. I was really thinking about Hwloc and Libevent when I was writing this bullet. I'll update.


Open MPI, OpenPMIx, and PRRTE must all use the same Hwloc and
Libevent libraries at run time (i.e., they must all resolve to
the same loadable libraries).
Contributor

Not totally correct: PRRTE doesn't have to use the same libraries, because OMPI never loads "libprrte", so there is no potential for confusion.

Member Author

By the transitive property, though, isn't this true? OMPI must use the same hwloc + libevent as PMIx, and PRRTE must use the same hwloc + libevent as PMIx, so don't they all have to be the same?

Member

You can have two versions of PMIx installed. I don't think this is super common for a developer, but given how Slurm is packaged, it may be more common there.

Contributor

PBS does it too now. In fact, they include PRRTE and PMIx, so it's not uncommon to see multiple installs there.


1. Open MPI, OpenPMIx, and PRRTE must all be built against the
**same** installation of Hwloc and Libevent. Meaning:

Contributor

Not fully correct - PRRTE doesn't need to have the same.

Member Author

You can (will) run into super weird errors if PMIx, PRRTE, and/or Open MPI are using different libevents and/or hwlocs at run time. They should be completely orthogonal, but we have definitely seen cases where they are not. I'm ok documenting it as a must even if there are some cases where it actually does work to, for example, have PMIx link at run time against hwloc version X and PRRTE link at run time against hwloc version Y.
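
One way to see which Hwloc and Libevent a given library resolves to at run time on Linux is to compare `ldd` output for the libraries in question. The sketch below is illustrative only: `parse_ldd`, `differing_deps`, and `run_ldd` are names invented here, and the check only compares library names that both inputs share. Two libraries that link against different sonames of Hwloc or Libevent would need a separate check.

```python
import subprocess


def parse_ldd(output):
    """Map each library name in `ldd` output to the path it resolved to."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # e.g. the vDSO or dynamic loader line
        name, _, rest = line.partition("=>")
        parts = rest.split()
        if not parts or parts[0].startswith("("):
            continue  # no path given, only a load address
        # ldd prints "not found" when the loader cannot resolve a library
        path = None if parts[:2] == ["not", "found"] else parts[0]
        resolved[name.strip()] = path
    return resolved


def differing_deps(ldd_a, ldd_b, prefixes=("libhwloc", "libevent")):
    """Return shared Hwloc/Libevent names that resolve to different paths."""
    a, b = parse_ldd(ldd_a), parse_ldd(ldd_b)
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if name.startswith(prefixes) and a[name] != b[name]
    }


def run_ldd(path):
    """Run ldd on `path` and return its text output (Linux only)."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return result.stdout


# Usage (paths are hypothetical and depend on the local installation):
#   print(differing_deps(run_ldd("/opt/ompi/lib/libmpi.so"),
#                        run_ldd("/opt/pmix/lib/libpmix.so")))
```

An empty result means the shared Hwloc and Libevent names resolve to the same files in the current environment. Because `ldd` honors settings such as `LD_LIBRARY_PATH`, the comparison should be run in the same environment the job will use.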

Contributor

I think this is where you are getting into trouble. Whatever PMIx PRRTE is using, that PMIx and PRRTE must use the same hwloc and libevent.

However, as I said elsewhere, there is no requirement that PRRTE and OMPI use the same PMIx. So there is no transitive property involved here: there is a complete air gap between PRRTE and OMPI.

Member

Yeah, so something like:

  1. Open MPI and the OpenPMIx library that Open MPI links against must be built against the same installation of hwloc and Libevent. It is not required that the runtime and Open MPI be built with the same version of PMIx, but the same hwloc/libevent linking rules also apply to PRRTE and its OpenPMIx library.
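
The rule proposed in the comment above can be modeled as a pairwise consistency check: each consumer (Open MPI or PRRTE) is compared only with the OpenPMIx it links against, never with the other consumer. This is a minimal sketch under that reading. The `Build` record, `check_pair`, and all install prefixes are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    """Which Hwloc and Libevent installs a component was built against."""
    name: str
    hwloc: str
    libevent: str


def check_pair(consumer, pmix):
    """List mismatches between a consumer and the OpenPMIx it links against."""
    problems = []
    if consumer.hwloc != pmix.hwloc:
        problems.append(f"{consumer.name} and {pmix.name}: different Hwloc")
    if consumer.libevent != pmix.libevent:
        problems.append(f"{consumer.name} and {pmix.name}: different Libevent")
    return problems


# Hypothetical layout: Open MPI and its OpenPMIx use the package-manager
# installs, while PRRTE and its own OpenPMIx use installs under /opt.
ompi = Build("Open MPI", "/usr", "/usr")
ompi_pmix = Build("OpenPMIx (used by Open MPI)", "/usr", "/usr")
prrte = Build("PRRTE", "/opt/hwloc", "/opt/libevent")
prrte_pmix = Build("OpenPMIx (used by PRRTE)", "/opt/hwloc", "/opt/libevent")

if __name__ == "__main__":
    # Each pair is checked on its own; Open MPI is never compared with PRRTE.
    for consumer, pmix in ((ompi, ompi_pmix), (prrte, prrte_pmix)):
        print(consumer.name, check_pair(consumer, pmix) or "consistent")
```

In this layout both pairs pass even though Open MPI and PRRTE use different Hwloc and Libevent installs, which matches the point made in the thread that no transitive requirement links the two.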

docs/developers/building-open-mpi.rst (resolved)
manager. Assuming that the package-manager installs of OpenPMIx
and PRRTE were built against the package-manager-provided Hwloc
and Libevent, then Open MPI will *also* need to be built against
the package-manager-provided Hwloc and Libevent. To build Open
Contributor

Again, true for PMIx but not for PRRTE

3 participants