Collective Operations (UCG)
UCG (G for Groups) is an experimental new layer in UCX providing collective operations support for MPI. UCG was implemented on top of the existing layers, and uses UCP, UCT and UCS. More information on how it was designed can be found in the presentation from UCX's 2019 annual meeting.
UCG was introduced into UCX as a git submodule under src/ucg, while the source code actually resides in a separate repository. This means that when you build UCX, it pulls a specific version from that repository, corresponding to your UCX version.
In order to build UCX with UCG, you need to pass flags at two stages during the build:
autogen.sh --with-ucg
configure --enable-ucg <other-arguments>
Once those succeed, make is run as usual, and builds/installs UCG as well as the other UCX components.
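The build sequence above can be sketched as a small shell helper. This is only an illustration, not an official script. The function names (build_ucx_with_ucg, run) and the DRY_RUN switch are hypothetical. It also assumes it is run from the top of the UCX source tree, hence the ./ prefixes, and that "make install" is the install step. Only the --with-ucg and --enable-ucg flags come from the instructions above. By default it just prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the two-stage UCG build (hypothetical helper, not part of UCX).
# DRY_RUN defaults to 1: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_ucx_with_ucg() {
    # Stage 1: generate the build system with UCG included.
    run ./autogen.sh --with-ucg || return 1
    # Stage 2: enable UCG at configure time; extra arguments are passed through.
    run ./configure --enable-ucg "$@" || return 1
    # Build and install as usual (assumed to be plain make / make install).
    run make || return 1
    run make install || return 1
}
```

For example, build_ucx_with_ucg --prefix=/tmp/ucx would pass the standard autoconf --prefix option through to configure in place of <other-arguments>.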
UCG exports an API similar to that of MPI collective operations (e.g. MPI's Allreduce). To make this API callable from MPI, UCG has a dedicated new component for Open MPI, but it is not upstream yet (temporary location). I'm regularly testing it with OSU, but it's still early days w.r.t. MPI applications.
- Building UCG takes forever (hangs on builtin_data.c?): this is a known, unfortunate consequence of the way that code was written. With some GCC versions, this one file takes more than an hour to compile. I tried splitting that code, but it hurt performance, so there is no good solution yet.