Dask/Distributed migration plan #311

Open
pentschev opened this issue Oct 31, 2024 · 0 comments

The next milestone for UCXX is to become the default UCX comms interface in Distributed. Currently there are two ways to use UCX:

  1. protocol="ucx": Uses the legacy UCX-Py code, as has been the case for the past several years;
  2. protocol="ucxx": Uses the UCXX library, which requires the distributed-ucxx package to be installed.

Because the implementation of the comms protocol now lives in the UCXX repo, it is a standalone package and no longer comes with an installation of the distributed packages via conda, PyPI or source. The user must install distributed-ucxx separately, from the packaging repo or from source.

Migration options:

  1. Add a warning to UCX-Py telling users to migrate because it will be deprecated and removed soon, then simply remove protocol="ucx" and force users to switch to protocol="ucxx" when the time comes; or
  2. Introduce a proxy comms protocol="ucx" (replacing the current one, which points directly to UCX-Py) that chooses UCXX if distributed-ucxx is installed (as if protocol="ucxx" had been specified), and otherwise falls back to UCX-Py and warns the user to install distributed-ucxx. This is the most transparent approach and lets us control the timing of the final switch. RAPIDS already has distributed-ucxx installed, so most users should not notice the change; only those who pick a subset of packages will be affected and warned.
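
The selection logic described in option 2 could look roughly like the sketch below. This is not the PoC code: the function name `select_ucx_backend`, the returned labels, the warning category and the injectable `find_spec` parameter are all hypothetical, and it assumes that distributed-ucxx is importable as `distributed_ucxx`.

```python
import importlib.util
import warnings


def select_ucx_backend(module_name="distributed_ucxx",
                       find_spec=importlib.util.find_spec):
    """Pick the backend that a proxy protocol="ucx" would dispatch to.

    Hypothetical helper: returns "ucxx" when the UCXX comms package is
    importable, otherwise warns and returns "ucx-py".
    """
    # Assumption: the distributed-ucxx package installs a module
    # named `distributed_ucxx`.
    if find_spec(module_name) is not None:
        return "ucxx"

    # Fall back to the legacy implementation and tell the user how to migrate.
    warnings.warn(
        "protocol='ucx' is falling back to the legacy UCX-Py backend; "
        "install distributed-ucxx to use UCXX.",
        FutureWarning,
        stacklevel=2,
    )
    return "ucx-py"
```

Passing `find_spec` as a parameter is only there so both branches can be exercised without either package installed; a real proxy would register the chosen comms backend rather than return a string.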

From previous conversations, primarily with @charlesbluca and @rjzamora, option 2 is preferred, with a PoC of the necessary changes in Distributed here.

Once the migration is complete, we will archive the UCX-Py repository and remove the UCX-Py implementation from Distributed along with its tests, which should help reduce the overhead for maintainers of the Distributed repo.
