This issue is intended to be a high-level tracking issue for supporting Dask RAPIDS deployments on Databricks.
High-level goals
Make it easy for users to launch a RAPIDS Dask cluster on an MNMG Databricks cluster.
Support running both Dask RAPIDS and Spark RAPIDS on the same MNMG Databricks cluster.
Show a workflow that uses both Spark RAPIDS and Dask RAPIDS together (or show alternate implementations of the same thing using the same cluster).
Show a workflow that uses dask-deltatable to read data from Delta Lake (see the sketch below).
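To make that last goal concrete, here is a minimal, hedged sketch of reading a Delta Lake table into a Dask DataFrame. The entry point name and the table path are assumptions: recent dask-deltatable releases expose read_deltalake, while older ones used read_delta_table.

```python
# Hedged sketch of the dask-deltatable goal above. The function name and path are
# assumptions: recent dask-deltatable releases expose `read_deltalake`, older ones
# used `read_delta_table`; adjust to whichever the installed version provides.
import dask_deltatable as ddt

# Read a Delta Lake table (e.g. one written by Spark on DBFS) into a Dask DataFrame.
ddf = ddt.read_deltalake("/dbfs/path/to/delta-table")  # hypothetical path

# Downstream this could be moved onto GPU-backed partitions, e.g. via dask_cudf.
print(ddf.head())
```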
Databricks multi-node technical architecture
When you launch a multi-node cluster on Databricks, a Spark driver node and many Spark worker nodes are provisioned. When you run a notebook session, the notebook kernel is executed on the driver node and the Spark cluster can be leveraged using pyspark.
To use Spark RAPIDS with this Databricks cluster, the user must provide an init script that downloads the rapids-4-spark-xxxx.jar plugin and then configure Spark to load it. Spark queries will then leverage libcudf under the hood and benefit from GPU acceleration.
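For reference, a hedged sketch of the Spark configuration involved. On Databricks these settings would normally go in the cluster's Spark config alongside the init script rather than in notebook code; the config keys and plugin class below follow the Spark RAPIDS documentation as I understand it and should be treated as assumptions.

```python
# Hedged sketch: the kind of Spark configuration that enables the RAPIDS plugin.
# On Databricks these settings normally live in the cluster's Spark config (next to
# the init script that downloads rapids-4-spark-xxxx.jar), not in notebook code; the
# keys below are assumptions based on the Spark RAPIDS docs.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")  # load the RAPIDS SQL plugin
    .config("spark.rapids.sql.enabled", "true")             # enable GPU query acceleration
    .getOrCreate()
)

# With the plugin active, ordinary DataFrame queries run on libcudf under the hood.
spark.range(1_000_000).selectExpr("sum(id)").show()
```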
Adding Dask to Databricks clusters
This driver/worker cluster architecture is the same shape as a Dask cluster, which has a scheduler and workers on different nodes, and the Spark RAPIDS example shows that Databricks provides a mechanism to run a script on every node at startup to install plugins.
Because a script runs on every node at startup, we could create a Dask Runner (xref dask/community#346) that bootstraps a Dask cluster as part of the init script process.
When the init script is run it appears to have access to a number of environment variables, the most useful of which are DB_IS_DRIVER and DB_DRIVER_IP. These provide all the information we need to have the driver node start a dask scheduler process and the worker nodes start a dask cuda worker process that connects to the scheduler. These processes would need to be started in the background so that the init script can complete and move on with the rest of the Spark cluster startup.
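As a rough illustration of the shape this could take, below is a hedged Python sketch of the per-node bootstrap logic an init script might invoke. The environment variable values, scheduler port, and CLI invocations are assumptions; depending on installed versions the CLI may instead be the hyphenated dask-scheduler / dask-cuda-worker commands.

```python
# Hedged sketch of per-node bootstrap logic an init script could invoke.
# Assumptions: dask, distributed, and dask-cuda are installed on every node,
# DB_IS_DRIVER is the string "TRUE" on the driver node, and the scheduler
# listens on the default port 8786.
import os
import subprocess


def start_dask_processes():
    if os.environ.get("DB_IS_DRIVER", "").upper() == "TRUE":
        # Driver node: start the scheduler in the background so the init
        # script can return and the Spark cluster startup can continue.
        subprocess.Popen(["dask", "scheduler"])
    else:
        # Worker node: start a GPU worker pointed at the scheduler on the driver.
        scheduler = f"tcp://{os.environ['DB_DRIVER_IP']}:8786"
        subprocess.Popen(["dask", "cuda", "worker", scheduler])


if __name__ == "__main__":
    start_dask_processes()
```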
Then, when using a notebook with the kernel running on the driver node, we should be able to use dask_cudf and other RAPIDS Dask integrations to communicate with the Dask scheduler on localhost, the same way pyspark does.
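From the notebook side that would look roughly like the following; the scheduler port, data path, and column name are assumptions for illustration.

```python
# Hedged sketch: from a Databricks notebook (kernel on the driver node), connect to
# the scheduler bootstrapped by the init script. The default port 8786 and the data
# path are assumptions.
from dask.distributed import Client
import dask_cudf

client = Client("localhost:8786")

# GPU-backed dataframe work is distributed across the dask-cuda workers.
ddf = dask_cudf.read_parquet("/dbfs/path/to/data/*.parquet")  # hypothetical path
print(ddf.groupby("key").size().compute())  # "key" is an illustrative column name
```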
Deliverables
Tasks
Other info
@jacobtomlinson has started experimenting with the Dask Runners concept in https://github.com/jacobtomlinson/dask-runners (which will likely move to the dask-contrib org at some point). This is probably a reasonable place to prototype this in a PR.