Document multi-node with Dask on Databricks #228

Closed
Tracked by #296
jacobtomlinson opened this issue May 2, 2023 · 1 comment
Labels
doc Improvements or additions to documentation platform/databricks

Comments

@jacobtomlinson
Member

jacobtomlinson commented May 2, 2023

Our current Databricks documentation shows how to launch a Databricks cluster (single and multi-node) with the RAPIDS environment. However, it doesn't discuss how to use the multi-node cluster with Dask.

We need to document this.

I expect this will involve something along the lines of dask-yarn to start the scheduler and worker processes, with dask-cuda-worker as the worker startup command. However, Databricks doesn't use YARN, so we would need to explore whatever it does use.
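As a rough sketch of what "start the scheduler and worker processes" could look like without YARN: one option is to decide each node's role at startup. The example below assumes the commands run from a Databricks cluster init script and that the `DB_IS_DRIVER` and `DB_DRIVER_IP` environment variables are available there. Neither assumption comes from this issue. The helper name `build_dask_command` is hypothetical, and nothing here has been verified on a real cluster.

```python
from typing import List, Mapping

SCHEDULER_PORT = 8786  # Dask's default scheduler port


def build_dask_command(env: Mapping[str, str]) -> List[str]:
    """Return the command this node should run, based on init-script env vars.

    Assumption: DB_IS_DRIVER is "TRUE" on the driver node and DB_DRIVER_IP
    holds the driver's address on every node.
    """
    if env.get("DB_IS_DRIVER", "").upper() == "TRUE":
        # Driver node: host the Dask scheduler.
        return ["dask-scheduler", "--port", str(SCHEDULER_PORT)]
    # Worker node: start GPU workers that connect back to the driver.
    driver_ip = env["DB_DRIVER_IP"]
    return ["dask-cuda-worker", f"tcp://{driver_ip}:{SCHEDULER_PORT}"]
```

An init script could pass the result to `subprocess.Popen(build_dask_command(os.environ))` so the process keeps running after the script exits. Whether that is the right launch mechanism on Databricks is exactly what still needs exploring.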

Related issue #231 covers documenting the single-node setup.

@jacobtomlinson jacobtomlinson added doc Improvements or additions to documentation platform/databricks labels May 2, 2023
@beckernick
Member

Should we file a separate issue for documenting how to use a single-node Dask cluster on Databricks?

Should it be as simple as spinning up a LocalCUDACluster on the node? A quick Google search turns up few results or examples.
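For reference, a minimal sketch of the single-node approach suggested above, assuming `dask-cuda` and `distributed` are installed in the notebook environment and at least one GPU is visible. The function name `start_single_node_cluster` is hypothetical, and this has not been confirmed to work on Databricks.

```python
def start_single_node_cluster():
    """Start a local GPU cluster on this node and return a connected client."""
    # Imported lazily so this definition loads even where dask-cuda is absent.
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    # With no arguments, LocalCUDACluster starts one worker per visible GPU.
    cluster = LocalCUDACluster()
    return Client(cluster)
```

In a notebook cell, `client = start_single_node_cluster()` would then make subsequent Dask computations use the local GPU workers.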

3 participants