Enable support for PVCs in Ray cluster nodes #1087

Closed
2 tasks done
blublinsky opened this issue May 12, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@blublinsky
Contributor

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

Ray 1.3+ spills objects to external storage once the object store is full (https://docs.ray.io/en/latest/ray-core/objects/object-spilling.html). This matters most when Ray Datasets are used, because they rely heavily on the object store. With KubeRay, only a minimal local disk is mapped by default. Object spilling can be directed to a specific directory or to S3. Spilling to S3 is possible but can be quite slow, so spilling to disk is the better option: mount a volume into the pod and point spilling at a directory on that volume (via rayStartParams).
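
For illustration, here is a minimal sketch of the spilling configuration described above. It assumes the filesystem spilling format from the Ray object spilling docs; the helper name and the `/mnt/spill` path are hypothetical, and how the resulting string is quoted inside the cluster spec is left open.

```python
import json


def build_spilling_system_config(directory_path: str) -> str:
    """Build the system config JSON that points Ray object spilling at a directory."""
    spilling = {"type": "filesystem", "params": {"directory_path": directory_path}}
    # object_spilling_config is itself a JSON-encoded string nested inside
    # the system config, hence the two json.dumps calls.
    return json.dumps({"object_spilling_config": json.dumps(spilling)})


# "/mnt/spill" is an assumed mount path of the extra volume, not a KubeRay default.
system_config = build_spilling_system_config("/mnt/spill")
print(system_config)
```

On a cluster, a string like this would be handed to the head node as the `--system-config` option of `ray start`; in a local script the equivalent dictionary can be passed as `ray.init(_system_config=...)`.
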
Kubernetes provides 3 options for mapping disk to a pod:

  1. Using local ephemeral storage (https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#local-ephemeral-storage). Nodes have local ephemeral storage, backed by locally attached writeable devices or, sometimes, by RAM, which can be mapped into the pod. This works, but the amount of such disk space is usually quite limited.
  2. Using ephemeral volumes (https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#csi-ephemeral-volumes), more specifically generic ephemeral volumes. Unfortunately, this feature is quite new and is not widely supported.
  3. Using PVCs. This is probably the most general solution.
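
To make the three options concrete, the sketch below builds the corresponding `volumes` entries of a pod spec as plain Python dictionaries. The field names follow the Kubernetes pod API; the helper names, volume names, and sizes are illustrative assumptions only.

```python
from typing import Optional


def empty_dir_volume(name: str, size_limit: str) -> dict:
    """Option 1: node-local ephemeral storage via an emptyDir volume."""
    return {"name": name, "emptyDir": {"sizeLimit": size_limit}}


def generic_ephemeral_volume(name: str, storage: str,
                             storage_class: Optional[str] = None) -> dict:
    """Option 2: generic ephemeral volume; the claim is created and deleted with the pod."""
    claim_spec = {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": storage}},
    }
    if storage_class is not None:
        claim_spec["storageClassName"] = storage_class
    return {"name": name, "ephemeral": {"volumeClaimTemplate": {"spec": claim_spec}}}


def pvc_volume(name: str, claim_name: str) -> dict:
    """Option 3: reference an already existing PVC by name."""
    return {"name": name, "persistentVolumeClaim": {"claimName": claim_name}}


def volume_mount(name: str, mount_path: str) -> dict:
    """Container-side mount that pairs with any of the volumes above."""
    return {"name": name, "mountPath": mount_path}
```

In all three cases the container mounts the volume the same way, so the spilling directory configuration does not change with the chosen option.
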

Use case

Object spilling is the most prominent use case, but mapping additional storage for other execution needs can also be quite useful.

Related issues

The issue with using PVCs with pods is that a separate PVC has to be created explicitly for every pod; only then can the pod use it, referencing it by name. There are two options: pass the operator a list of PVCs that are defined outside, or create the PVC with appropriate parameters as part of pod creation. The first option seems very error-prone, and I think the second is by far the better solution.
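
As a rough sketch of the second option, the code below generates one PVC manifest per pod and wires the claim into that pod's spec by name. It is not the operator's implementation: the function names, the `<pod>-spill` naming scheme, and the mount path are all hypothetical.

```python
import copy
from typing import Optional


def pvc_manifest(claim_name: str, storage: str,
                 storage_class: Optional[str] = None) -> dict:
    """Build a PersistentVolumeClaim manifest with the requested size."""
    spec = {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": storage}},
    }
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name},
        "spec": spec,
    }


def with_pvc(pod_spec: dict, claim_name: str, mount_path: str,
             volume_name: str = "spill") -> dict:
    """Return a copy of pod_spec that mounts the named claim in every container."""
    spec = copy.deepcopy(pod_spec)  # leave the shared template untouched
    spec.setdefault("volumes", []).append(
        {"name": volume_name, "persistentVolumeClaim": {"claimName": claim_name}}
    )
    for container in spec.get("containers", []):
        container.setdefault("volumeMounts", []).append(
            {"name": volume_name, "mountPath": mount_path}
        )
    return spec


def claims_for_pods(pod_names: list, storage: str) -> dict:
    """One claim per pod, named after the pod (assumed naming scheme)."""
    return {pod: pvc_manifest(f"{pod}-spill", storage) for pod in pod_names}
```

Deriving the claim name from the pod name keeps the pairing implicit, which avoids the bookkeeping that makes an externally supplied PVC list error-prone.
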

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@blublinsky blublinsky added the enhancement New feature or request label May 12, 2023
@psschwei
Contributor

This is a feature I'd also be interested in seeing added.

cc @Jeffwan re:

// TODO(Jeffwan@): handle PVC in the future

@psschwei
Contributor

I went ahead and opened a PR for this, as it's something we could use sooner rather than later.

@kevin85421
Member

@blublinsky has identified that this issue can be closed.
