[release] Redirect users to Ray website (ray-project#1431)
kevin85421 committed Oct 17, 2023
1 parent 9794249 commit 11bfdfa
Showing 20 changed files with 20 additions and 2,903 deletions.
61 changes: 1 addition & 60 deletions docs/guidance/FAQ.md
@@ -1,60 +1 @@
# Frequently Asked Questions

Welcome to the Frequently Asked Questions page for KubeRay. This document addresses common inquiries.
If you don't find an answer to your question here, please don't hesitate to connect with us via our [community channels](https://github.com/ray-project/kuberay#getting-involved).

# Contents
- [Worker init container](#worker-init-container)
- [Cluster domain](#cluster-domain)
- [RayService](#rayservice)

## Worker init container

The KubeRay operator will inject a default [init container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) into every worker Pod.
This init container is responsible for waiting until the Global Control Service (GCS) on the head Pod is ready before establishing a connection to the head.
The init container will use `ray health-check` to check the GCS server status continuously.

The default worker init container may not work for all use cases, or users may want to customize the init container.

### 1. Init container troubleshooting

Some common causes for the worker init container to get stuck in `Init:0/1` status are:

* The GCS server process has failed in the head Pod. Please inspect the log directory `/tmp/ray/session_latest/logs/` in the head Pod for errors related to the GCS server.
* The `ray` executable is not included in the `$PATH` for the image, so the init container will fail to run `ray health-check`.
* The `CLUSTER_DOMAIN` environment variable is not set correctly. See the section [cluster domain](#cluster-domain) for more details.
* The worker init container shares the same ***ImagePullPolicy***, ***SecurityContext***, ***Env***, ***VolumeMounts***, and ***Resources*** as the worker Pod template. Sharing these settings can cause a deadlock. See [#1130](https://github.com/ray-project/kuberay/issues/1130) for more details.

If the init container remains stuck in `Init:0/1` status for 2 minutes, we will stop redirecting the output messages to `/dev/null` and instead print them to the worker Pod logs.
To troubleshoot further, you can inspect the logs using `kubectl logs`.
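
For example, a rough sketch of inspecting a stuck worker (the `ray.io/node-type=worker` label follows KubeRay's labeling convention; the init container name `wait-gcs-ready` is what recent KubeRay versions inject, so verify it with `kubectl describe pod` if your version differs):

```sh
# Find a worker Pod that is stuck in Init:0/1.
kubectl get pods -l ray.io/node-type=worker

# Inspect the init container's logs.
kubectl logs ${WORKER_POD} -c wait-gcs-ready
```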

### 2. Disable the init container injection

If you want to customize the worker init container, you can disable the init container injection and add your own.
To disable the injection, set the `ENABLE_INIT_CONTAINER_INJECTION` environment variable in the KubeRay operator to `false` (applicable from KubeRay v0.5.2).
Please refer to [#1069](https://github.com/ray-project/kuberay/pull/1069) and the [KubeRay Helm chart](https://github.com/ray-project/kuberay/blob/ddb5e528c29c2e1fb80994f05b1bd162ecbaf9f2/helm-chart/kuberay-operator/values.yaml#L83-L87) for instructions on how to set the environment variable.
Once disabled, you can add your custom init container to the worker Pod template.
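
As a rough sketch (the group name, image, head service FQDN, and GCS port are placeholders rather than KubeRay's exact defaults), a custom worker init container might look like this in the RayCluster spec:

```yaml
workerGroupSpecs:
- groupName: small-group
  replicas: 1
  rayStartParams: {}
  template:
    spec:
      initContainers:
      - name: wait-for-gcs          # custom replacement for the injected init container
        image: rayproject/ray:2.6.3 # must have the `ray` executable on $PATH
        command: ["/bin/bash", "-lc", "--"]
        args:
        # Poll the GCS on the head service until it responds, then let the worker start.
        - "until ray health-check --address raycluster-kuberay-head-svc.default.svc.cluster.local:6379 > /dev/null 2>&1; do echo waiting for GCS; sleep 2; done"
      containers:
      - name: ray-worker
        image: rayproject/ray:2.6.3
```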

## Cluster domain

In KubeRay, we use Fully Qualified Domain Names (FQDNs) to establish connections between workers and the head.
The FQDN of the head service is `${HEAD_SVC}.${NAMESPACE}.svc.${CLUSTER_DOMAIN}`.
The default [cluster domain](https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#introduction) is `cluster.local`, which works for most Kubernetes clusters.
However, it's important to note that some clusters may have a different cluster domain.
You can check the cluster domain of your Kubernetes cluster by checking `/etc/resolv.conf` in a Pod.
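
For example (a sketch; the Pod name and nameserver IP are illustrative):

```sh
kubectl exec -it ${YOUR_HEAD_POD} -- cat /etc/resolv.conf

# Example output: the last entry of the `search` line is the cluster domain.
# search default.svc.cluster.local svc.cluster.local cluster.local
# nameserver 10.96.0.10
```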

To set a custom cluster domain, adjust the `CLUSTER_DOMAIN` environment variable in the KubeRay operator.
Helm chart users can make this modification [here](https://github.com/ray-project/kuberay/blob/ddb5e528c29c2e1fb80994f05b1bd162ecbaf9f2/helm-chart/kuberay-operator/values.yaml#L88-L91).
For more information, see [#951](https://github.com/ray-project/kuberay/pull/951) and [#938](https://github.com/ray-project/kuberay/pull/938).
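
For a quick experiment (a sketch; it assumes the operator Deployment is named `kuberay-operator`, and the change does not persist across Helm upgrades), you can also set the variable directly on the running operator:

```sh
kubectl set env deployment/kuberay-operator CLUSTER_DOMAIN=example.org
```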

## RayService

RayService is a Custom Resource Definition (CRD) designed for Ray Serve. In KubeRay, creating a RayService will first create a RayCluster and then
create Ray Serve applications once the RayCluster is ready. If the issue pertains to the data plane, specifically your Ray Serve scripts
or Ray Serve configurations (`serveConfigV2`), troubleshooting may be challenging. See [rayservice-troubleshooting](rayservice-troubleshooting.md) for more details.

## Questions

### Why are my changes to RayCluster/RayJob CR not taking effect?

Currently, only modifications to the `replicas` field in `RayCluster/RayJob` CR are supported. Changes to other fields may not take effect or could lead to unexpected results.
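
For example (a sketch; the RayCluster name and worker group index are placeholders), scaling a worker group by patching only its `replicas` field is supported:

```sh
kubectl patch raycluster raycluster-kuberay --type json \
  -p '[{"op": "replace", "path": "/spec/workerGroupSpecs/0/replicas", "value": 2}]'
```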
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/troubleshooting/troubleshooting.html#kuberay-troubleshootin-guides).
112 changes: 1 addition & 111 deletions docs/guidance/autoscaler.md
@@ -1,111 +1 @@
## Autoscaler (beta)

Ray Autoscaler integration has been in beta since KubeRay 0.3.0 and Ray 2.0.0.
While autoscaling functionality is stable, the details of autoscaler behavior and configuration may change in future releases.

See the [official Ray documentation](https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/configuring-autoscaling.html) for even more information about Ray autoscaling on Kubernetes.

### Prerequisite

* Follow this [document](https://github.com/ray-project/kuberay/blob/master/helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository.

### Deploy a cluster with autoscaling enabled

Next, to deploy a sample autoscaling Ray cluster, run
```
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/release-0.5/ray-operator/config/samples/ray-cluster.autoscaler.yaml
```

See the above config file for details on autoscaling configuration.

!!! note

Ray container resource requests and limits in the example configuration above are too small
to be used in production. For typical use-cases, you should use large Ray pods. If possible,
each Ray pod should be sized to take up its entire K8s node. We don't recommend
allocating less than 8 gigabytes of memory for Ray containers running in production.
For an autoscaling configuration more suitable for production, see
[ray-cluster.autoscaler.large.yaml](https://raw.githubusercontent.com/ray-project/kuberay/release-0.5/ray-operator/config/samples/ray-cluster.autoscaler.large.yaml).

The output of `kubectl get pods` should indicate the presence of
a Ray head pod with two containers,
the Ray container and the autoscaler container.
You should also see a Ray worker pod with a single Ray container.


```
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
raycluster-autoscaler-head-mgwwk 2/2 Running 0 4m41s
raycluster-autoscaler-worker-small-group-fg4fv 1/1 Running 0 4m41s
```

Check the autoscaler container's logs to confirm that the autoscaler is healthy.
Here's an example of logs from a healthy autoscaler.
```
kubectl logs -f raycluster-autoscaler-head-mgwwk autoscaler
2022-03-10 07:51:22,616 INFO monitor.py:226 -- Starting autoscaler metrics server on port 44217
2022-03-10 07:51:22,621 INFO monitor.py:243 -- Monitor: Started
2022-03-10 07:51:22,824 INFO node_provider.py:143 -- Creating KuberayNodeProvider.
2022-03-10 07:51:22,825 INFO autoscaler.py:282 -- StandardAutoscaler: {'provider': {'type': 'kuberay', 'namespace': 'default', 'disable_node_updaters': True, 'disable_launch_config_check': True}, 'cluster_name': 'raycluster-autoscaler', 'head_node_type': 'head-group', 'available_node_types': {'head-group': {'min_workers': 0, 'max_workers': 0, 'node_config': {}, 'resources': {'CPU': 1}}, 'small-group': {'min_workers': 1, 'max_workers': 300, 'node_config': {}, 'resources': {'CPU': 1}}}, 'max_workers': 300, 'idle_timeout_minutes': 5, 'upscaling_speed': 1, 'file_mounts': {}, 'cluster_synced_files': [], 'file_mounts_sync_continuously': False, 'initialization_commands': [], 'setup_commands': [], 'head_setup_commands': [], 'worker_setup_commands': [], 'head_start_ray_commands': [], 'worker_start_ray_commands': [], 'auth': {}, 'head_node': {}, 'worker_nodes': {}}
2022-03-10 07:51:23,027 INFO autoscaler.py:327 --
======== Autoscaler status: 2022-03-10 07:51:23.027271 ========
Node status
---------------------------------------------------------------
Healthy:
1 head-group
Pending:
(no pending nodes)
Recent failures:
(no failures)
Resources
---------------------------------------------------------------
Usage:
0.0/1.0 CPU
0.00/0.931 GiB memory
0.00/0.200 GiB object_store_memory
Demands:
(no resource demands)
```

#### Notes

1. To enable autoscaling, set your RayCluster CR's `spec.enableInTreeAutoscaling` field to true.
The operator will then automatically inject a preconfigured autoscaler container into the head pod.
The operator creates the service account, role, and role binding needed by the autoscaler out of the box.
The operator will also configure an empty-dir logging volume for the Ray head pod. The volume will be mounted into the Ray and
autoscaler containers; this is necessary to support the event logging introduced in [Ray PR #13434](https://github.com/ray-project/ray/pull/13434).

```
spec:
enableInTreeAutoscaling: true
```
2. If your RayCluster CR's `spec.rayVersion` field is at least `2.0.0`, the autoscaler container will use the same image as the Ray container.
For Ray versions older than 2.0.0, the image `rayproject/ray:2.0.0` will be used to run the autoscaler.
3. Autoscaling functionality is supported only with Ray versions at least as new as 1.11.0. Autoscaler support
is beta as of Ray 2.0.0 and KubeRay 0.3.0; while autoscaling functionality is stable, the details of autoscaler behavior and configuration may change in future releases.
### Test autoscaling
Let's now try out the autoscaler. Run the following commands to scale up the cluster:
```
export HEAD_POD=$(kubectl get pods -o custom-columns=POD:metadata.name | grep raycluster-autoscaler-head)
kubectl exec $HEAD_POD -it -c ray-head -- python -c "import ray;ray.init();ray.autoscaler.sdk.request_resources(num_cpus=4)"
```
You should then see two extra Ray nodes (pods) scale up to satisfy the 4 CPU demand.
```
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
raycluster-autoscaler-head-mgwwk 2/2 Running 0 4m41s
raycluster-autoscaler-worker-small-group-4d255 1/1 Running 0 40s
raycluster-autoscaler-worker-small-group-fg4fv 1/1 Running 0 4m41s
raycluster-autoscaler-worker-small-group-qzhvg 1/1 Running 0 40s
```
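
To scale back down, you can reset the resource request to zero (a sketch reusing the same `$HEAD_POD` variable); idle workers are then removed after `idle_timeout_minutes` (5 minutes in the configuration shown in the logs above):
```
kubectl exec $HEAD_POD -it -c ray-head -- python -c "import ray;ray.init();ray.autoscaler.sdk.request_resources(num_cpus=0)"
```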
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/configuring-autoscaling.html#kuberay-autoscaling).
71 changes: 1 addition & 70 deletions docs/guidance/aws-eks-gpu-cluster.md
@@ -1,70 +1 @@
# Start Amazon EKS Cluster with GPUs for KubeRay

## Step 1: Create a Kubernetes cluster on Amazon EKS

Follow the first two steps in [this AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/getting-started-console.html#)
to: (1) create your Amazon EKS cluster and (2) configure your computer to communicate with your cluster.

## Step 2: Create node groups for the Amazon EKS cluster

Follow "Step 3: Create nodes" in [this AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/getting-started-console.html#) to create node groups. The following section provides more detailed information.

### Create a CPU node group

Typically, avoid running GPU workloads on the Ray head. Create a CPU node group for all Pods except Ray GPU
workers, such as the KubeRay operator, Ray head, and CoreDNS Pods.

Here's a common configuration that works for most KubeRay examples in the docs:
* Instance type: [**m5.xlarge**](https://aws.amazon.com/ec2/instance-types/m5/) (4 vCPU; 16 GB RAM)
* Disk size: 256 GB
* Desired size: 1, Min size: 0, Max size: 1

### Create a GPU node group

Create a GPU node group for Ray GPU workers.

1. Here's a common configuration that works for most KubeRay examples in the docs:
* AMI type: Bottlerocket NVIDIA (BOTTLEROCKET_x86_64_NVIDIA)
* Instance type: [**g5.xlarge**](https://aws.amazon.com/ec2/instance-types/g5/) (1 GPU; 24 GB GPU Memory; 4 vCPUs; 16 GB RAM)
* Disk size: 1024 GB
* Desired size: 1, Min size: 0, Max size: 1

> **Note:** If you encounter permission issues with `kubectl`, follow "Step 2: Configure your computer to communicate with your cluster"
in the [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/getting-started-console.html#).

2. Install the NVIDIA device plugin. Note: You don't need this if you used the `BOTTLEROCKET_x86_64_NVIDIA` AMI in the step above.
* Install the DaemonSet for the NVIDIA device plugin to run GPU-enabled containers in your Amazon EKS cluster. You can refer to the [Amazon EKS optimized accelerated Amazon Linux AMIs](https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html#gpu-ami)
or the [NVIDIA/k8s-device-plugin](https://github.com/NVIDIA/k8s-device-plugin) repository for more details.
* If the GPU nodes have taints, add `tolerations` to `nvidia-device-plugin.yml` to enable the DaemonSet to schedule Pods on the GPU nodes.

```sh
# Install the DaemonSet
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.9.0/nvidia-device-plugin.yml

# Verify that your nodes have allocatable GPUs. If the GPU node fails to detect GPUs,
# please verify whether the DaemonSet schedules the Pod on the GPU node.
kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"

# Example output:
# NAME GPU
# ip-....us-west-2.compute.internal 4
# ip-....us-west-2.compute.internal <none>
```

3. Add a Kubernetes taint to prevent scheduling CPU Pods on this GPU node group. For KubeRay examples, add the following taint to the GPU nodes: `Key: ray.io/node-type, Value: worker, Effect: NoSchedule`, and include the corresponding `tolerations` for GPU Ray worker Pods.
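
For example, a sketch of tainting an existing node with `kubectl` (the node name is a placeholder; you can also set the taint at the node-group level when creating the group):

```sh
# List the GPU nodes, then taint them so only Pods with a matching toleration are scheduled there.
kubectl get nodes
kubectl taint nodes ${YOUR_GPU_NODE_NAME} ray.io/node-type=worker:NoSchedule
```

The corresponding `tolerations` block on the GPU Ray worker Pods uses the same key, value, and effect.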

> Warning: GPU nodes are extremely expensive. Please remember to delete the cluster if you no longer need it.
## Step 3: Verify the node groups

> **Note:** If you encounter permission issues with `eksctl`, navigate to your AWS account's webpage and copy the
credential environment variables, including `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN`,
from the "Command line or programmatic access" page.

```sh
eksctl get nodegroup --cluster ${YOUR_EKS_NAME}

# CLUSTER NODEGROUP STATUS CREATED MIN SIZE MAX SIZE DESIRED CAPACITY INSTANCE TYPE IMAGE ID ASG NAME TYPE
# ${YOUR_EKS_NAME} cpu-node-group ACTIVE 2023-06-05T21:31:49Z 0 1 1 m5.xlarge AL2_x86_64 eks-cpu-node-group-... managed
# ${YOUR_EKS_NAME} gpu-node-group ACTIVE 2023-06-05T22:01:44Z 0 1 1 g5.12xlarge BOTTLEROCKET_x86_64_NVIDIA eks-gpu-node-group-... managed
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/k8s-cluster-setup.html#kuberay-k8s-setup).
75 changes: 1 addition & 74 deletions docs/guidance/gcp-gke-gpu-cluster.md
@@ -1,74 +1 @@
# Start Google Cloud GKE Cluster with GPUs for KubeRay

## Step 1: Create a Kubernetes cluster on GKE

Run this command and all following commands on your local machine or on the [Google Cloud Shell](https://cloud.google.com/shell). If running from your local machine, you will need to install the [Google Cloud SDK](https://cloud.google.com/sdk/docs/install). The following command creates a Kubernetes cluster named `kuberay-gpu-cluster` with 1 CPU node in the `us-west1-b` zone. In this example, we use the `e2-standard-4` machine type, which has 4 vCPUs and 16 GB RAM.

```sh
gcloud container clusters create kuberay-gpu-cluster \
--num-nodes=1 --min-nodes 0 --max-nodes 1 --enable-autoscaling \
--zone=us-west1-b --machine-type e2-standard-4
```

> Note: You can also create a cluster from the [Google Cloud Console](https://console.cloud.google.com/kubernetes/list).
## Step 2: Create a GPU node pool

Run the following command to create a GPU node pool for Ray GPU workers.
(You can also create it from the Google Cloud Console; see the [GKE documentation](https://cloud.google.com/kubernetes-engine/docs/how-to/node-taints#create_a_node_pool_with_node_taints) for more details.)

```sh
gcloud container node-pools create gpu-node-pool \
--accelerator type=nvidia-l4-vws,count=1 \
--zone us-west1-b \
--cluster kuberay-gpu-cluster \
--num-nodes 1 \
--min-nodes 0 \
--max-nodes 1 \
--enable-autoscaling \
--machine-type g2-standard-4 \
--node-taints=ray.io/node-type=worker:NoSchedule
```

The `--accelerator` flag specifies the type and number of GPUs for each node in the node pool. In this example, we use the [NVIDIA L4](https://cloud.google.com/compute/docs/gpus#l4-gpus) GPU. The machine type `g2-standard-4` has 1 GPU, 24 GB GPU Memory, 4 vCPUs and 16 GB RAM.

The taint `ray.io/node-type=worker:NoSchedule` prevents CPU-only Pods such as the KubeRay operator, Ray head, and CoreDNS Pods from being scheduled on this GPU node pool. This is because GPUs are expensive, so we want to use this node pool for Ray GPU workers only.

Concretely, any Pod that does not have the following toleration will not be scheduled on this GPU node pool:

```yaml
tolerations:
- key: ray.io/node-type
operator: Equal
value: worker
effect: NoSchedule
```
For more on taints and tolerations, see the [Kubernetes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/).
## Step 3: Configure `kubectl` to connect to the cluster

Run the following command to download Google Cloud credentials and configure the Kubernetes CLI to use them.

```sh
gcloud container clusters get-credentials kuberay-gpu-cluster --zone us-west1-b
```

For more details, see the [GKE documentation](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl).

## Step 4: Install NVIDIA GPU device drivers

This step is required for GPU support on GKE. See the [GKE documentation](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#installing_drivers) for more details.

```sh
# Install NVIDIA GPU device driver
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded-latest.yaml
# Verify that your nodes have allocatable GPUs
kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
# Example output:
# NAME GPU
# gke-kuberay-gpu-cluster-gpu-node-pool-xxxxx 1
# gke-kuberay-gpu-cluster-default-pool-xxxxx <none>
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/k8s-cluster-setup.html#kuberay-k8s-setup).
119 changes: 1 addition & 118 deletions docs/guidance/gcs-ft.md
@@ -1,118 +1 @@
## Ray GCS Fault Tolerance (GCS FT) (Beta release)

> **Note**: This feature is beta.

Ray GCS FT enables the GCS server to use an external storage backend. As a result, Ray clusters can tolerate GCS failures and recover from them
without affecting important services such as detached actors and RayServe deployments.

### Prerequisite

* Ray 2.0 is required.
* You need to provide an external Redis server for Ray. (A Redis HA cluster is highly recommended.)

### Enable Ray GCS FT

To enable Ray GCS FT in a KubeRay-managed Ray cluster, add an annotation to the RayCluster YAML file.

```yaml
...
kind: RayCluster
metadata:
  annotations:
    ray.io/ft-enabled: "true" # <- add this annotation to enable GCS FT
    ray.io/external-storage-namespace: "my-raycluster-storage-namespace" # <- optional, specify the external storage namespace
...
```
An example can be found at [ray-cluster.external-redis.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-cluster.external-redis.yaml)

When the annotation `ray.io/ft-enabled` is set to `true`, KubeRay enables the Ray GCS FT feature. This feature
consists of several components:

1. Newly created Ray clusters have a `Readiness Probe` and a `Liveness Probe` added to all head/worker nodes.
2. The KubeRay operator controller watches for `Event` object changes, which notify it of readiness probe failures so it can mark the affected nodes as `Unhealthy`.
3. The KubeRay operator controller kills and recreates any `Unhealthy` Ray head/worker node.

### Implementation Details

#### Readiness Probe vs Liveness Probe

These are the two types of probes used in Ray GCS FT.

The readiness probe notifies KubeRay of failures in the corresponding Ray cluster so that KubeRay can try its best to
recover the cluster. If KubeRay cannot recover the failed head/worker node, the liveness probe kicks in, deletes the old pod,
and creates a new pod.

By default, the liveness probe kicks in later than the readiness probe; it is the last resort for recovering the
Ray cluster. However, in the current implementation, readiness probe failures also cause KubeRay to kill and recreate the corresponding head/worker pod.

Currently, the readiness probe and the liveness probe use the same command. In the future, they may run
different commands.

On the Ray head node, the probes access a local Ray dashboard HTTP endpoint and a Raylet HTTP endpoint to make sure the head node is
healthy. Since the Ray dashboard does not run on Ray worker nodes, only the local Raylet HTTP endpoint is checked to verify that
a worker node is healthy.

#### Ray GCS FT Annotation

The Ray GCS FT feature checks whether the annotation `ray.io/ft-enabled` is set to `true` in the `RayCluster` YAML file. If so, KubeRay
also adds the annotation to each head/worker pod it creates.

#### Use External Redis Cluster

To use an external Redis cluster as the backend storage (required by Ray GCS FT),
add the `RAY_REDIS_ADDRESS` environment variable to the head node template.

You can also specify a storage namespace for your Ray cluster with the annotation `ray.io/external-storage-namespace`.

An example can be found at [ray-cluster.external-redis.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-cluster.external-redis.yaml)
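
A rough sketch of the relevant pieces of the RayCluster YAML (the Redis address and container image are placeholders; see the linked sample for a complete manifest):

```yaml
kind: RayCluster
metadata:
  annotations:
    ray.io/ft-enabled: "true"
    ray.io/external-storage-namespace: "my-raycluster-storage-namespace"
spec:
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.6.3
          env:
          - name: RAY_REDIS_ADDRESS
            value: "redis:6379" # address of your external Redis service (placeholder)
```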

To use SSL/TLS for the connection, prefix the Redis address with `rediss://` instead of `redis://`. This feature is only available in Ray 2.2 and above.

You can also specify additional environment variables in the head pod to customize the SSL configuration:

- `RAY_REDIS_CA_CERT` The location of the CA certificate (optional)
- `RAY_REDIS_CA_PATH` Path of trusted certificates (optional)
- `RAY_REDIS_CLIENT_CERT` File name of client certificate file (optional)
- `RAY_REDIS_CLIENT_KEY` File name of client private key (optional)
- `RAY_REDIS_SERVER_NAME` Server name to request (SNI) (optional)
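
A sketch of how these variables might be set on the head container (the certificate paths are placeholders and assume the files are mounted into the Pod, for example from a Kubernetes Secret):

```yaml
env:
- name: RAY_REDIS_ADDRESS
  value: "rediss://redis.example.com:6379" # note the rediss:// prefix for TLS
- name: RAY_REDIS_CA_CERT
  value: /etc/redis-tls/ca.crt
- name: RAY_REDIS_CLIENT_CERT
  value: /etc/redis-tls/tls.crt
- name: RAY_REDIS_CLIENT_KEY
  value: /etc/redis-tls/tls.key
```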


#### KubeRay Operator Controller

The KubeRay operator controller watches for new `Event` reconcile calls. If an `Event` object indicates a failed readiness probe,
the controller checks whether the affected pod has the annotation `ray.io/ft-enabled` set to `true`. If it does, the pod
belongs to a Ray cluster that has Ray GCS FT enabled.

The controller then tries to recover the failed pod. If it cannot, it adds an annotation named
`ray.io/health-state` with the value `Unhealthy` to the pod.

In every reconcile loop, the KubeRay operator controller looks for pods in the Ray cluster whose `ray.io/health-state` annotation
is set to `Unhealthy`. Any such pod is deleted and recreated.

#### External Storage Namespace

External storage namespaces can be used to share a single storage backend among multiple Ray clusters. By default, `ray.io/external-storage-namespace`
uses the RayCluster UID as its value when GCS FT is enabled. To use a customized external storage namespace instead,
add the `ray.io/external-storage-namespace` annotation to the RayCluster YAML file.

Whenever the `ray.io/external-storage-namespace` annotation is set, the head/worker nodes get the `RAY_external_storage_namespace` environment
variable, which Ray picks up later.

#### Known issues and limitations

1. For now, a Ray head/worker node that fails the readiness probe recovers by being restarted. More fine-grained control and recovery mechanisms are expected in the future.

### Test Ray GCS FT

Currently, two tests are responsible for ensuring Ray GCS FT is working correctly.

1. Detached actor test
2. RayServe test

In the detached actor test, a detached actor is created first. Then the head node is killed, and KubeRay brings up a
replacement head pod. The detached actor is still expected to be available. (Note: the client that created
the detached actor no longer exists and will retry if the Ray cluster returns a failure.)

In the RayServe test, a simple RayServe app is deployed on the Ray cluster. If the GCS server crashes, the RayServe app
should continue to be accessible after the head node recovers.
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/kuberay-gcs-ft.html#kuberay-gcs-ft).
137 changes: 1 addition & 136 deletions docs/guidance/ingress.md
@@ -1,136 +1 @@
## Ingress Usage

Here we provide some examples to show how to use ingress to access your Ray cluster.

* [Example: AWS Application Load Balancer (ALB) Ingress support on AWS EKS](#example-aws-application-load-balancer-alb-ingress-support-on-aws-eks)
* [Example: Manually setting up NGINX Ingress on KinD](#example-manually-setting-up-nginx-ingress-on-kind)


> :warning: **Only expose Ingresses to authorized users.** The Ray Dashboard provides read and write access to the Ray Cluster. Anyone with access to this Ingress can execute arbitrary code on the Ray Cluster.
### Example: AWS Application Load Balancer (ALB) Ingress support on AWS EKS
#### Prerequisite
* Follow the document [Getting started with Amazon EKS – AWS Management Console and AWS CLI](https://docs.aws.amazon.com/eks/latest/userguide/getting-started-console.html#eks-configure-kubectl) to create an EKS cluster.

* Follow the [installation instructions](https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/deploy/installation/) to set up the [AWS Load Balancer controller](https://github.com/kubernetes-sigs/aws-load-balancer-controller). Note that the repository maintains a webpage for each release. Please make sure you use the latest installation instructions.

* (Optional) Try [echo server example](https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/main/docs/examples/echo_server.md) in the [aws-load-balancer-controller](https://github.com/kubernetes-sigs/aws-load-balancer-controller) repository.

* (Optional) Read [how-it-works.md](https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/main/docs/how-it-works.md) to understand the mechanism of [aws-load-balancer-controller](https://github.com/kubernetes-sigs/aws-load-balancer-controller).

#### Instructions
```sh
# Step 1: Install KubeRay operator and CRD
pushd helm-chart/kuberay-operator/
helm install kuberay-operator .
popd

# Step 2: Install a RayCluster
pushd helm-chart/ray-cluster
helm install ray-cluster .
popd

# Step 3: Edit the `ray-operator/config/samples/ray-cluster-alb-ingress.yaml`
#
# (1) Annotation `alb.ingress.kubernetes.io/subnets`
# 1. Please include at least two subnets.
# 2. One Availability Zone (ex: us-west-2a) can only have at most 1 subnet.
# 3. In this example, you need to select public subnets (subnets that "Auto-assign public IPv4 address" is Yes on AWS dashboard)
#
# (2) Set the name of head pod service to `spec...backend.service.name`
eksctl get cluster ${YOUR_EKS_CLUSTER} # Check subnets on the EKS cluster

# Step 4: Apply the YAML file edited in Step 3.
kubectl apply -f ray-operator/config/samples/ray-cluster-alb-ingress.yaml

# Step 5: Check the ingress created by Step 4.
kubectl describe ingress ray-cluster-ingress

# [Example]
# Name: ray-cluster-ingress
# Labels: <none>
# Namespace: default
# Address: k8s-default-rayclust-....${REGION_CODE}.elb.amazonaws.com
# Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
# Rules:
# Host Path Backends
# ---- ---- --------
# *
# / ray-cluster-kuberay-head-svc:8265 (192.168.185.157:8265)
# Annotations: alb.ingress.kubernetes.io/scheme: internet-facing
# alb.ingress.kubernetes.io/subnets: ${SUBNET_1},${SUBNET_2}
# alb.ingress.kubernetes.io/tags: Environment=dev,Team=test
# alb.ingress.kubernetes.io/target-type: ip
# Events:
# Type Reason Age From Message
# ---- ------ ---- ---- -------
# Normal SuccessfullyReconciled 39m ingress Successfully reconciled

# Step 6: Check ALB on AWS (EC2 -> Load Balancing -> Load Balancers)
# The name of the ALB should be like "k8s-default-rayclust-......".

# Step 7: Check Ray Dashboard by ALB DNS Name. The name of the DNS Name should be like
# "k8s-default-rayclust-.....us-west-2.elb.amazonaws.com"

# Step 8: Delete the ingress, and AWS Load Balancer controller will remove ALB.
# Check ALB on AWS to make sure it is removed.
kubectl delete ingress ray-cluster-ingress
```

### Example: Manually setting up NGINX Ingress on KinD
```sh
# Step 1: Create a KinD cluster with `extraPortMappings` and `node-labels`
# Reference for the setting up of kind cluster: https://kind.sigs.k8s.io/docs/user/ingress/
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
kubeadmConfigPatches:
- |
kind: InitConfiguration
nodeRegistration:
kubeletExtraArgs:
node-labels: "ingress-ready=true"
extraPortMappings:
- containerPort: 80
hostPort: 80
protocol: TCP
- containerPort: 443
hostPort: 443
protocol: TCP
EOF

# Step 2: Install NGINX ingress controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml
sleep 10 # Wait for the Kubernetes API Server to create the related resources
kubectl wait --namespace ingress-nginx \
--for=condition=ready pod \
--selector=app.kubernetes.io/component=controller \
--timeout=90s

# Step 3: Install KubeRay operator
pushd helm-chart/kuberay-operator
helm install kuberay-operator .
popd

# Step 4: Install RayCluster and create an ingress separately.
# If you want to change ingress settings, you can edit the ingress portion in
# `ray-operator/config/samples/ray-cluster.separate-ingress.yaml`.
# More information about the settings is documented in https://github.com/ray-project/kuberay/pull/699
# and `ray-operator/config/samples/ray-cluster.separate-ingress.yaml`
kubectl apply -f ray-operator/config/samples/ray-cluster.separate-ingress.yaml

# Step 5: Check the ingress created in Step 4.
kubectl describe ingress raycluster-ingress-head-ingress

# [Example]
# ...
# Rules:
# Host Path Backends
# ---- ---- --------
# *
# /raycluster-ingress/(.*) raycluster-ingress-head-svc:8265 (10.244.0.11:8265)
# Annotations: nginx.ingress.kubernetes.io/rewrite-target: /$1

# Step 6: Check `<ip>/raycluster-ingress/` on your browser. You will see the Ray Dashboard.
# [Note] The forward slash at the end of the address is necessary. `<ip>/raycluster-ingress`
# will report "404 Not Found".
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/ingress.html#kuberay-ingress).
110 changes: 1 addition & 109 deletions docs/guidance/kubeflow-integration.md
@@ -1,109 +1 @@
> Credit: This manifest draws heavily on the engineering blog ["Building a Machine Learning Platform with Kubeflow and Ray on Google Kubernetes Engine"](https://cloud.google.com/blog/products/ai-machine-learning/build-a-ml-platform-with-kubeflow-and-ray-on-gke) from Google Cloud.
# Kubeflow: an interactive development solution

The [Kubeflow](https://www.kubeflow.org/) project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

# Requirements
* Dependencies
* `kustomize`: v3.2.0 (Kubeflow manifest is sensitive to `kustomize` version.)
* `Kubernetes`: v1.23

* Computing resources:
* 16GB RAM
* 8 CPUs

# Example: Use Kubeflow to provide an interactive development environment
![image](../images/architecture.svg)

## Step 1: Create a Kubernetes cluster with Kind.
```sh
# Kubeflow is sensitive to Kubernetes version and Kustomize version.
kind create cluster --image=kindest/node:v1.23.0
kustomize version --short
# 3.2.0
```

## Step 2: Install Kubeflow v1.6-branch
* This example installs Kubeflow with the [v1.6-branch](https://github.com/kubeflow/manifests/tree/v1.6-branch).

* Install all Kubeflow official components and all common services using [one command](https://github.com/kubeflow/manifests/tree/v1.6-branch#install-with-a-single-command).
* If you do not want to install all components, you can comment out **KNative**, **Katib**, **Tensorboards Controller**, **Tensorboard Web App**, **Training Operator**, and **KServe** from [example/kustomization.yaml](https://github.com/kubeflow/manifests/blob/v1.6-branch/example/kustomization.yaml).

## Step 3: Install KubeRay operator
* Follow this [document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository.

## Step 4: Install RayCluster
```sh
# Create a RayCluster CR, and the KubeRay operator will reconcile a Ray cluster
# with 1 head Pod and 1 worker Pod.
helm install raycluster kuberay/ray-cluster --version 0.6.0 --set image.tag=2.2.0-py38-cpu

# Check RayCluster
kubectl get pod -l ray.io/cluster=raycluster-kuberay
# NAME READY STATUS RESTARTS AGE
# raycluster-kuberay-head-bz77b 1/1 Running 0 64s
# raycluster-kuberay-worker-workergroup-8gr5q 1/1 Running 0 63s
```

* This step uses `rayproject/ray:2.2.0-py38-cpu` as its image. Ray is very sensitive to the Python versions and Ray versions between the server (RayCluster) and client (JupyterLab) sides. This image uses:
* Python 3.8.13
* Ray 2.2.0

## Step 5: Forward the port of Istio's Ingress-Gateway
* Follow the [instructions](https://github.com/kubeflow/manifests/tree/v1.6-branch#port-forward) to forward the port of Istio's Ingress-Gateway and log in to Kubeflow Central Dashboard.

## Step 6: Create a JupyterLab via Kubeflow Central Dashboard
* Click the "Notebooks" icon in the left panel.
* Click "New Notebook".
* Select `kubeflownotebookswg/jupyter-scipy:v1.6.1` as the OCI image.
* Click "Launch".
* Click "CONNECT" to connect to the JupyterLab instance.

## Step 7: Use Ray client in the JupyterLab to connect to the RayCluster
> Warning: Ray client has some known [limitations](https://docs.ray.io/en/latest/cluster/running-applications/job-submission/ray-client.html#things-to-know) and is not actively maintained.
* As mentioned in Step 4, Ray is very sensitive to the Python versions and Ray versions between the server (RayCluster) and client (JupyterLab) sides. Open a terminal in the JupyterLab:
```sh
# Check Python version. The version's MAJOR and MINOR should match with RayCluster (i.e. Python 3.8)
python --version
# Python 3.8.10

# Install Ray 2.2.0
pip install -U ray[default]==2.2.0
```
* Connect to RayCluster via Ray client.
```python
# Open a new .ipynb page.
import ray
# ray://${RAYCLUSTER_HEAD_SVC}.${NAMESPACE}.svc.cluster.local:${RAY_CLIENT_PORT}
ray.init(address="ray://raycluster-kuberay-head-svc.default.svc.cluster.local:10001")
print(ray.cluster_resources())
# {'node:10.244.0.41': 1.0, 'memory': 3000000000.0, 'node:10.244.0.40': 1.0, 'object_store_memory': 805386239.0, 'CPU': 2.0}
# Try Ray task
@ray.remote
def f(x):
return x * x
futures = [f.remote(i) for i in range(4)]
print(ray.get(futures)) # [0, 1, 4, 9]
# Try Ray actor
@ray.remote
class Counter(object):
def __init__(self):
self.n = 0
def increment(self):
self.n += 1
def read(self):
return self.n
counters = [Counter.remote() for i in range(4)]
[c.increment.remote() for c in counters]
futures = [c.read.remote() for c in counters]
print(ray.get(futures)) # [1, 1, 1, 1]
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/kubeflow.html).
43 changes: 1 addition & 42 deletions docs/guidance/mobilenet-rayservice.md
@@ -1,42 +1 @@
# Serve a MobileNet image classifier using RayService

> **Note:** The Python files for the Ray Serve application and its client are in the repository [ray-project/serve_config_examples](https://github.com/ray-project/serve_config_examples).
## Step 1: Create a Kubernetes cluster with Kind.

```sh
kind create cluster --image=kindest/node:v1.23.0
```

## Step 2: Install KubeRay operator

Follow [this document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository.
Please note that the YAML file in this example uses `serveConfigV2`, which is supported starting from KubeRay v0.6.0.

## Step 3: Install a RayService

```sh
# path: ray-operator/config/samples/
kubectl apply -f ray-service.mobilenet.yaml
```

* The [mobilenet.py](https://github.com/ray-project/serve_config_examples/blob/master/mobilenet/mobilenet.py) file requires `tensorflow` as a dependency. Hence, the YAML file uses `rayproject/ray-ml:2.5.0` instead of `rayproject/ray:2.5.0`.
* `python-multipart` is required for the request parsing function `starlette.requests.form()`, so the YAML file includes `python-multipart` in the runtime environment.

## Step 4: Forward the port of Serve

```sh
kubectl port-forward svc/rayservice-mobilenet-serve-svc 8000
```

Note that the Serve service will be created after the Serve applications are ready and running. This process may take approximately 1 minute after all Pods in the RayCluster are running.

## Step 5: Send a request to the ImageClassifier

* Step 5.1: Prepare an image file.
* Step 5.2: Update `image_path` in [mobilenet_req.py](https://github.com/ray-project/serve_config_examples/blob/master/mobilenet/mobilenet_req.py)
* Step 5.3: Send a request to the `ImageClassifier`.
```sh
python mobilenet_req.py
# sample output: {"prediction":["n02099601","golden_retriever",0.17944198846817017]}
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/examples/mobilenet-rayservice.html#kuberay-mobilenet-rayservice-example).
151 changes: 1 addition & 150 deletions docs/guidance/pod-command.md
@@ -1,150 +1 @@
# Specify container commands for Ray head/worker Pods
You can execute commands on the head/worker pods at two timings:

* (1) **Before `ray start`**: As an example, you can set up some environment variables that will be used by `ray start`.

* (2) **After `ray start` (RayCluster is ready)**: As an example, you can launch a Ray serve deployment when the RayCluster is ready.

## Current KubeRay operator behavior for container commands
* The current behavior for container commands is not finalized, and **may be updated in the future**.
* See [code](https://github.com/ray-project/kuberay/blob/47148921c7d14813aea26a7974abda7cf22bbc52/ray-operator/controllers/ray/common/pod.go#L301-L326) for more details.

## Timing 1: Before `ray start`
Currently, for timing (1), we can set the container's `Command` and `Args` in the RayCluster specification to achieve this.

```yaml
# ray-operator/config/samples/ray-cluster.head-command.yaml
rayStartParams:
...
#pod template
template:
spec:
containers:
- name: ray-head
image: rayproject/ray:2.6.3
resources:
...
ports:
...
# `command` and `args` will become a part of `spec.containers.0.args` in the head Pod.
command: ["echo 123"]
args: ["456"]
```
* Ray head Pod
* `spec.containers.0.command` is hardcoded with `["/bin/bash", "-lc", "--"]`.
* `spec.containers.0.args` contains two parts:
* (Part 1) **user-specified command**: A string that concatenates `headGroupSpec.template.spec.containers.0.command` and `headGroupSpec.template.spec.containers.0.args` from the RayCluster.
* (Part 2) **ray start command**: The command is created based on `rayStartParams` specified in RayCluster. The command will look like `ulimit -n 65536; ray start ...`.
* To summarize, `spec.containers.0.args` will be `$(user-specified command) && $(ray start command)`.

* Example
```sh
# Prerequisite: There is a KubeRay operator in the Kubernetes cluster.
# Path: kuberay/
kubectl apply -f ray-operator/config/samples/ray-cluster.head-command.yaml
# Check ${RAYCLUSTER_HEAD_POD}
kubectl get pod -l ray.io/node-type=head
# Check `spec.containers.0.command` and `spec.containers.0.args`.
kubectl describe pod ${RAYCLUSTER_HEAD_POD}

# Command:
# /bin/bash
# -lc
# --
# Args:
# echo 123 456 && ulimit -n 65536; ray start --head --dashboard-host=0.0.0.0 --num-cpus=1 --block --metrics-export-port=8080 --memory=2147483648
```


## Timing 2: After `ray start` (RayCluster is ready)
We have two solutions to execute commands after the RayCluster is ready. The main difference between them is that with Solution 1, users can check the logs via `kubectl logs`.

### Solution 1: Container command (Recommended)
As mentioned in the section "Timing 1: Before `ray start`", the user-specified command is executed before the `ray start` command. Hence, we can execute `ray_cluster_resources.sh` in the background by updating `headGroupSpec.template.spec.containers.0.command` in `ray-cluster.head-command.yaml`.

```yaml
# ray-operator/config/samples/ray-cluster.head-command.yaml
# The parentheses around the command are required.
command: ["(/home/ray/samples/ray_cluster_resources.sh&)"]
# ray_cluster_resources.sh
apiVersion: v1
kind: ConfigMap
metadata:
name: ray-example
data:
ray_cluster_resources.sh: |
#!/bin/bash
# wait for ray cluster to finish initialization
while true; do
ray health-check 2>/dev/null
if [ "$?" = "0" ]; then
break
else
echo "INFO: waiting for ray head to start"
sleep 1
fi
done
# Print the resources in the ray cluster after the cluster is ready.
python -c "import ray; ray.init(); print(ray.cluster_resources())"
echo "INFO: Print Ray cluster resources"
```

* Example
```sh
# Path: kuberay/
# (1) Update `command` to ["(/home/ray/samples/ray_cluster_resources.sh&)"]
# (2) Comment out `postStart` and `args`.
kubectl apply -f ray-operator/config/samples/ray-cluster.head-command.yaml

# Check ${RAYCLUSTER_HEAD_POD}
kubectl get pod -l ray.io/node-type=head

# Check the logs
kubectl logs ${RAYCLUSTER_HEAD_POD}

# INFO: waiting for ray head to start
# .
# . => Cluster initialization
# .
# 2023-02-16 18:44:43,724 INFO worker.py:1231 -- Using address 127.0.0.1:6379 set in the environment variable RAY_ADDRESS
# 2023-02-16 18:44:43,724 INFO worker.py:1352 -- Connecting to existing Ray cluster at address: 10.244.0.26:6379...
# 2023-02-16 18:44:43,735 INFO worker.py:1535 -- Connected to Ray cluster. View the dashboard at http://10.244.0.26:8265
# {'object_store_memory': 539679129.0, 'node:10.244.0.26': 1.0, 'CPU': 1.0, 'memory': 2147483648.0}
# INFO: Print Ray cluster resources
```

### Solution 2: postStart hook
```yaml
# ray-operator/config/samples/ray-cluster.head-command.yaml
lifecycle:
postStart:
exec:
command: ["/bin/sh","-c","/home/ray/samples/ray_cluster_resources.sh"]
```

* We execute the script `ray_cluster_resources.sh` via the postStart hook. Based on [this document](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks), there is no guarantee that the hook will execute before the container ENTRYPOINT. Hence, `ray_cluster_resources.sh` needs to wait for the RayCluster to finish initialization.

* Example
```sh
# Path: kuberay/
kubectl apply -f ray-operator/config/samples/ray-cluster.head-command.yaml
# Check ${RAYCLUSTER_HEAD_POD}
kubectl get pod -l ray.io/node-type=head
# Forward the port of Dashboard
kubectl port-forward --address 0.0.0.0 ${RAYCLUSTER_HEAD_POD} 8265:8265
# Open the browser and check the Dashboard (${YOUR_IP}:8265/#/job).
# You should see a SUCCEEDED job with the following Entrypoint:
#
# `python -c "import ray; ray.init(); print(ray.cluster_resources())"`
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/pod-command.html#kuberay-pod-command).
122 changes: 1 addition & 121 deletions docs/guidance/pod-security.md
@@ -1,121 +1 @@
# Pod Security

Kubernetes defines three Pod Security Standards, `privileged`, `baseline`, and `restricted`, to broadly
cover the security spectrum. The `privileged` standard allows known privilege escalations, and thus it is not
safe enough for security-critical applications.

This document describes how to configure the RayCluster YAML file to apply the `restricted` Pod Security Standard. The following
references can help you understand this document better:

* [Kubernetes - Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted)
* [Kubernetes - Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/)
* [Kubernetes - Auditing](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/)
* [KinD - Auditing](https://kind.sigs.k8s.io/docs/user/auditing/)

# Step 1: Create a KinD cluster
```bash
# Path: kuberay/
kind create cluster --config ray-operator/config/security/kind-config.yaml --image=kindest/node:v1.24.0
```
The `kind-config.yaml` enables audit logging with the audit policy defined in `audit-policy.yaml`. The `audit-policy.yaml`
defines an auditing policy to listen to the Pod events in the namespace `pod-security`. With this policy, we can check
whether our Pods violate the policies in the `restricted` standard.

The [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) feature was first
introduced in Kubernetes v1.22 (alpha) and became stable in Kubernetes v1.25. In addition, KubeRay currently supports
Kubernetes from v1.19 to v1.24. (At the time of writing, KubeRay has not been tested with Kubernetes v1.25.) Hence, this step uses **Kubernetes v1.24**.

# Step 2: Check the audit logs
```bash
docker exec kind-control-plane cat /var/log/kubernetes/kube-apiserver-audit.log
```
The log should be empty because the namespace `pod-security` does not exist.

# Step 3: Create the `pod-security` namespace
```bash
kubectl create ns pod-security
kubectl label --overwrite ns pod-security \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/warn-version=latest \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/audit-version=latest \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=latest
```
With the `pod-security.kubernetes.io` labels, the built-in Kubernetes Pod security admission controller will apply the
`restricted` Pod security standard to all Pods in the namespace `pod-security`. The label
`pod-security.kubernetes.io/enforce=restricted` means that a Pod will be rejected if it violates the policies defined in
the `restricted` security standard. See [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) for more details about the labels.

# Step 4: Install the KubeRay operator
```bash
# Update the field securityContext in helm-chart/kuberay-operator/values.yaml
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault

# Path: kuberay/helm-chart/kuberay-operator
helm install -n pod-security kuberay-operator .
```

# Step 5: Create a RayCluster (Choose either Step 5.1 or Step 5.2)
* If you choose Step 5.1, no Pod will be created in the namespace `pod-security`.
* If you choose Step 5.2, Pods can be created successfully.

## Step 5.1: Create a RayCluster without proper `securityContext` configurations
```bash
# Path: kuberay/ray-operator/config/samples
kubectl apply -n pod-security -f ray-cluster.complete.yaml

# Wait 20 seconds and check audit logs for the error messages.
docker exec kind-control-plane cat /var/log/kubernetes/kube-apiserver-audit.log

# Example error messages
# "pods \"raycluster-complete-head-fkbf5\" is forbidden: violates PodSecurity \"restricted:latest\": allowPrivilegeEscalation != false (container \"ray-head\" must set securityContext.allowPrivilegeEscalation=false) ...

kubectl get pod -n pod-security
# NAME READY STATUS RESTARTS AGE
# kuberay-operator-8b6d55dbb-t8msf 1/1 Running 0 62s

# Clean up the RayCluster
kubectl delete rayclusters.ray.io -n pod-security raycluster-complete
# raycluster.ray.io "raycluster-complete" deleted
```
No Pod is created in the namespace `pod-security`; check the audit logs for the error messages.

## Step 5.2: Create a RayCluster with proper `securityContext` configurations
```bash
# Path: kuberay/ray-operator/config/security
kubectl apply -n pod-security -f ray-cluster.pod-security.yaml

# Wait for the RayCluster convergence and check audit logs for the messages.
docker exec kind-control-plane cat /var/log/kubernetes/kube-apiserver-audit.log

# Forward the dashboard port
kubectl port-forward --address 0.0.0.0 svc/raycluster-pod-security-head-svc -n pod-security 8265:8265

# Log in to the head Pod
kubectl exec -it -n pod-security ${YOUR_HEAD_POD} -- bash

# (Head Pod) Run a sample job in the Pod
python3 samples/xgboost_example.py

# Check the job status in the dashboard on your browser.
# http://127.0.0.1:8265/#/job => The job status should be "SUCCEEDED".

# (Head Pod) Make sure Python dependencies can be installed under `restricted` security standard
pip3 install jsonpatch
echo $? # Check the exit code of `pip3 install jsonpatch`. It should be 0.

# Clean up the RayCluster
kubectl delete -n pod-security -f ray-cluster.pod-security.yaml
# raycluster.ray.io "raycluster-pod-security" deleted
# configmap "xgboost-example" deleted
```
One head Pod and one worker Pod will be created as specified in `ray-cluster.pod-security.yaml`.
First, we log in to the head Pod, run an XGBoost example script, and check the job
status in the dashboard. Next, we use `pip` to install a Python dependency (i.e. `jsonpatch`), and the exit code of the `pip` command should be 0.
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/pod-security.html).
65 changes: 1 addition & 64 deletions docs/guidance/profiling.md
@@ -1,64 +1 @@
# Profiling with KubeRay

## Stack trace and CPU profiling
[py-spy](https://github.com/benfred/py-spy/tree/master) is a sampling profiler for Python programs. It lets you visualize what your Python program is spending time on without restarting the program or modifying the code in any way. This section describes how to configure RayCluster YAML file to enable py-spy and see Stack Trace and CPU Flame Graph via Ray Dashboard.

### **Prerequisite**
py-spy requires the `SYS_PTRACE` capability to read process memory. However, Kubernetes omits this capability by default. To enable profiling, add the following to the `template.spec.containers` for both the head and worker Pods.

```bash
securityContext:
capabilities:
add:
- SYS_PTRACE
```
**Notes:**
- Adding `SYS_PTRACE` is forbidden under `baseline` and `restricted` Pod Security Standards. See [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) for more details.

### **Steps to deploy and test the RayCluster with `SYS_PTRACE` capability**

1. **Create a KinD cluster**:
```bash
kind create cluster
```

2. **Install the KubeRay operator**:

Follow the steps in [Installation Guide](https://github.com/ray-project/kuberay/blob/master/helm-chart/kuberay-operator/README.md#install-crds-and-kuberay-operator).

3. **Create a RayCluster with `SYS_PTRACE` capability**:
```bash
# Path: kuberay/ray-operator/config/samples
kubectl apply -f ray-cluster.py-spy.yaml
```

4. **Forward the dashboard port**:
```bash
kubectl port-forward --address 0.0.0.0 svc/raycluster-py-spy-head-svc 8265:8265
```

5. **Run a sample job within the head Pod**:
```bash
# Log in to the head Pod
kubectl exec -it ${YOUR_HEAD_POD} -- bash
# (Head Pod) Run a sample job in the Pod
# `long_running_task` includes a `while True` loop to ensure the task remains actively running indefinitely.
# This allows you ample time to view the Stack Trace and CPU Flame Graph via Ray Dashboard.
python3 samples/long_running_task.py
```
**Notes:**
- If you're running your own examples and encounter the error `Failed to write flamegraph: I/O error: No stack counts found` when viewing CPU Flame Graph, it might be due to the process being idle. Notably, using the `sleep` function can lead to this state. In such situations, py-spy filters out the idle stack traces. Refer to this [issue](https://github.com/benfred/py-spy/issues/321#issuecomment-731848950) for more information.
6. **Profile using Ray Dashboard**:
- Visit http://localhost:8265/#/cluster.
- Click `Stack Trace` for `ray::long_running_task`.
![StackTrace](../images/stack_trace.png)
- Click `CPU Flame Graph` for `ray::long_running_task`.
![FlameGraph](../images/cpu_flame_graph.png)
- For additional details on using the profiler, refer to the [Ray Observability Guide](https://docs.ray.io/en/latest/ray-observability/user-guides/debug-apps/optimize-performance.html#python-cpu-profiling-in-the-dashboard).
7. **Clean up the RayCluster**:
```bash
kubectl delete -f ray-cluster.py-spy.yaml
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/pyspy.html#kuberay-pyspy-integration).
353 changes: 1 addition & 352 deletions docs/guidance/prometheus-grafana.md

Large diffs are not rendered by default.

151 changes: 1 addition & 150 deletions docs/guidance/rayjob.md
@@ -1,150 +1 @@
# Ray Job (alpha)

> Note: This is the alpha version of Ray Job support in KubeRay. There will be ongoing improvements for Ray Job in future releases.
## Prerequisites

* Ray 1.10 or higher
* KubeRay v0.3.0+. (v0.6.0+ is recommended)

## What is a RayJob?

A RayJob manages 2 things:

* Ray Cluster: Manages resources in a Kubernetes cluster.
* Job: Manages jobs in a Ray Cluster.

### What does the RayJob provide?

* **Kubernetes-native support for Ray clusters and Ray Jobs.** You can use a Kubernetes config to define a Ray cluster and job, and use `kubectl` to create them. The cluster can be deleted automatically once the job is finished.

## Deploy KubeRay

Make sure your KubeRay operator version is at least v0.3.0.
The latest released KubeRay version is recommended.
For installation instructions, please follow [the documentation](../deploy/installation.md).

## Run an example Job

There is one example config file to deploy a RayJob included here:
[ray_v1alpha1_rayjob.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray_v1alpha1_rayjob.yaml)

```shell
# Create a RayJob.
$ kubectl apply -f config/samples/ray_v1alpha1_rayjob.yaml
```

```shell
# List running RayJobs.
$ kubectl get rayjob
NAME AGE
rayjob-sample 7s
```

```shell
# The RayJob sample will also create a RayCluster.
# The RayCluster creates a few resources, including pods and services. You can use the following commands to check them:
$ kubectl get rayclusters
$ kubectl get pod
```

## RayJob Configuration

* `entrypoint` - The shell command to run for this job.
* `rayClusterSpec` - The spec for the Ray cluster to run the job on.
* `jobId` - _(Optional)_ Job ID to specify for the job. If not provided, one will be generated.
* `metadata` - _(Optional)_ Arbitrary user-provided metadata for the job.
* `runtimeEnvYAML` - _(Optional)_ The runtime environment configuration provided as a multi-line YAML string. _(New in KubeRay version 1.0.)_
* `shutdownAfterJobFinishes` - _(Optional)_ whether to recycle the cluster after the job finishes. Defaults to false.
* `ttlSecondsAfterFinished` - _(Optional)_ TTL to clean up the cluster. This only works if `shutdownAfterJobFinishes` is set.
* `submitterPodTemplate` - _(Optional)_ Pod template spec for the pod that runs `ray job submit` against the Ray cluster.
* `runtimeEnv` - [DEPRECATED] _(Optional)_ base64-encoded string of the runtime env json string.
* `entrypointNumCpus` - _(Optional)_ Specifies the quantity of CPU cores to reserve for the entrypoint command.
* `entrypointNumGpus` - _(Optional)_ Specifies the number of GPUs to reserve for the entrypoint command.
* `entrypointResources` - _(Optional)_ A JSON-formatted dictionary specifying custom resources and their quantities.

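Below is a minimal sketch of how these fields fit together in a RayJob manifest. It is illustrative only; the field values are hypothetical, the head and worker group specs are elided, and [ray_v1alpha1_rayjob.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray_v1alpha1_rayjob.yaml) remains the complete, working sample.

```yaml
apiVersion: ray.io/v1alpha1
kind: RayJob
metadata:
  name: rayjob-sketch
spec:
  # Shell command to run for this job.
  entrypoint: python /home/ray/samples/sample_code.py
  # Delete the RayCluster 60 seconds after the job finishes.
  shutdownAfterJobFinishes: true
  ttlSecondsAfterFinished: 60
  # Runtime environment as a multi-line YAML string (KubeRay 1.0+).
  runtimeEnvYAML: |
    pip:
      - requests==2.26.0
  # Spec of the Ray cluster to run the job on.
  # headGroupSpec and workerGroupSpecs are elided for brevity.
  rayClusterSpec:
    rayVersion: "2.7.0"
```
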
## RayJob Observability

You can use `kubectl logs` to check the operator logs or the head/worker Pod logs.
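
For example, the following commands are a sketch of where to look; the operator name assumes the operator was installed through the Helm chart under the release name `kuberay-operator`:

```shell
# Operator logs.
kubectl logs deployment/kuberay-operator
# Logs from the head Pod of the RayCluster created for the RayJob.
kubectl logs -l ray.io/node-type=head
```
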
You can also use `kubectl describe rayjobs rayjob-sample` to check the states and event logs of your RayJob instance:

```text
Status:
Dashboard URL: rayjob-sample-raycluster-v6qcq-head-svc.default.svc.cluster.local:8265
End Time: 2023-07-11T17:39:56Z
Job Deployment Status: Complete
Job Id: rayjob-sample-66z5m
Job Status: SUCCEEDED
Message: Job finished successfully.
Observed Generation: 2
Ray Cluster Name: rayjob-sample-raycluster-v6qcq
Ray Cluster Status:
Available Worker Replicas: 1
Desired Worker Replicas: 1
Endpoints:
Client: 10001
Dashboard: 8265
Gcs - Server: 6379
Metrics: 8080
Serve: 8000
Head:
Pod IP: 10.244.0.6
Service IP: 10.96.31.68
Last Update Time: 2023-07-11T17:39:32Z
Max Worker Replicas: 5
Min Worker Replicas: 1
Observed Generation: 1
State: ready
Start Time: 2023-07-11T17:39:39Z
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 3m37s rayjob-controller Created cluster rayjob-sample-raycluster-v6qcq
Normal Created 2m11s rayjob-controller Created k8s job rayjob-sample
Normal Deleted 107s rayjob-controller Deleted cluster rayjob-sample-raycluster-v6qcq
```

If the job doesn't run successfully, the above `describe` command will provide information about that too:

```text
Status:
Dashboard URL: rayjob-sample-raycluster-2h7ds-head-svc.default.svc.cluster.local:8265
End Time: 2023-07-11T17:51:31Z
Job Deployment Status: Complete
Job Id: rayjob-sample-prbts
Job Status: FAILED
Message: Job failed due to an application error, last available logs (truncated to 20,000 chars):
python: can't open file '/home/ray/samples/sample_code.ppy': [Errno 2] No such file or directory
Observed Generation: 2
Ray Cluster Name: rayjob-sample-raycluster-2h7ds
Ray Cluster Status:
Available Worker Replicas: 1
Desired Worker Replicas: 1
Endpoints:
Client: 10001
Dashboard: 8265
Gcs - Server: 6379
Metrics: 8080
Serve: 8000
Head:
Pod IP: 10.244.0.7
Service IP: 10.96.24.232
Last Update Time: 2023-07-11T17:51:12Z
Max Worker Replicas: 5
Min Worker Replicas: 1
Observed Generation: 1
State: ready
Start Time: 2023-07-11T17:51:16Z
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 3m57s rayjob-controller Created cluster rayjob-sample-raycluster-2h7ds
Normal Created 2m31s rayjob-controller Created k8s job rayjob-sample
```

## Delete the RayJob instance

```shell
kubectl delete -f config/samples/ray_v1alpha1_rayjob.yaml
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/getting-started/rayjob-quick-start.html).
130 changes: 1 addition & 129 deletions docs/guidance/rayserve-dev-doc.md
@@ -1,129 +1 @@
# Developing Ray Serve Python scripts on a RayCluster

In this tutorial, you will learn how to effectively debug your Ray Serve scripts against a RayCluster, enabling enhanced observability and faster iteration speed compared to developing the script directly with a RayService.
Many RayService issues are related to the Ray Serve Python scripts, so it is important to ensure the correctness of the scripts before deploying them to a RayService.
This tutorial will show you how to develop a Ray Serve Python script for a MobileNet image classifier on a RayCluster.
You can deploy and serve the classifier on your local Kind cluster without requiring a GPU.
Please refer to [ray-service.mobilenet.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-service.mobilenet.yaml) and [mobilenet-rayservice.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/mobilenet-rayservice.md) for more details.


# Step 1: Install a KubeRay cluster

Follow this [document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository.

# Step 2: Create a RayCluster CR

```sh
helm install raycluster kuberay/ray-cluster --version 0.6.0-rc.0
```

# Step 3: Log in to the head Pod

```sh
export HEAD_POD=$(kubectl get pods --selector=ray.io/node-type=head -o custom-columns=POD:metadata.name --no-headers)
kubectl exec -it $HEAD_POD -- bash
```

# Step 4: Prepare your Ray Serve Python scripts and run the Ray Serve application

```sh
# Execute the following command in the head Pod
git clone https://github.com/ray-project/serve_config_examples.git
cd serve_config_examples

# Try to launch the Ray Serve application
serve run mobilenet.mobilenet:app
# [Error message]
# from tensorflow.keras.preprocessing import image
# ModuleNotFoundError: No module named 'tensorflow'
```

* `serve run mobilenet.mobilenet:app`: The first `mobilenet` is the name of a directory in `serve_config_examples/`,
the second `mobilenet` is the name of the Python file in that directory, and `app` is the variable representing the Ray Serve application within that file. See the section "import_path" in [rayservice-troubleshooting.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayservice-troubleshooting.md) for more details.
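
In other words, the import path maps onto the repository layout roughly as follows (a simplified sketch showing only the relevant files):

```text
serve_config_examples/
└── mobilenet/
    └── mobilenet.py   # defines the variable `app`, the Ray Serve application
```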

# Step 5: Change the Ray image from `rayproject/ray:${RAY_VERSION}` to `rayproject/ray-ml:${RAY_VERSION}`

```sh
# Uninstall RayCluster
helm uninstall raycluster

# Install the RayCluster CR with the Ray image `rayproject/ray-ml:${RAY_VERSION}`
helm install raycluster kuberay/ray-cluster --version 0.6.0-rc.0 --set image.repository=rayproject/ray-ml
```

The error message in Step 4 indicates that the Ray image `rayproject/ray:${RAY_VERSION}` does not have the TensorFlow package.
Due to the significant size of TensorFlow, we have opted to use an image with TensorFlow as the base instead of installing it within the Ray [runtime environment](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments).
In this Step, we will change the Ray image from `rayproject/ray:${RAY_VERSION}` to `rayproject/ray-ml:${RAY_VERSION}`.

# Step 6: Repeat Step 3 and Step 4

```sh
# Repeat Step 3 and Step 4 to log in to the new head Pod and run the Ray Serve application.
# You should successfully launch the Ray Serve application this time.
serve run mobilenet.mobilenet:app

# [Example output]
# (ServeReplica:default_ImageClassifier pid=139, ip=10.244.0.8) Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5
# 8192/14536120 [..............................] - ETA: 0s)
# 4202496/14536120 [=======>......................] - ETA: 0s)
# 12902400/14536120 [=========================>....] - ETA: 0s)
# 14536120/14536120 [==============================] - 0s 0us/step
# 2023-07-17 14:04:43,737 SUCC scripts.py:424 -- Deployed Serve app successfully.
```

# Step 7: Submit a request to the Ray Serve application

```sh
# (On your local machine) Forward the serve port of the head Pod
kubectl port-forward --address 0.0.0.0 $HEAD_POD 8000

# Clone the repository on your local machine
git clone https://github.com/ray-project/serve_config_examples.git
cd serve_config_examples/mobilenet

# Prepare a sample image file. `stable_diffusion_example.png` is a cat image generated by the Stable Diffusion model.
curl -O https://raw.githubusercontent.com/ray-project/kuberay/master/docs/images/stable_diffusion_example.png

# Update `image_path` in `mobilenet_req.py` to the path of `stable_diffusion_example.png`
# Send a request to the Ray Serve application.
python3 mobilenet_req.py

# [Error message]
# Unexpected error, traceback: ray::ServeReplica:default_ImageClassifier.handle_request() (pid=139, ip=10.244.0.8)
# File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/utils.py", line 254, in wrap_to_ray_error
# raise exception
# File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/replica.py", line 550, in invoke_single
# result = await method_to_call(*args, **kwargs)
# File "./mobilenet/mobilenet.py", line 24, in __call__
# File "/home/ray/anaconda3/lib/python3.7/site-packages/starlette/requests.py", line 256, in _get_form
# ), "The `python-multipart` library must be installed to use form parsing."
# AssertionError: The `python-multipart` library must be installed to use form parsing..
```

`python-multipart` is required by the request parsing function `starlette.requests.form()`, which is why the error above is reported when we send a request to the Ray Serve application.

# Step 8: Restart the Ray Serve application with a runtime environment

```sh
# In the head Pod, stop the Ray Serve application
serve shutdown

# Check the Ray Serve application status
serve status
# [Example output]
# There are no applications running on this cluster.

# Launch the Ray Serve application with runtime environment.
serve run mobilenet.mobilenet:app --runtime-env-json='{"pip": ["python-multipart==0.0.6"]}'

# (On your local machine) Submit a request to the Ray Serve application again, and you should get the correct prediction.
python3 mobilenet_req.py
# [Example output]
# {"prediction": ["n02123159", "tiger_cat", 0.2994779646396637]}
```

# Step 9: Create a RayService YAML file

In the previous steps, we found that the Ray Serve application can be successfully launched using the Ray image `rayproject/ray-ml:${RAY_VERSION}` and the [runtime environment](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments) `python-multipart==0.0.6`.
Therefore, we can create a RayService YAML file with the same Ray image and runtime environment.
For more details, please refer to [ray-service.mobilenet.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-service.mobilenet.yaml) and [mobilenet-rayservice.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/mobilenet-rayservice.md).
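
A sketch of the pieces that matter here is shown below. It is not a complete manifest (the head and worker group specs are elided, and some values are illustrative); [ray-service.mobilenet.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-service.mobilenet.yaml) is the full working example.

```yaml
apiVersion: ray.io/v1alpha1
kind: RayService
metadata:
  name: rayservice-mobilenet
spec:
  serveConfigV2: |
    applications:
      - name: mobilenet
        import_path: mobilenet.mobilenet:app
        runtime_env:
          pip: ["python-multipart==0.0.6"]
          # The full sample also sets a working_dir pointing at the
          # serve_config_examples repository so the module can be imported.
  rayClusterConfig:
    # Use the ray-ml image so that TensorFlow is available.
    # headGroupSpec and workerGroupSpecs are elided for brevity.
    rayVersion: "2.7.0"
```
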
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/rayserve-dev-doc.html#kuberay-dev-serve).
294 changes: 1 addition & 293 deletions docs/guidance/rayservice-troubleshooting.md

Large diffs are not rendered by default.

291 changes: 1 addition & 290 deletions docs/guidance/rayservice.md

Large diffs are not rendered by default.

65 changes: 1 addition & 64 deletions docs/guidance/stable-diffusion-rayservice.md
@@ -1,64 +1 @@
# Serve a StableDiffusion text-to-image model using RayService

> **Note:** The Python files for the Ray Serve application and its client are in the [ray-project/serve_config_examples](https://github.com/ray-project/serve_config_examples) repo
and [the Ray documentation](https://docs.ray.io/en/latest/serve/tutorials/stable-diffusion.html).

## Step 1: Create a Kubernetes cluster with GPUs

Follow [aws-eks-gpu-cluster.md](./aws-eks-gpu-cluster.md) or [gcp-gke-gpu-cluster.md](./gcp-gke-gpu-cluster.md) to create a Kubernetes cluster with 1 CPU node and 1 GPU node.

## Step 2: Install KubeRay operator

Follow [this document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository.
Please note that the YAML file in this example uses `serveConfigV2`, which is supported starting from KubeRay v0.6.0.

## Step 3: Install a RayService

```sh
# path: ray-operator/config/samples/
kubectl apply -f ray-service.stable-diffusion.yaml
```

This RayService configuration contains some important settings:

* The `tolerations` for workers allow them to be scheduled on nodes without any taints or on nodes with specific taints. However, workers will only be scheduled on GPU nodes because we set `nvidia.com/gpu: 1` in the Pod's resource configurations.
  ```yaml
  # Please add the following taints to the GPU node.
  tolerations:
    - key: "ray.io/node-type"
      operator: "Equal"
      value: "worker"
      effect: "NoSchedule"
  ```
* It includes `diffusers` in `runtime_env` since this package is not included by default in the `ray-ml` image.
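
The comment in the `tolerations` snippet above assumes that the GPU node carries a matching taint. A sketch of how that taint might be applied is shown below; the node name is a placeholder for your actual GPU node (see `kubectl get nodes`):

```sh
kubectl taint nodes <gpu-node-name> ray.io/node-type=worker:NoSchedule
```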

## Step 4: Forward the port of Serve

First get the service name from this command.

```sh
kubectl get services
```

Then, port-forward to the Serve port.

```sh
kubectl port-forward svc/stable-diffusion-serve-svc 8000
```

Note that the RayService's Kubernetes service will be created after the Serve applications are ready and running. This process may take approximately 1 minute after all Pods in the RayCluster are running.

## Step 5: Send a request to the text-to-image model

```sh
# Step 5.1: Download `stable_diffusion_req.py`
curl -LO https://raw.githubusercontent.com/ray-project/serve_config_examples/master/stable_diffusion/stable_diffusion_req.py

# Step 5.2: Set your `prompt` in `stable_diffusion_req.py`.

# Step 5.3: Send a request to the Stable Diffusion model.
python stable_diffusion_req.py
# Check output.png
```

![image](../images/stable_diffusion_example.png)
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/examples/stable-diffusion-rayservice.html#kuberay-stable-diffusion-rayservice-example).
70 changes: 1 addition & 69 deletions docs/guidance/text-summarizer-rayservice.md
@@ -1,69 +1 @@
# Serve a text summarizer using RayService

> **Note:** The Python files for the Ray Serve application and its client are in the [ray-project/serve_config_examples](https://github.com/ray-project/serve_config_examples) repo.
## Step 1: Create a Kubernetes cluster with GPUs

Follow [aws-eks-gpu-cluster.md](./aws-eks-gpu-cluster.md) or [gcp-gke-gpu-cluster.md](./gcp-gke-gpu-cluster.md) to create a Kubernetes cluster with 1 CPU node and 1 GPU node.

## Step 2: Install KubeRay operator

Follow [this document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository.
Please note that the YAML file in this example uses `serveConfigV2`, which is supported starting from KubeRay v0.6.0.

## Step 3: Install a RayService

```sh
# path: ray-operator/config/samples/
kubectl apply -f ray-service.text-sumarizer.yaml
```

This RayService configuration contains some important settings:

* The `tolerations` for workers allow them to be scheduled on nodes without any taints or on nodes with specific taints. However, workers will only be scheduled on GPU nodes because we set `nvidia.com/gpu: 1` in the Pod's resource configurations.
  ```yaml
  # Please add the following taints to the GPU node.
  tolerations:
    - key: "ray.io/node-type"
      operator: "Equal"
      value: "worker"
      effect: "NoSchedule"
  ```
## Step 4: Forward the port of Serve

First get the service name from this command.

```sh
kubectl get services
```

Then, port-forward to the Serve port.

```sh
kubectl port-forward svc/text-summarizer-serve-svc 8000
```

Note that the RayService's Kubernetes service will be created after the Serve applications are ready and running. This process may take approximately 1 minute after all Pods in the RayCluster are running.

## Step 5: Send a request to the text_summarizer model

```sh
# Step 5.1: Download `text_summarizer_req.py`
curl -LO https://raw.githubusercontent.com/ray-project/serve_config_examples/master/text_summarizer/text_summarizer_req.py

# Step 5.2: Send a request to the Summarizer model.
python text_summarizer_req.py
# Check printed to console
```

## Step 6: Delete your service

```sh
# path: ray-operator/config/samples/
kubectl delete -f ray-service.text-sumarizer.yaml
```

## Step 7: Uninstall the KubeRay operator

Follow [this document](../../helm-chart/kuberay-operator/README.md) to uninstall the KubeRay operator installed via the Helm repository.
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/examples/text-summarizer-rayservice.html).
178 changes: 1 addition & 177 deletions docs/guidance/tls.md
@@ -1,177 +1 @@
# TLS Authentication

Ray can be configured to use TLS on its gRPC channels. This means that
connecting to the Ray head will require an appropriate
set of credentials and also that data exchanged between various processes
(client, head, workers) will be encrypted ([Ray's document](https://docs.ray.io/en/latest/ray-core/configure.html?highlight=tls#tls-authentication)).

This document provides detailed instructions for generating a public-private
key pair and CA certificate for configuring KubeRay.

> Warning: Enabling TLS will cause a performance hit due to the extra
overhead of mutual authentication and encryption. Testing has shown that
this overhead is large for small workloads and becomes relatively smaller
for large workloads. The exact overhead will depend on the nature of your
workload.

# Prerequisites

To fully understand this document, it's highly recommended that you have a
solid understanding of the following concepts:

* private/public key
* CA (certificate authority)
* CSR (certificate signing request)
* self-signed certificate

This [YouTube video](https://youtu.be/T4Df5_cojAs) is a good start.

# TL;DR

> Please note that this document is designed to support KubeRay version 0.5.0 or later. If you are using an older version of KubeRay, some of the instructions or configurations may not apply or may require additional modifications.

> Warning: Please note that the `ray-cluster.tls.yaml` file is intended for demo purposes only. It is crucial that you **do not** store
your CA private key in a Kubernetes Secret in your production environment.

```sh
# Install v0.6.0 KubeRay operator
# `ray-cluster.tls.yaml` will cover from Step 1 to Step 3 (path: kuberay/)
kubectl apply -f ray-operator/config/samples/ray-cluster.tls.yaml

# Jump to Step 4 "Verify TLS authentication" to verify the connection.
```

`ray-cluster.tls.yaml` will create:

* A Kubernetes Secret containing the CA's private key (`ca.key`) and self-signed certificate (`ca.crt`) (**Step 1**)
* A Kubernetes ConfigMap containing the scripts `gencert_head.sh` and `gencert_worker.sh`, which allow Ray Pods to generate private keys
(`tls.key`) and self-signed certificates (`tls.crt`) (**Step 2**)
* A RayCluster with proper TLS environment variables configurations (**Step 3**)

The certificate (`tls.crt`) for a Ray Pod is signed using the CA's private key (`ca.key`). Additionally, all Ray Pods have the CA's certificate (`ca.crt`), which contains the CA's public key and allows them to verify the certificates presented by other Ray Pods.

# Step 1: Generate a private key and self-signed certificate for CA

In this document, a self-signed certificate is used, but users also have the
option to choose a publicly trusted certificate authority (CA) for their TLS
authentication.

```sh
# Step 1-1: Generate a self-signed certificate and a new private key file for CA.
openssl req -x509 \
-sha256 -days 3650 \
-nodes \
-newkey rsa:2048 \
-subj "/CN=*.kuberay.com/C=US/L=San Francisco" \
-keyout ca.key -out ca.crt

# Step 1-2: Check the CA's public key from the self-signed certificate.
openssl x509 -in ca.crt -noout -text

# Step 1-3
# Method 1: Use `cat $FILENAME | base64` to encode `ca.key` and `ca.crt`.
# Then, paste the encoding strings to the Kubernetes Secret in `ray-cluster.tls.yaml`.

# Method 2: Use kubectl to encode the certificate as a Kubernetes Secret automatically.
# (Note: You should comment out the Kubernetes Secret in `ray-cluster.tls.yaml`.)
kubectl create secret generic ca-tls --from-file=ca.key --from-file=ca.crt
```

* `ca.key`: CA's private key
* `ca.crt`: CA's self-signed certificate

This step is optional because the `ca.key` and `ca.crt` files have
already been included in the Kubernetes Secret specified in [ray-cluster.tls.yaml](../../ray-operator/config/samples/ray-cluster.tls.yaml).

# Step 2: Create separate private key and self-signed certificate for Ray Pods

In [ray-cluster.tls.yaml](../../ray-operator/config/samples/ray-cluster.tls.yaml), each Ray
Pod (both head and workers) generates its own private key file (`tls.key`) and self-signed
certificate file (`tls.crt`) in its init container. We generate separate files for each Pod
because worker Pods do not have deterministic DNS names, and we cannot use the same
certificate across different Pods.

In the YAML file, you'll find a ConfigMap named `tls` that contains two shell scripts:
`gencert_head.sh` and `gencert_worker.sh`. These scripts are used to generate the private key
and self-signed certificate files (`tls.key` and `tls.crt`) for the Ray head and worker Pods.
An alternative approach for users is to prebake the shell scripts directly into the docker image that's utilized
by the init containers, rather than relying on a ConfigMap.

Please find below a brief explanation of what happens in each of these scripts:
1. A 2048-bit RSA private key is generated and saved as `/etc/ray/tls/tls.key`.
2. A Certificate Signing Request (CSR) is generated using the private key file (`tls.key`)
and the `csr.conf` configuration file.
3. A self-signed certificate (`tls.crt`) is generated using the private key of the
Certificate Authority (`ca.key`) and the previously generated CSR.

The only difference between `gencert_head.sh` and `gencert_worker.sh` is the `[ alt_names ]`
section in `csr.conf` and `cert.conf`. The worker Pods use the fully qualified domain name
(FQDN) of the head Kubernetes Service to establish a connection with the head Pod.
Therefore, the `[alt_names]` section for the head Pod needs to include the FQDN of the head
Kubernetes Service. The head Pod, in turn, uses `$POD_IP` to communicate with worker Pods.

```sh
# gencert_head.sh
[alt_names]
DNS.1 = localhost
DNS.2 = $FQ_RAY_IP
IP.1 = 127.0.0.1
IP.2 = $POD_IP

# gencert_worker.sh
[alt_names]
DNS.1 = localhost
IP.1 = 127.0.0.1
IP.2 = $POD_IP
```
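
The three steps above correspond roughly to the following `openssl` invocations. This is a sketch, not the exact contents of the ConfigMap scripts; in particular, the output paths and the extension section name (`v3_ext`) are assumptions:

```sh
# 1. Generate a 2048-bit RSA private key.
openssl genrsa -out /etc/ray/tls/tls.key 2048

# 2. Create a certificate signing request (CSR) from the private key using csr.conf.
openssl req -new -key /etc/ray/tls/tls.key -out tls.csr -config csr.conf

# 3. Sign the CSR with the CA's private key to produce the Pod's certificate.
openssl x509 -req -in tls.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out /etc/ray/tls/tls.crt -days 365 -extensions v3_ext -extfile cert.conf
```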

In [Kubernetes networking model](https://github.com/kubernetes/design-proposals-archive/blob/main/network/networking.md#pod-to-pod), the IP that a Pod sees itself as is the same IP that others see it as. That's why Ray Pods can self-register for the certificates.

# Step 3: Configure environment variables for Ray TLS authentication

To enable TLS authentication in your Ray cluster, set the following environment variables:

- `RAY_USE_TLS`: Either 1 or 0 to use/not-use TLS. If this is set to 1 then all of the environment variables below must be set. Default: 0.
- `RAY_TLS_SERVER_CERT`: Location of a certificate file which is presented to other endpoints so as to achieve mutual authentication (i.e. `tls.crt`).
- `RAY_TLS_SERVER_KEY`: Location of a private key file which is the cryptographic means to prove to other endpoints that you are the authorized user of a given certificate (i.e. `tls.key`).
- `RAY_TLS_CA_CERT`: Location of a CA certificate file which allows TLS to decide whether an endpoint’s certificate has been signed by the correct authority (i.e. `ca.crt`).
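
As a sketch, the corresponding container settings in a RayCluster Pod template might look like this, assuming the key, certificate, and CA certificate are mounted under `/etc/ray/tls/` (the mount path is an assumption; the sample YAML may use different locations):

```yaml
env:
  - name: RAY_USE_TLS
    value: "1"
  - name: RAY_TLS_SERVER_CERT
    value: /etc/ray/tls/tls.crt
  - name: RAY_TLS_SERVER_KEY
    value: /etc/ray/tls/tls.key
  - name: RAY_TLS_CA_CERT
    value: /etc/ray/tls/ca.crt
```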

For more information on how to configure Ray with TLS authentication, please refer to [Ray's document](https://docs.ray.io/en/latest/ray-core/configure.html#tls-authentication).

# Step 4: Verify TLS authentication

```sh
# Log in to the worker Pod
kubectl exec -it ${WORKER_POD} -- bash

# Since the head Pod has the certificate of $FQ_RAY_IP, the connection to the worker Pods
# will be established successfully, and the exit code of the ray health-check command
# should be 0.
ray health-check --address $FQ_RAY_IP:6379
echo $? # 0

# Since the head Pod has the certificate of $RAY_IP, the connection will fail and an error
# message similar to the following will be displayed: "Peer name raycluster-tls-head-svc is
# not in peer certificate".
ray health-check --address $RAY_IP:6379

# If you add `DNS.3 = $RAY_IP` to the [alt_names] section in `gencert_head.sh`,
# the head Pod will generate the certificate of $RAY_IP.
#
# For KubeRay versions prior to 0.5.0, this step is necessary because Ray workers in earlier
# versions use $RAY_IP to connect with Ray head.
```

# Step 5: Connect to the cluster with the Ray client using TLS for interactive development

Please refer to [interactive development](https://docs.ray.io/en/latest/cluster/running-applications/job-submission/ray-client.html#ray-client-interactive-development) and [TLS authentication](https://docs.ray.io/en/latest/ray-core/configure.html?highlight=tls#tls-authentication) for more details.

To connect to the Ray cluster from a Pod:
```sh
# Create a client pod and connect to cluster
kubectl apply -f ray-operator/config/samples/ray-pod.tls.yaml
kubectl logs ray-client-tls
```
Verify that the output is similar to:
```
{'CPU': 2.0, 'node:10.254.20.20': 1.0, 'object_store_memory': 771128524.0, 'memory': 3000000000.0, 'node:10.254.16.25': 1.0}
```
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/user-guides/tls.html#kuberay-tls).
325 changes: 1 addition & 324 deletions docs/guidance/volcano-integration.md
@@ -1,324 +1 @@
# KubeRay integration with Volcano

[Volcano](https://github.com/volcano-sh/volcano) is a batch scheduling system built on Kubernetes. It provides a suite of mechanisms (gang scheduling, job queues, fair scheduling policies) currently missing from Kubernetes that are commonly required by many classes of batch and elastic workloads. KubeRay's Volcano integration enables more efficient scheduling of Ray pods in multi-tenant Kubernetes environments.

Note that this is a new feature. Feedback and contributions are welcome.

## Setup

### Step 1: Create a Kubernetes cluster with KinD
```shell
kind create cluster
```

### Step 2: Install Volcano

Volcano needs to be successfully installed in your Kubernetes cluster before enabling Volcano integration with KubeRay.
Refer to the [Quick Start Guide](https://github.com/volcano-sh/volcano#quick-start-guide) for Volcano installation instructions.

### Step 3: Install KubeRay Operator with Batch Scheduling

Deploy the KubeRay Operator with the `--enable-batch-scheduler` flag to enable Volcano batch scheduling support.

When installing KubeRay Operator via Helm, you should either set `batchScheduler.enabled` to `true` in your
[`values.yaml`](https://github.com/ray-project/kuberay/blob/753dc05dbed5f6fe61db3a43b34a1b350f26324c/helm-chart/kuberay-operator/values.yaml#L48)
file:
```yaml
# values.yaml file
batchScheduler:
  enabled: true
```

**or** pass the `--set batchScheduler.enabled=true` flag on the command line:
```shell
# Install Helm chart with --enable-batch-scheduler flag set to true
helm install kuberay-operator kuberay/kuberay-operator --version ${KUBERAY_VERSION} --set batchScheduler.enabled=true
```

Follow the [KubeRay installation documentation](https://github.com/ray-project/kuberay/blob/master/helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator.

### Step 4: Install a RayCluster with Volcano scheduler

The RayCluster custom resource must include the label `ray.io/scheduler-name: volcano` so that its Pods are submitted to Volcano for scheduling.

```shell
# Path: kuberay/ray-operator/config/samples
# Includes label `ray.io/scheduler-name: volcano` in the metadata.labels
kubectl apply -f ray-cluster.volcano-scheduler.yaml

# Check RayCluster
kubectl get pod -l ray.io/cluster=test-cluster-0
# NAME READY STATUS RESTARTS AGE
# test-cluster-0-head-jj9bg 1/1 Running 0 36s
```
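
For reference, the relevant part of the RayCluster manifest looks roughly like the following sketch; see [ray-cluster.volcano-scheduler.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-cluster.volcano-scheduler.yaml) for the full sample:

```yaml
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: test-cluster-0
  labels:
    ray.io/scheduler-name: volcano
# spec (head and worker group specs) elided for brevity
```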

In addition, the following labels can also be provided in the RayCluster metadata:

- `ray.io/priority-class-name`: the cluster priority class as defined by Kubernetes [here](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass).
  - This label will only work after the creation of a `PriorityClass` resource
  - ```yaml
    labels:
      ray.io/scheduler-name: volcano
      ray.io/priority-class-name: <replace with correct PriorityClass resource name>
    ```
- `volcano.sh/queue-name`: the Volcano [queue](https://volcano.sh/en/docs/queue/) name the cluster will be submitted to.
  - This label will only work after the creation of a `Queue` resource
  - ```yaml
    labels:
      ray.io/scheduler-name: volcano
      volcano.sh/queue-name: <replace with correct Queue resource name>
    ```

If autoscaling is enabled, `minReplicas` will be used for gang scheduling, otherwise the desired `replicas` will be used.

### Step 5: Use Volcano for batch scheduling

If you need some guidance, check out the available [examples](https://github.com/volcano-sh/volcano/tree/master/example).

## Example

Before going through the example, remove any running Ray clusters to ensure the example below runs successfully.
```shell
kubectl delete raycluster --all
```

### Gang scheduling

In this example, we'll walk through how gang scheduling works with Volcano and KubeRay.
First, let's create a queue with a capacity of 4 CPUs and 6Gi of RAM:

```shell
kubectl create -f - <<EOF
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: kuberay-test-queue
spec:
  weight: 1
  capability:
    cpu: 4
    memory: 6Gi
EOF
```

The **weight** in the definition above indicates the relative weight of a queue in cluster resource division. This is useful in cases where the total **capability** of all the queues in your cluster exceeds the total available resources, forcing the queues to share among themselves. Queues with higher weight will be allocated a proportionally larger share of the total resources; for example, two queues with weights 1 and 3 would be entitled to roughly 25% and 75% of the contended resources, respectively.

The **capability** is a hard constraint on the maximum resources the queue will support at any given time. It can be updated as needed to allow more or fewer workloads to run at a time.

Next we'll create a RayCluster with a head node (1 CPU + 2Gi of RAM) and two workers (1 CPU + 1Gi of RAM each), for a total of 3 CPU and 4Gi of RAM:
```shell
# Path: kuberay/ray-operator/config/samples
# Includes labels `ray.io/scheduler-name: volcano` and `volcano.sh/queue-name: kuberay-test-queue` in the metadata.labels
kubectl apply -f ray-cluster.volcano-scheduler-queue.yaml
```
Because our queue has a capacity of 4 CPU and 6Gi of RAM, this resource should schedule successfully without any issues. We can verify this by checking the status of our cluster's Volcano PodGroup to see that the phase is `Running` and the last status is `Scheduled`:

```shell
kubectl get podgroup ray-test-cluster-0-pg -o yaml
# apiVersion: scheduling.volcano.sh/v1beta1
# kind: PodGroup
# metadata:
# creationTimestamp: "2022-12-01T04:43:30Z"
# generation: 2
# name: ray-test-cluster-0-pg
# namespace: test
# ownerReferences:
# - apiVersion: ray.io/v1alpha1
# blockOwnerDeletion: true
# controller: true
# kind: RayCluster
# name: test-cluster-0
# uid: 7979b169-f0b0-42b7-8031-daef522d25cf
# resourceVersion: "4427347"
# uid: 78902d3d-b490-47eb-ba12-d6f8b721a579
# spec:
# minMember: 3
# minResources:
# cpu: "3"
# memory: 4Gi
# queue: kuberay-test-queue
# status:
# conditions:
# - lastTransitionTime: "2022-12-01T04:43:31Z"
# reason: tasks in gang are ready to be scheduled
# status: "True"
# transitionID: f89f3062-ebd7-486b-8763-18ccdba1d585
# type: Scheduled
# phase: Running
```

And checking the status of our queue to see that we have 1 running job:

```shell
kubectl get queue kuberay-test-queue -o yaml
# apiVersion: scheduling.volcano.sh/v1beta1
# kind: Queue
# metadata:
# creationTimestamp: "2022-12-01T04:43:21Z"
# generation: 1
# name: kuberay-test-queue
# resourceVersion: "4427348"
# uid: a6c4f9df-d58c-4da8-8a58-e01c93eca45a
# spec:
# capability:
# cpu: 4
# memory: 6Gi
# reclaimable: true
# weight: 1
# status:
# reservation: {}
# running: 1
# state: Open
```

Next, we'll add an additional RayCluster with the same configuration of head / worker nodes, but a different name:
```shell
# Path: kuberay/ray-operator/config/samples
# Includes labels `ray.io/scheduler-name: volcano` and `volcano.sh/queue-name: kuberay-test-queue` in the metadata.labels
# Replaces the name to test-cluster-1
sed 's/test-cluster-0/test-cluster-1/' ray-cluster.volcano-scheduler-queue.yaml | kubectl apply -f-
```
Now check the status of its PodGroup to see that its phase is `Pending` and the last status is `Unschedulable`:
```shell
kubectl get podgroup ray-test-cluster-1-pg -o yaml
# apiVersion: scheduling.volcano.sh/v1beta1
# kind: PodGroup
# metadata:
# creationTimestamp: "2022-12-01T04:48:18Z"
# generation: 2
# name: ray-test-cluster-1-pg
# namespace: test
# ownerReferences:
# - apiVersion: ray.io/v1alpha1
# blockOwnerDeletion: true
# controller: true
# kind: RayCluster
# name: test-cluster-1
# uid: b3cf83dc-ef3a-4bb1-9c42-7d2a39c53358
# resourceVersion: "4427976"
# uid: 9087dd08-8f48-4592-a62e-21e9345b0872
# spec:
# minMember: 3
# minResources:
# cpu: "3"
# memory: 4Gi
# queue: kuberay-test-queue
# status:
# conditions:
# - lastTransitionTime: "2022-12-01T04:48:19Z"
# message: '3/3 tasks in gang unschedulable: pod group is not ready, 3 Pending,
# 3 minAvailable; Pending: 3 Undetermined'
# reason: NotEnoughResources
# status: "True"
# transitionID: 3956b64f-fc52-4779-831e-d379648eecfc
# type: Unschedulable
# phase: Pending
```

Because our new cluster requires more CPU and RAM than our queue will allow, even though we could fit one of the pods with the remaining 1 CPU and 2Gi of RAM, none of the cluster's pods will be placed until there is enough room for all the pods. Without using Volcano for gang scheduling in this way, one of the pods would ordinarily be placed, leading to the cluster being partially allocated, and some jobs (like [Horovod](https://github.com/horovod/horovod) training) getting stuck waiting for resources to become available.

We can see the effect this has on scheduling the pods for our new RayCluster, which are listed as `Pending`:

```shell
kubectl get pods

# NAME READY STATUS RESTARTS AGE
# test-cluster-0-worker-worker-ddfbz 1/1 Running 0 7m
# test-cluster-0-head-vst5j 1/1 Running 0 7m
# test-cluster-0-worker-worker-57pc7 1/1 Running 0 6m59s
# test-cluster-1-worker-worker-6tzf7 0/1 Pending 0 2m12s
# test-cluster-1-head-6668q 0/1 Pending 0 2m12s
# test-cluster-1-worker-worker-n5g8k 0/1 Pending 0 2m12s
```

If we dig into the pod details, we'll see that this is indeed because Volcano cannot schedule the gang:

```shell
kubectl describe pod test-cluster-1-head-6668q | tail -n 3

# Type Reason Age From Message
# ---- ------ ---- ---- -------
# Warning FailedScheduling 4m5s volcano 3/3 tasks in gang unschedulable: pod group is not ready, 3 Pending, 3 minAvailable; Pending: 3 Undetermined
```

Let's go ahead and delete the first RayCluster to clear up space in the queue:

```shell
kubectl delete raycluster test-cluster-0
```

The PodGroup for the second cluster has moved to the `Running` state, as there are now enough resources available to schedule the entire set of pods:

```shell
kubectl get podgroup ray-test-cluster-1-pg -o yaml

# apiVersion: scheduling.volcano.sh/v1beta1
# kind: PodGroup
# metadata:
# creationTimestamp: "2022-12-01T04:48:18Z"
# generation: 9
# name: ray-test-cluster-1-pg
# namespace: test
# ownerReferences:
# - apiVersion: ray.io/v1alpha1
# blockOwnerDeletion: true
# controller: true
# kind: RayCluster
# name: test-cluster-1
# uid: b3cf83dc-ef3a-4bb1-9c42-7d2a39c53358
# resourceVersion: "4428864"
# uid: 9087dd08-8f48-4592-a62e-21e9345b0872
# spec:
# minMember: 3
# minResources:
# cpu: "3"
# memory: 4Gi
# queue: kuberay-test-queue
# status:
# conditions:
# - lastTransitionTime: "2022-12-01T04:54:04Z"
# message: '3/3 tasks in gang unschedulable: pod group is not ready, 3 Pending,
# 3 minAvailable; Pending: 3 Undetermined'
# reason: NotEnoughResources
# status: "True"
# transitionID: db90bbf0-6845-441b-8992-d0e85f78db77
# type: Unschedulable
# - lastTransitionTime: "2022-12-01T04:55:10Z"
# reason: tasks in gang are ready to be scheduled
# status: "True"
# transitionID: 72bbf1b3-d501-4528-a59d-479504f3eaf5
# type: Scheduled
# phase: Running
# running: 3
```

Checking the pods again, we see that the second cluster is now up and running:

```shell
kubectl get pods

# NAME READY STATUS RESTARTS AGE
# test-cluster-1-worker-worker-n5g8k 1/1 Running 0 9m4s
# test-cluster-1-head-6668q 1/1 Running 0 9m4s
# test-cluster-1-worker-worker-6tzf7 1/1 Running 0 9m4s
```

Finally, we'll clean up the remaining cluster and queue:

```shell
kubectl delete raycluster test-cluster-1
kubectl delete queue kuberay-test-queue
```

## Questions

Reach out to @tgaddair for questions regarding usage of this integration.
This document has been moved to the [Ray documentation](https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/volcano.html).
