revise rayservice doc (ray-project#607)
Copy-edits the guidance outlining RayService for clarity and consistency.

Signed-off-by: Rafael Vasquez <[email protected]>
rafvasq authored Oct 6, 2022
1 parent c8cf801 commit c074ef7
Showing 1 changed file with 69 additions and 72 deletions.
141 changes: 69 additions & 72 deletions docs/guidance/rayservice.md
@@ -8,22 +8,24 @@

### What is a RayService?

The RayService is a new custom resource (CR) supported by KubeRay in v0.3.0.
RayService is a new custom resource (CR) supported by KubeRay in v0.3.0.

A RayService manages 2 things:
* RayCluster: Manages resources in kubernetes cluster.
* Ray Serve Deployment Graph: Manages users' serve deployment graph.
* Ray Cluster: Manages resources in a Kubernetes cluster.
* Ray Serve Deployment Graph: Manages users' deployment graphs.

### What does the RayService provide?

* Kubernetes-native support for Ray cluster and Ray Serve deployment graphs. You can use a kubernetes config to define a ray cluster and its ray serve deployment graphs. Then you can use `kubectl` to create the cluster and its graphs.
* In-place update for ray serve deployment graph. Users can update the ray serve deployment graph config in the RayService CR config and use `kubectl apply` to update the serve deployment graph.
* Zero downtime upgrade for ray cluster. Users can update the ray cluster config in the RayService CR config and use `kubectl apply` to update the ray cluster. RayService will temporarily create a pending ray cluster, wait for the pending ray cluster ready, and then switch traffics to the new ray cluster, terminate the old cluster.
* Services HA. RayService will monitor the ray cluster and serve deployments health status. If RayService detects any unhealthy status lasting for a certain time, RayService will try to create a new ray cluster, and switch traffic to the new cluster when it is ready.
* **Kubernetes-native support for Ray clusters and Ray Serve deployment graphs.** After using a Kubernetes config to define a Ray cluster and its Ray Serve deployment graphs, you can use `kubectl` to create the cluster and its graphs.
* **In-place update for Ray Serve deployment graph.** Users can update the Ray Serve deployment graph config in the RayService CR config and use `kubectl apply` to update the deployment graph.
* **Zero downtime upgrade for Ray clusters.** Users can update the Ray cluster config in the RayService CR config and use `kubectl apply` to update the cluster. RayService will temporarily create a pending cluster and wait for it to be ready, then switch traffic to the new cluster and terminate the old one.
* **Services HA.** RayService will monitor the Ray cluster and Serve deployments' health statuses. If RayService detects an unhealthy status for a period of time, RayService will try to create a new Ray cluster and switch traffic to the new cluster when it is ready.

### Deploy the Operator

`$ kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s"`
```
$ kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s"
```

Check that the controller is running.

@@ -37,9 +39,9 @@ NAME READY STATUS RESTARTS AGE
ray-operator-75dbbf8587-5lrvn 1/1 Running 0 31s
```

### Run an example cluster
### Run an Example Cluster

There is one example config file to deploy RaySerive included here:
An example config file to deploy RayService is included here:
[ray_v1alpha1_rayservice.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray_v1alpha1_rayservice.yaml)

```shell
@@ -74,7 +76,7 @@ rayservice-sample-raycluster-qd2vl-head-svc ClusterIP 10.100.180.221
rayservice-sample-serve-svc ClusterIP 10.100.39.92 <none> 8000/TCP 24m
```

> Note: Default ports and their definition.
> Note: Default ports and their definitions.
| Port | Definition |
|-------|---------------------|
@@ -84,98 +86,93 @@ rayservice-sample-serve-svc ClusterIP 10.100.39.92
| 8000 | Ray Serve |
| 52365 | Ray Dashboard Agent |

Get the RayService information with your RayService name.
Get information about the RayService using its name.
```shell
$ kubectl describe rayservices rayservice-sample
$ kubectl describe rayservices rayservice-sample
```

### Access User Services

The users' traffic can go through the `serve` service (for example, `rayservice-sample-serve-svc`).
The users' traffic can go through the `serve` service (e.g. `rayservice-sample-serve-svc`).

#### Run a curl pod
`kubectl run curl --image=radial/busyboxplus:curl -i --tty`
Or if you already have a curl pod running, you can login with `kubectl exec -it curl sh`.
#### Run a Curl Pod

For the fruit example deployment, you can try the following request
```shell
[ root@curl:/ ]$ curl -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc.default.svc.cluster.local:8000 -d '["MANGO", 2]'
6
$ kubectl run curl --image=radial/busyboxplus:curl -i --tty
```
You can get the response as `6`.

Or if you already have a curl pod running, you can log in using `kubectl exec -it curl -- sh`.

For the fruit example deployment, you can try the following request:
```shell
[ root@curl:/ ]$ curl -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc.default.svc.cluster.local:8000 -d '["MANGO", 2]'
> 6
```
You should get the response `6`.
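
The hostname in the request above follows the usual Kubernetes service DNS pattern, `<service>.<namespace>.svc.cluster.local`. As a rough sketch, a small shell helper can assemble it. `serve_url` is a hypothetical name introduced here for illustration, and the `-serve-svc` suffix, the `default` namespace, and port `8000` are taken from the sample output rather than from any guaranteed naming rule.

```shell
# Hypothetical helper (not part of KubeRay): build the in-cluster address of
# the serve service, assuming the "<name>-serve-svc" naming seen in the sample.
serve_url() {
  # $1: RayService name, $2: namespace (default "default"), $3: port (default 8000)
  printf '%s-serve-svc.%s.svc.cluster.local:%s\n' "$1" "${2:-default}" "${3:-8000}"
}

# Usage from inside the curl pod:
#   curl -X POST -H 'Content-Type: application/json' "$(serve_url rayservice-sample)" -d '["MANGO", 2]'
```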

#### Use Port Forwarding
Set up kubernetes port forwarding.
Set up Kubernetes port forwarding.
```shell
$ kubectl port-forward service/rayservice-sample-serve-svc 8000
```
For the fruit example deployment, you can try the following request
For the fruit example deployment, you can try the following request:
```shell
curl -X POST -H 'Content-Type: application/json' localhost:8000 -d '["MANGO", 2]'
6
$ curl -X POST -H 'Content-Type: application/json' localhost:8000 -d '["MANGO", 2]'
> 6
```

`serve-svc` is HA in general.
* Note: serve-svc will do traffic routing among all the workers which have serve deployments.
* Note: serve-svc will always try it best to point to the healthy cluster, even during upgrading or failing cases.
* Note: You can set `serviceUnhealthySecondThreshold` to define the threshold of seconds that the serve deployments fail.
* Note: You can set `deploymentUnhealthySecondThreshold` to define the threshold of seconds that the Ray fails to deploy any serve deployments.
> Note:
> `serve-svc` is HA in general. It routes traffic among all the workers that have Serve deployments and always tries to point to the healthy cluster, even during upgrades or failures.
> You can set `serviceUnhealthySecondThreshold` to define how many seconds the Serve deployments may stay unhealthy before they are treated as failed. You can also set `deploymentUnhealthySecondThreshold` to define how many seconds Ray may fail to deploy any Serve deployments.
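
As an illustration only, the two thresholds could be changed on a live RayService with a merge patch (editing the config file and running `kubectl apply` works as well). This sketch assumes both fields sit directly under `spec` in the RayService CR. `set_unhealthy_thresholds` is a hypothetical helper name, and the values in the usage line are arbitrary.

```shell
# Hypothetical helper: patch both health-check thresholds on a RayService.
# Assumption: the fields live directly under .spec of the RayService CR.
set_unhealthy_thresholds() {
  # $1: RayService name, $2: service threshold (seconds), $3: deployment threshold (seconds)
  patch=$(printf '{"spec":{"serviceUnhealthySecondThreshold":%s,"deploymentUnhealthySecondThreshold":%s}}' "$2" "$3")
  # Custom resources do not support strategic merge patches, so use --type merge.
  kubectl patch rayservices "$1" --type merge -p "$patch"
}

# Usage:
#   set_unhealthy_thresholds rayservice-sample 300 300
```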
### Access Ray Dashboard
Set up kubernetes port forwarding for the dashboard.
Set up Kubernetes port forwarding for the dashboard.
```shell
$ kubectl port-forward service/rayservice-sample-head-svc 8265
```
Then you can open your web browser with the url localhost:8265 to see your Ray dashboard page.
Access the dashboard using a web browser at `localhost:8265`.

### Update Ray Serve Deployment Graph

You can update the `serveConfig` in your RayService config file.
For example, if you update the mango price to 4 in [ray_v1alpha1_rayservice.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray_v1alpha1_rayservice.yaml).
For example, update the price of mangos to `4` in [ray_v1alpha1_rayservice.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray_v1alpha1_rayservice.yaml):
```shell
- name: MangoStand
numReplicas: 1
userConfig: |
price: 4
- name: MangoStand
numReplicas: 1
userConfig: |
price: 4
```

Do a `kubectl apply` to update your RayService.

You can check the kubernetes stats of your RayService. It should show similar:
Use `kubectl apply` to update your RayService and `kubectl describe rayservices rayservice-sample` to take a look at the RayService's information. It should look similar to:
```shell
serveDeploymentStatuses:
- healthLastUpdateTime: "2022-07-18T21:51:37Z"
lastUpdateTime: "2022-07-18T21:51:41Z"
name: MangoStand
status: UPDATING
serveDeploymentStatuses:
- healthLastUpdateTime: "2022-07-18T21:51:37Z"
lastUpdateTime: "2022-07-18T21:51:41Z"
name: MangoStand
status: UPDATING
```
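
Rather than re-running `kubectl describe` by hand until the `UPDATING` status clears, the check can be scripted. The following is a minimal sketch, not an official tool: it assumes the statuses shown above are exposed at `.status.activeServiceStatus.serveDeploymentStatuses[*].status` and that a finished deployment reports `HEALTHY`. `wait_for_serve_healthy` and `POLL_SECONDS` are hypothetical names.

```shell
# Hypothetical helper: poll until every Serve deployment reports HEALTHY.
# Assumption: statuses live under .status.activeServiceStatus in the CR.
wait_for_serve_healthy() {
  # $1: RayService name, $2: maximum number of polls (default 30)
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    statuses=$(kubectl get rayservices "$1" \
      -o jsonpath='{.status.activeServiceStatus.serveDeploymentStatuses[*].status}')
    if [ -n "$statuses" ]; then
      all_healthy=1
      # The jsonpath output is a space-separated list, one word per deployment.
      for s in $statuses; do
        [ "$s" = "HEALTHY" ] || all_healthy=0
      done
      if [ "$all_healthy" -eq 1 ]; then
        return 0
      fi
    fi
    i=$((i + 1))
    sleep "${POLL_SECONDS:-5}"
  done
  # Gave up: at least one deployment never became HEALTHY.
  return 1
}

# Usage:
#   wait_for_serve_healthy rayservice-sample && echo "deployments are healthy"
```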

After it finishes deployment, let's send a request again.
After it finishes deployment, let's send a request again. In the curl pod from earlier, run:
```shell
# In the curl pod.
[ root@curl:/ ]$ curl -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc.default.svc.cluster.local:8000 -d '["MANGO", 2]'
8
[ root@curl:/ ]$ curl -X POST -H 'Content-Type: application/json' rayservice-sample-serve-svc.default.svc.cluster.local:8000 -d '["MANGO", 2]'
> 8
```
Or
Or if using port forwarding:
```shell
# Using port forwarding.
curl -X POST -H 'Content-Type: application/json' localhost:8000 -d '["MANGO", 2]'
8
curl -X POST -H 'Content-Type: application/json' localhost:8000 -d '["MANGO", 2]'
> 8
```
Now you will get `8` as a result.
You should now get `8` as a result.

### Upgrade RayService RayCluster Config
You can update the `rayClusterConfig` in your RayService config file.
For example, you can increase the worker node num to 2.
For example, you can increase the number of workers to 2:
```shell
workerGroupSpecs:
# the pod replicas in this group typed worker
- replicas: 2
```

Do a `kubectl apply` to update your RayService.

You can check the kubernetes stats of your RayService. It should show similar:
Use `kubectl apply` to update your RayService and `kubectl describe rayservices rayservice-sample` to take a look at the RayService's information. It should look similar to:
```shell
pendingServiceStatus:
appStatus: {}
@@ -185,20 +182,16 @@ You can check the kubernetes stats of your RayService. It should show similar:
rayClusterName: rayservice-sample-raycluster-bshfr
rayClusterStatus: {}
```
You can see RayService is preparing a pending cluster. After the pending cluster is healthy, RayService will switch it as active cluster and terminate the previous cluster.
You can see the RayService is preparing a pending cluster. Once the pending cluster is healthy, the RayService will make it the active cluster and terminate the previous one.
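
To follow the switch from the old cluster to the new one, it can help to print which cluster each status slot currently names. This sketch assumes the CR status has an `activeServiceStatus` entry alongside the `pendingServiceStatus` shown above, each with a `rayClusterName` field. `show_cluster_names` is a hypothetical helper.

```shell
# Hypothetical helper: print the Ray cluster named by each status slot.
# Assumption: .status.activeServiceStatus mirrors .status.pendingServiceStatus.
show_cluster_names() {
  # $1: RayService name
  for slot in activeServiceStatus pendingServiceStatus; do
    cluster=$(kubectl get rayservices "$1" -o jsonpath="{.status.$slot.rayClusterName}")
    # An empty result means no cluster is recorded in that slot.
    printf '%s: %s\n' "$slot" "${cluster:-<none>}"
  done
}

# Usage:
#   show_cluster_names rayservice-sample
```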

### RayService Observability
You can use `kubectl logs` to check the operator logs or the head/worker nodes logs.
You can also use `kubectl describe rayservices rayservice-sample` to check the states and event logs of your RayService instance.

For ray serve monitoring, you can refer to the [Ray observability documentation](https://docs.ray.io/en/master/ray-observability/state/state-api.html).
To run Ray state APIs, you can log in to the head pod and use the Ray CLI.
`kubectl exec -it <head-node-pod> bash`
Or you can run the command locally:
`kubectl exec -it <head-node-pod> -- <ray state api>`
For example:
`kubectl exec -it <head-node-pod> -- ray summary tasks`
Output
For Ray Serve monitoring, you can refer to the [Ray observability documentation](https://docs.ray.io/en/master/ray-observability/state/state-api.html).
To run Ray state APIs, log in to the head pod by running `kubectl exec -it <head-node-pod> -- bash` and use the Ray CLI. Alternatively, run commands from your local machine using `kubectl exec -it <head-node-pod> -- <ray state api>`.

For example, `kubectl exec -it <head-node-pod> -- ray summary tasks` outputs the following:
```shell
======== Tasks Summary: 2022-07-28 15:10:24.801670 ========
Stats:
@@ -221,9 +214,13 @@ Table (group by func_name):
7 ServeController.__init__ FINISHED: 1 ACTOR_CREATION_TASK
```
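
The `<head-node-pod>` placeholder has to be filled in with a real pod name. One possible way to look it up, sketched here under the assumption that KubeRay labels pods with `ray.io/node-type` and `ray.io/cluster`, is a label selector. `head_pod_name` is a hypothetical helper, and the cluster name would be the `rayClusterName` reported in the RayService status.

```shell
# Hypothetical helper: print the name of the head pod of a Ray cluster.
# Assumption: pods carry the labels ray.io/node-type and ray.io/cluster.
head_pod_name() {
  # $1: Ray cluster name
  kubectl get pods -l "ray.io/node-type=head,ray.io/cluster=$1" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Usage:
#   kubectl exec -it "$(head_pod_name <ray-cluster-name>)" -- ray summary tasks
```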

### Delete the RayService instance
`$ kubectl delete -f config/samples/ray_v1alpha1_rayservice.yaml`
### Delete the RayService Instance
```
$ kubectl delete -f config/samples/ray_v1alpha1_rayservice.yaml
```

### Delete the operator
### Delete the Operator

`$ kubectl delete -k "github.com/ray-project/kuberay/ray-operator/config/default"`
```
$ kubectl delete -k "github.com/ray-project/kuberay/ray-operator/config/default"
```
