add doc
Signed-off-by: Yicheng-Lu-llll <[email protected]>
Yicheng-Lu-llll committed May 12, 2023
1 parent b0649c4 commit 4e87ca3
Showing 16 changed files with 159 additions and 27 deletions.
33 changes: 33 additions & 0 deletions docs/guidance/rayStartParams.md
@@ -0,0 +1,33 @@

## Default Ray Start Parameters for KubeRay

This document outlines the default settings for `rayStartParams` in KubeRay.

### Options Exclusive to the Head Pod

- `--redis-password`: Password for the external Redis instance. Required when [fault tolerance](https://github.com/ray-project/kuberay/blob/master/docs/guidance/gcs-ft.md) is enabled. No default value. See this [example](https://github.com/kevin85421/kuberay/blob/master/ray-operator/config/samples/ray-cluster.external-redis.yaml).

- `--port`: Port for the head Ray process (GCS server). Default is `6379`. This value must match the `gcs-server` container port.

- `--dashboard-host`: Host for the dashboard server, either `localhost` (127.0.0.1) or `0.0.0.0` (all interfaces). No default value.

- `--no-monitor`: Disables the monitor and autoscaler in the **user's container**. It is set automatically when [autoscaling](https://github.com/ray-project/kuberay/blob/master/docs/guidance/autoscaler.md) is enabled, because autoscaling runs the autoscaler as a sidecar container within the head pod. See [PR #13505](https://github.com/ray-project/ray/pull/13505) for more details. Modification is not recommended.
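
As a minimal sketch of the head-only options above, the fragment below sets `port` and `dashboard-host` explicitly and exposes a matching `gcs-server` container port. It is adapted from the `ray-cluster.separate-ingress.yaml` sample changed in this commit. The values are illustrative, and a complete RayCluster manifest needs more fields than shown here.

```yaml
headGroupSpec:
  rayStartParams:
    # Keys are written without the leading "--".
    port: '6379'               # should match the gcs-server containerPort below
    dashboard-host: '0.0.0.0'  # listen on all interfaces
  template:
    spec:
      containers:
        - name: ray-head
          image: rayproject/ray:2.4.0
          ports:
            - containerPort: 6379
              name: gcs-server
```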

### Options Exclusive to the Worker Pods

- `--address`: Address of the GCS server. Worker pods use this address to connect to the Ray cluster. Default is `<FQDN>:<GCS_PORT>`, where `GCS_PORT` is the value of the `--port` option. For details on the Fully Qualified Domain Name (FQDN), see [PR #938](https://github.com/ray-project/kuberay/pull/938) and [PR #951](https://github.com/ray-project/kuberay/pull/951).
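
A minimal sketch of a worker group that relies on the default `--address`: `rayStartParams` is left empty, as in the samples changed in this commit, so the address does not need to be set by hand. The `workerGroupSpecs` key, the `replicas` field, and the container name are not shown in this diff and are assumed here from the RayCluster spec. The group name and replica counts are illustrative.

```yaml
workerGroupSpecs:
  - groupName: small-group
    replicas: 1
    minReplicas: 1
    maxReplicas: 10
    # Empty on purpose: the default --address (<FQDN>:<GCS_PORT>) is used.
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-worker   # assumed container name
            image: rayproject/ray:2.4.0
```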

### Options Applicable to Both Head and Worker Pods

- `--metrics-export-port`: Port for exposing Ray metrics through a Prometheus endpoint. Default is `8080`.

- `--num-cpus`: Number of logical CPUs on the pod. Default is determined by Ray container resource limits. Modify Ray container resource limits instead of this option. See [PR #170](https://github.com/ray-project/kuberay/pull/170).

- `--memory`: Amount of memory on the pod. Default is determined by Ray container resource limits. Modify Ray container resource limits instead of this option. See [PR #170](https://github.com/ray-project/kuberay/pull/170).

- `--num-gpus`: Number of GPUs on the pod. Default is determined by Ray container resource limits. Modify Ray container resource limits instead of this option. See [PR #170](https://github.com/ray-project/kuberay/pull/170).

- `--block`: Makes the `ray start` command block indefinitely. It is set automatically for performance reasons. See [PR #675](https://github.com/ray-project/kuberay/pull/675) for more details. Modification is not recommended.
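
The guidance to change resource limits rather than `num-cpus`, `memory`, or `num-gpus` could look like the sketch below, which mirrors the `resources` block this commit adds to several samples. The specific CPU values are illustrative only.

```yaml
headGroupSpec:
  rayStartParams:
    dashboard-host: '0.0.0.0'
    # num-cpus is intentionally omitted; it is derived from the limits below.
  template:
    spec:
      containers:
        - name: ray-head
          image: rayproject/ray:2.4.0
          resources:
            limits:
              cpu: "1"
            requests:
              cpu: "200m"
```

Keeping the limits as the single source of truth avoids a mismatch between what Kubernetes allocates to the container and what Ray believes is available.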

10 changes: 6 additions & 4 deletions ray-operator/config/samples/ray-cluster.autoscaler.large.yaml
@@ -57,11 +57,11 @@ spec:
memory: "512Mi"
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
# Flag "no-monitor" will be automatically set when autoscaling is enabled.
dashboard-host: '0.0.0.0'
# num-cpus: '14' # can be auto-completed from the limits
# Use `resources` to optionally specify custom resource annotations for the Ray node.
# The value of `resources` is a string-integer mapping.
# Currently, `resources` must be provided in the specific format demonstrated below:
@@ -112,7 +112,9 @@ spec:
# - raycluster-complete-worker-large-group-bdtwh
# - raycluster-complete-worker-large-group-hv457
# - raycluster-complete-worker-large-group-k8tj7
# the following params are used to complete the ray start: ray start --block --node-ip-address= ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
9 changes: 6 additions & 3 deletions ray-operator/config/samples/ray-cluster.autoscaler.yaml
@@ -49,10 +49,11 @@ spec:
memory: "512Mi"
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# num-cpus: '1' # can be auto-completed from the limits
# Use `resources` to optionally specify custom resource annotations for the Ray node.
# The value of `resources` is a string-integer mapping.
# Currently, `resources` must be provided in the specific format demonstrated below:
@@ -112,7 +113,9 @@ spec:
# - raycluster-complete-worker-small-group-bdtwh
# - raycluster-complete-worker-small-group-hv457
# - raycluster-complete-worker-small-group-k8tj7
# the following params are used to complete the ray start: ray start --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
8 changes: 6 additions & 2 deletions ray-operator/config/samples/ray-cluster.complete.large.yaml
@@ -22,7 +22,9 @@ spec:
# for the head group, replicas should always be 1.
# headGroupSpec.replicas is deprecated in KubeRay >= 0.3.0.
replicas: 1
# the following params are used to complete the ray start: ray start --head --block --dashboard-host: '0.0.0.0' ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# pod template
@@ -80,7 +82,9 @@ spec:
# - raycluster-complete-worker-large-group-bdtwh
# - raycluster-complete-worker-large-group-hv457
# - raycluster-complete-worker-large-group-k8tj7
# the following params are used to complete the ray start: ray start --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
8 changes: 6 additions & 2 deletions ray-operator/config/samples/ray-cluster.complete.yaml
@@ -16,7 +16,9 @@ spec:
# Kubernetes Service Type. This is an optional field, and the default value is ClusterIP.
# Refer to https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types.
serviceType: ClusterIP
# the following params are used to complete the ray start: ray start --head --block --dashboard-host: '0.0.0.0' ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# pod template
@@ -80,7 +82,9 @@ spec:
# - raycluster-complete-worker-small-group-bdtwh
# - raycluster-complete-worker-small-group-hv457
# - raycluster-complete-worker-small-group-k8tj7
# the following params are used to complete the ray start: ray start --block
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
12 changes: 11 additions & 1 deletion ray-operator/config/samples/ray-cluster.external-redis.yaml
@@ -90,9 +90,11 @@ spec:
rayVersion: '2.4.0'
headGroupSpec:
replicas: 1
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: "0.0.0.0"
num-cpus: "1" # can be auto-completed from the limits
# redis-password should match "requirepass" in redis.conf in the ConfigMap above.
# Ray 2.3.0 changes the default redis password from "5241590000000000" to "".
redis-password: $REDIS_PASSWORD
@@ -102,6 +104,11 @@ spec:
containers:
- name: ray-head
image: rayproject/ray:2.4.0
resources:
limits:
cpu: "1"
requests:
cpu: "200m"
env:
# RAY_REDIS_ADDRESS can force ray to use external redis
- name: RAY_REDIS_ADDRESS
@@ -131,6 +138,9 @@ spec:
maxReplicas: 2
# logical group name, for this called small-group, also can be functional
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
5 changes: 3 additions & 2 deletions ray-operator/config/samples/ray-cluster.head-command.yaml
@@ -11,10 +11,11 @@ spec:
rayVersion: '2.4.0' # should match the Ray version in the image of the containers
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --redis-port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
18 changes: 14 additions & 4 deletions ray-operator/config/samples/ray-cluster.heterogeneous.yaml
@@ -38,16 +38,22 @@ spec:
######################headGroupSpecs#################################
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from Ray container resource limits
#pod template
template:
spec:
containers:
- name: ray-head
image: rayproject/ray:2.4.0
resources:
limits:
cpu: "1"
requests:
cpu: "200m"
volumeMounts:
- mountPath: /opt
name: config
@@ -72,7 +78,9 @@ spec:
maxReplicas: 10
# logical group name, for this called small-group, also can be functional
groupName: small-group
# the following params are used to complete the ray start: ray start --block ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
@@ -106,7 +114,9 @@ spec:
# workersToDelete:
#- raycluster-heterogeneous-worker-medium-group-7bv5h
# - worker-4k2ih
# the following params are used to complete the ray start: ray start --block --node-ip-address= ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
5 changes: 3 additions & 2 deletions ray-operator/config/samples/ray-cluster.mini.yaml
@@ -13,10 +13,11 @@ spec:
rayVersion: '2.4.0' # should match the Ray version in the image of the containers
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --redis-port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
10 changes: 8 additions & 2 deletions ray-operator/config/samples/ray-cluster.separate-ingress.yaml
@@ -11,16 +11,22 @@ spec:
headGroupSpec:
serviceType: NodePort
replicas: 1
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
port: '6379'
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
containers:
- name: ray-head
image: rayproject/ray:2.4.0
resources:
limits:
cpu: 1
requests:
cpu: "200m"
ports:
- containerPort: 6379
name: gcs-server
6 changes: 6 additions & 0 deletions ray-operator/config/samples/ray-cluster.tls.yaml
@@ -8,6 +8,9 @@ spec:
rayVersion: '2.4.0'
# Ray head pod configuration
headGroupSpec:
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
# pod template
@@ -96,6 +99,9 @@ spec:
minReplicas: 1
maxReplicas: 10
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
6 changes: 6 additions & 0 deletions ray-operator/config/samples/ray-service.autoscaler.yaml
@@ -56,6 +56,9 @@ spec:
memory: "1000Mi"
######################headGroupSpecs#################################
headGroupSpec:
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {"num-cpus": "0"}
#pod template
template:
@@ -86,6 +89,9 @@ spec:
maxReplicas: 5
# logical group name, for this called small-group, also can be functional
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template:
8 changes: 6 additions & 2 deletions ray-operator/config/samples/ray_v1alpha1_rayjob.yaml
@@ -18,10 +18,11 @@ spec:
rayVersion: '2.4.0' # should match the Ray version in the image of the containers
# Ray head pod template
headGroupSpec:
# the following params are used to complete the ray start: ray start --head --block --redis-port=6379 ...
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams:
dashboard-host: '0.0.0.0'
num-cpus: '1' # can be auto-completed from the limits
#pod template
template:
spec:
@@ -62,6 +63,9 @@ spec:
maxReplicas: 5
# logical group name, for this called small-group, also can be functional
groupName: small-group
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in kuberay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {}
#pod template
template: