Merge branch 'ray-project:master' into simplify_k8s_client_creation
chenk008 authored Mar 22, 2022
2 parents 5f39225 + ee72afc commit dd458a2
Showing 39 changed files with 2,059 additions and 621 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -43,3 +43,4 @@
# Dependency directories (remove the comment below to include it)
**/vendor/

.ipynb_checkpoints
74 changes: 74 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,80 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).

## [v0.2.0](https://github.com/ray-project/kuberay/tree/v0.2.0) (2022-03-13)

### Features

* Support envFrom in rayclusters deployed with Helm ([#183](https://github.com/ray-project/kuberay/pull/183), @ebr)
* Helm: support imagePullSecrets for ray clusters ([#182](https://github.com/ray-project/kuberay/pull/182), @ebr)
* Support scheduling constraints in Helm-deployed clusters ([#181](https://github.com/ray-project/kuberay/pull/181), @ebr)
* Helm: ensure RBAC rules are up to date with the latest autogenerated manifest ([#175](https://github.com/ray-project/kuberay/pull/175), @ebr)
* add resource command ([#170](https://github.com/ray-project/kuberay/pull/170), @zhuangzhuang131419)
* Use container to generate proto files ([#160](https://github.com/ray-project/kuberay/pull/160), @Jeffwan)
* Support in-tree autoscaler ([#163](https://github.com/ray-project/kuberay/pull/163), @Jeffwan)
* [CLI] check viper error ([#172](https://github.com/ray-project/kuberay/pull/172), @chenk008)
* [Feature]Add subcommand `--version` ([#166](https://github.com/ray-project/kuberay/pull/166), @chenk008)
* [Feature] Add flag `watch-namespace` ([#165](https://github.com/ray-project/kuberay/pull/165), @chenk008)
* Support enableIngress for RayCluster ([#38](https://github.com/ray-project/kuberay/pull/38), @Jeffwan)
* Add CRD verb permission in helm ([#144](https://github.com/ray-project/kuberay/pull/144), @chenk008)
* Add quick start deployment manifests ([#132](https://github.com/ray-project/kuberay/pull/132), @Jeffwan)
* Add CLI to kuberay ([#135](https://github.com/ray-project/kuberay/pull/135), @wolfsniper2388)
* Ray Operator: Upgrade to Go v1.17 ([#128](https://github.com/ray-project/kuberay/pull/128), @haoxins)
* Add deploy manifests for apiserver ([#119](https://github.com/ray-project/kuberay/pull/119), @Jeffwan)
* Implement resource manager and gRPC services ([#127](https://github.com/ray-project/kuberay/pull/127), @Jeffwan)
* Generate go clients and swagger files ([#126](https://github.com/ray-project/kuberay/pull/126), @Jeffwan)
* [service] Init backend service project ([#113](https://github.com/ray-project/kuberay/pull/113), @Jeffwan)
* Add gRPC service definition and gRPC gateway ([#112](https://github.com/ray-project/kuberay/pull/112), @Jeffwan)
* [proto] Add core api definitions as protobuf message ([#93](https://github.com/ray-project/kuberay/pull/93), @Jeffwan)
* Use ray start block in Pod's entrypoint ([#77](https://github.com/ray-project/kuberay/pull/77), @chenk008)
* Add generated clientsets, informers and listers ([#97](https://github.com/ray-project/kuberay/pull/97), @Jeffwan)
* Add codegen scripts and make required api changes ([#96](https://github.com/ray-project/kuberay/pull/96), @harryge00)
* Reorganize api folder for code generation ([#91](https://github.com/ray-project/kuberay/pull/91), @harryge00)

### Bug fixes

* Fix serviceaccount typo in operator role ([#188](https://github.com/ray-project/kuberay/pull/188), @Jeffwan)
* Fix cli typo ([#173](https://github.com/ray-project/kuberay/pull/173), @chenk008)
* [Bug]Leader election need lease permission ([#169](https://github.com/ray-project/kuberay/pull/169), @chenk008)
* refactor: rename kubray -> kuberay ([#145](https://github.com/ray-project/kuberay/pull/145), @tekumara)
* Fix the Helm chart's image name ([#130](https://github.com/ray-project/kuberay/pull/130), @haoxins)
* fix typo in the helm chart templates ([#129](https://github.com/ray-project/kuberay/pull/129), @haoxins)
* fix issue that modifies the list while iterating through it ([#125](https://github.com/ray-project/kuberay/pull/125), @wilsonwang371)
* Add helm ([#109](https://github.com/ray-project/kuberay/pull/109), @zhuangzhuang131419)
* Update samples yaml ([#102](https://github.com/ray-project/kuberay/pull/102), @ryantd)
* fix missing template objectmeta ([#95](https://github.com/ray-project/kuberay/pull/95), @chenk008)
* fix typo in Readme ([#81](https://github.com/ray-project/kuberay/pull/81), @denkensk)

### Testing

* kuberay compatibility test with ray ([#157](https://github.com/ray-project/kuberay/pull/157), @wilsonwang371)
* Setup ci for apiserver ([#162](https://github.com/ray-project/kuberay/pull/162), @Jeffwan)
* Enable gofmt and move goimports to linter job ([#158](https://github.com/ray-project/kuberay/pull/158), @Jeffwan)
* add more debug info for bug-150: goimport issue ([#151](https://github.com/ray-project/kuberay/pull/151), @wilsonwang371)
* add nightly docker build workflow ([#141](https://github.com/ray-project/kuberay/pull/141), @wilsonwang371)
* enable goimport and add new makefile target to only build image without test ([#123](https://github.com/ray-project/kuberay/pull/123), @wilsonwang371)
* [Feature]add docker build stage to ci workflow ([#122](https://github.com/ray-project/kuberay/pull/122), @wilsonwang371)
* Pass --timeout option to golangci-lint ([#116](https://github.com/ray-project/kuberay/pull/116), @Jeffwan)
* Add linter job for github workflow ([#79](https://github.com/ray-project/kuberay/pull/79), @feilengcui008)

### Docs and Miscs

* Add Makefile for cli project ([#192](https://github.com/ray-project/kuberay/pull/192), @Jeffwan)
* Manifests and docs improvement for prerelease ([#191](https://github.com/ray-project/kuberay/pull/191), @Jeffwan)
* Add documentation for autoscaling feature ([#189](https://github.com/ray-project/kuberay/pull/189), @Jeffwan)
* docs: Fix typo in best practice ([#190](https://github.com/ray-project/kuberay/pull/190), @nakamasato)
* add kuberay on kind jupyter notebook ([#147](https://github.com/ray-project/kuberay/pull/147), @wilsonwang371)
* Add KubeRay release guideline ([#161](https://github.com/ray-project/kuberay/pull/161), @Jeffwan)
* Add troubleshooting guide for ray version mismatch ([#154](https://github.com/ray-project/kuberay/pull/154), @scarlet25151)
* Explanation and Best Practice for workers-head Reconnection ([#142](https://github.com/ray-project/kuberay/pull/142), @nostalgicimp)
* [docs] Folder name change to kuberay-operator ([#143](https://github.com/ray-project/kuberay/pull/143), @asm582)
* Improve the Helm charts docs ([#131](https://github.com/ray-project/kuberay/pull/131), @haoxins)
* add auto-scale doc ([#108](https://github.com/ray-project/kuberay/pull/108), @akanso)
* Add core API and backend service design doc ([#98](https://github.com/ray-project/kuberay/pull/98), @Jeffwan)
* [Feature] add more options in bug template ([#121](https://github.com/ray-project/kuberay/pull/121), @wilsonwang371)
* Rename service module to apiserver ([#118](https://github.com/ray-project/kuberay/pull/118), @Jeffwan)


## [v0.1.0](https://github.com/ray-project/kuberay/tree/v0.1.0) (2021-10-16)

### Feature
20 changes: 15 additions & 5 deletions README.md
@@ -8,19 +8,29 @@ KubeRay is an open source toolkit to run Ray applications on Kubernetes.
KubeRay provides several tools to improve the experience of running and managing Ray on Kubernetes.

- Ray Operator
- Backend services to create/delete cluster resources (incubating)
- Kubectl plugin/CLI to operate CRD objects (future work)
- Backend services to create/delete cluster resources
- Kubectl plugin/CLI to operate CRD objects
- Data Scientist centric workspace for fast prototyping (incubating)
- Native Job and Serving integration with Clusters (incubating)
- Kubernetes event dumper for ray clusters/pod/services (future work)
- Operator Integration with Kubernetes node problem detector (future work)
- Kubernetes based workspace to easily submit ray jobs (future work)

## Quick Start

### Use YAML

#### Nightly version

```
kubectl apply -k "github.com/ray-project/kuberay/manifests/cluster-scope-resources"
kubectl apply -k "github.com/ray-project/kuberay/manifests/base"
```

#### Stable version

```
kubectl apply -k manifests/cluster-scope-resources
kubectl apply -k manifests/base
kubectl apply -k "github.com/ray-project/kuberay/manifests/cluster-scope-resources?ref=v0.2.0"
kubectl apply -k "github.com/ray-project/kuberay/manifests/base?ref=v0.2.0"
```

### Use Helm chart
153 changes: 153 additions & 0 deletions apiserver/README.md
@@ -0,0 +1,153 @@
# KubeRay APIServer

KubeRay APIServer provides gRPC and HTTP APIs to manage KubeRay resources.

## Usage

### Compute Template

#### Create compute templates
```
POST {{baseUrl}}/apis/v1alpha1/compute_templates
```

```
{
  "name": "default-template",
  "cpu": 2,
  "memory": 4,
  "gpu": 1,
  "gpuAccelerator": "Tesla-V100"
}
```
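
As a rough illustration of the call above, the following Python sketch builds the `POST` request with the standard library only. The function name `build_create_template_request` and the base URL `http://localhost:8888` are assumptions made for this example; the real address depends on how the API server is exposed in your cluster.

```python
import json
import urllib.request


def build_create_template_request(base_url, name, cpu, memory, gpu=0, gpu_accelerator=""):
    """Build (but do not send) the request that creates a compute template.

    Hypothetical helper: field names mirror the sample payload above.
    """
    payload = {
        "name": name,
        "cpu": cpu,
        "memory": memory,
        "gpu": gpu,
        "gpuAccelerator": gpu_accelerator,
    }
    return urllib.request.Request(
        url=f"{base_url}/apis/v1alpha1/compute_templates",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed base URL, for illustration only.
request = build_create_template_request(
    "http://localhost:8888", "default-template", cpu=2, memory=4,
    gpu=1, gpu_accelerator="Tesla-V100",
)
```

Passing `request` to `urllib.request.urlopen` would send it. Building the request separately keeps the payload easy to inspect before anything reaches the server.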

#### List all compute templates

```
GET {{baseUrl}}/apis/v1alpha1/compute_templates
```

```
{
  "compute_templates": [
    {
      "id": "",
      "name": "default-template",
      "cpu": 2,
      "memory": 4,
      "gpu": 1,
      "gpu_accelerator": "Tesla-V100"
    }
  ]
}
```

#### Get compute template by name

```
GET {{baseUrl}}/apis/v1alpha1/compute_templates/?name=<compute_template_name>
```

#### Delete compute template by name

```
DELETE {{baseUrl}}/apis/v1alpha1/compute_templates/?name=<compute_template_name>
```
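
The get and delete calls differ only in the HTTP method, since both pass the template name as a query parameter. A minimal sketch, assuming a hypothetical helper `compute_template_request` and an illustrative base URL, could URL-encode the name like this:

```python
import urllib.request
from urllib.parse import urlencode


def compute_template_request(base_url, name, method="GET"):
    """Build a GET or DELETE request for a compute template looked up by name.

    Hypothetical helper; the path and the `name` parameter follow the README above.
    """
    if method not in ("GET", "DELETE"):
        raise ValueError(f"unsupported method: {method}")
    # urlencode escapes characters that are not safe in a query string.
    query = urlencode({"name": name})
    return urllib.request.Request(
        url=f"{base_url}/apis/v1alpha1/compute_templates/?{query}",
        method=method,
    )


# Assumed base URL, for illustration only.
get_request = compute_template_request("http://localhost:8888", "default-template")
delete_request = compute_template_request(
    "http://localhost:8888", "default-template", method="DELETE"
)
```

The same pattern should carry over to the cluster endpoints, which use the same `?name=` convention.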

### Clusters

#### Create cluster

```
POST {{baseUrl}}/apis/v1alpha1/clusters
```

Payload:
```
{
  "name": "test-cluster",
  "namespace": "ray-system",
  "user": "jiaxin.shan",
  "version": "1.9.2",
  "environment": "DEV",
  "clusterSpec": {
    "headGroupSpec": {
      "computeTemplate": "head-template",
      "image": "ray.io/ray:1.9.2",
      "serviceType": "NodePort",
      "rayStartParams": {}
    },
    "workerGroupSepc": [
      {
        "groupName": "small-wg",
        "computeTemplate": "worker-template",
        "image": "ray.io/ray:1.9.2",
        "replicas": 2,
        "minReplicas": 0,
        "maxReplicas": 5,
        "rayStartParams": {}
      }
    ]
  }
}
```
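
The payload above can also be assembled programmatically. The sketch below is one possible way to do that. `build_cluster_payload` is a hypothetical helper, and the check that `replicas` lies between `minReplicas` and `maxReplicas` is an assumption about sensible input, not a rule stated in this README. The key `workerGroupSepc` is spelled exactly as in the sample payload.

```python
import json


def build_cluster_payload(name, namespace, user, version, image,
                          head_template, worker_groups, environment="DEV"):
    """Assemble a create-cluster payload shaped like the sample above."""
    for group in worker_groups:
        # Assumed sanity check: replicas should fall within [minReplicas, maxReplicas].
        if not group["minReplicas"] <= group["replicas"] <= group["maxReplicas"]:
            raise ValueError(f"replicas out of range for group {group['groupName']}")
    return {
        "name": name,
        "namespace": namespace,
        "user": user,
        "version": version,
        "environment": environment,
        "clusterSpec": {
            "headGroupSpec": {
                "computeTemplate": head_template,
                "image": image,
                "serviceType": "NodePort",
                "rayStartParams": {},
            },
            # Key spelling copied from the sample payload.
            "workerGroupSepc": [
                {**group, "image": image, "rayStartParams": {}}
                for group in worker_groups
            ],
        },
    }


payload = build_cluster_payload(
    name="test-cluster", namespace="ray-system", user="jiaxin.shan",
    version="1.9.2", image="ray.io/ray:1.9.2", head_template="head-template",
    worker_groups=[{
        "groupName": "small-wg", "computeTemplate": "worker-template",
        "replicas": 2, "minReplicas": 0, "maxReplicas": 5,
    }],
)
body = json.dumps(payload)  # JSON body for POST {{baseUrl}}/apis/v1alpha1/clusters
```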

#### List all clusters

```
GET {{baseUrl}}/apis/v1alpha1/clusters
```

```
{
  "clusters": [
    {
      "name": "test-cluster",
      "namespace": "ray-system",
      "user": "jiaxin.shan",
      "version": "1.9.2",
      "environment": "DEV",
      "cluster_spec": {
        "head_group_spec": {
          "compute_template": "head-template",
          "image": "rayproject/ray:1.9.2",
          "service_type": "NodePort",
          "ray_start_params": {
            "dashboard-host": "0.0.0.0",
            "node-ip-address": "$MY_POD_IP",
            "port": "6379"
          }
        },
        "worker_group_sepc": [
          {
            "group_name": "small-wg",
            "compute_template": "worker-template",
            "image": "rayproject/ray:1.9.2",
            "replicas": 2,
            "min_replicas": 0,
            "max_replicas": 5,
            "ray_start_params": {
              "node-ip-address": "$MY_POD_IP"
            }
          }
        ]
      },
      "created_at": "2022-03-13T15:13:09Z",
      "deleted_at": null
    }
  ]
}
```
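
Note that the list response uses snake_case keys (`cluster_spec`, `worker_group_sepc`), while the create payload uses camelCase. As a small sketch of consuming the response, the hypothetical helper `summarize_clusters` below reduces each cluster to a few fields. The embedded sample is a trimmed copy of the response above, not live output.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = json.loads("""
{
  "clusters": [
    {
      "name": "test-cluster",
      "namespace": "ray-system",
      "cluster_spec": {
        "head_group_spec": {"image": "rayproject/ray:1.9.2"},
        "worker_group_sepc": [
          {"group_name": "small-wg", "replicas": 2, "min_replicas": 0, "max_replicas": 5}
        ]
      }
    }
  ]
}
""")


def summarize_clusters(response):
    """Return one small dict per cluster: name, namespace, head image, total workers."""
    summaries = []
    for cluster in response.get("clusters", []):
        spec = cluster.get("cluster_spec", {})
        worker_groups = spec.get("worker_group_sepc", [])
        summaries.append({
            "name": cluster["name"],
            "namespace": cluster["namespace"],
            "head_image": spec.get("head_group_spec", {}).get("image"),
            # Sum the requested replicas across all worker groups.
            "worker_replicas": sum(g.get("replicas", 0) for g in worker_groups),
        })
    return summaries


summary = summarize_clusters(SAMPLE_RESPONSE)
```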

#### Get cluster by name

```
GET {{baseUrl}}/apis/v1alpha1/clusters/?name=<cluster_name>
```


#### Delete cluster by name

```
DELETE {{baseUrl}}/apis/v1alpha1/clusters/?name=<cluster_name>
```
14 changes: 8 additions & 6 deletions apiserver/deploy/base/apiserver.yaml
@@ -2,15 +2,17 @@ apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: kuberay-apiserver
+  labels:
+    control-plane: kuberay-operator
 spec:
   selector:
     matchLabels:
-      app: kuberay-apiserver
+      control-plane: kuberay-apiserver
   replicas: 1
   template:
     metadata:
       labels:
-        app: kuberay-apiserver
+        control-plane: kuberay-apiserver
     spec:
       serviceAccountName: kuberay-apiserver
       containers:
@@ -38,7 +40,7 @@ metadata:
 spec:
   type: NodePort
   selector:
-    app: kuberay-apiserver
+    control-plane: kuberay-apiserver
   ports:
     - name: http
       port: 8888
@@ -54,15 +56,15 @@ apiVersion: v1
 kind: ServiceAccount
 metadata:
   labels:
-    app: kuberay-apiserver
+    control-plane: kuberay-apiserver
   name: kuberay-apiserver

 ---
 apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRoleBinding
 metadata:
   labels:
-    app: kuberay-apiserver
+    control-plane: kuberay-apiserver
   name: kuberay-apiserver
 roleRef:
   apiGroup: rbac.authorization.k8s.io
@@ -78,7 +80,7 @@ apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRole
 metadata:
   labels:
-    app: kuberay-apiserver
+    control-plane: kuberay-apiserver
   name: kuberay-apiserver
 rules:
 - apiGroups: