Releases: Jeffwan/kuberay
v0.3.0-rc.0-test
RayOperator (beta)
Service
- [Serve] Unify logger and add user facing events (#378, @simon-mo)
- Improve RayService Operator logic to handle head node crash (#376, @brucez-anyscale)
- Add serving service for user traffic with health check (#367, @brucez-anyscale)
- Create a service for dashboard agent (#324, @brucez-anyscale)
- Update RayService CR to integrate with Ray Nightly (#322, @brucez-anyscale)
- RayService: zero downtime update and healthcheck HA recovery (#307, @brucez-anyscale)
- RayService: Dev RayService CR and Controller logic (#287, @brucez-anyscale)
- KubeRay: kubebuilder create RayService Controller and CR (#270, @brucez-anyscale)
Job
- Add RayJob CRD and controller logic (#303, @harryge00)
HA
- Initial support for external Redis and GCS HA (#294, @wilsonwang371)
Autoscaler
- Update autoscaler image (#371, @DmitriGekhtman)
- [minor] Update autoscaler image. (#313, @DmitriGekhtman)
- Provide override for autoscaler image pull policy. (#297, @DmitriGekhtman)
- [RFC][autoscaler] Add autoscaler container overrides and config options for scale behavior. (#278, @DmitriGekhtman)
- [autoscaler] Improve autoscaler auto-configuration, upstream recent improvements to Kuberay NodeProvider (#274, @DmitriGekhtman)
Others
- [Bug] Fix raycluster updatestatus list wrong label (#377, @scarlet25151)
- Make replicas optional for the head spec. (#362, @DmitriGekhtman)
- Add ray head service endpoints in status to expose raycluster's head node endpoints (#341, @scarlet25151)
- Support KubeRay management labels (#345, @Jeffwan)
- fix: bug in object store memory validation (#332, @davidxia)
- feat: add EventReason type for events (#334, @davidxia)
- minor refactor: fix camel-casing of unHealthy -> unhealthy (#333, @davidxia)
- refactor: remove redundant imports (#317, @davidxia)
- Fix GPU-autofill for rayStartParams (#328, @DmitriGekhtman)
- ray-operator: add missing space in controller log messages (#316, @davidxia)
- fix: use head group's ServiceAccount in autoscaler RoleBinding (#315, @davidxia)
- fix typos in comments and help messages (#304, @davidxia)
- enable force cluster upgrade (#231, @wilsonwang371)
- fix operator: correctly set head pod service account (#276, @davidxia)
- [hotfix] Fix Service account typo (#285, @DmitriGekhtman)
- Rename RayCluster folder to Ray since the group is Ray (#275, @brucez-anyscale)
- KubeRay: Relocate files to enable controller extension with Kubebuilder (#268, @brucez-anyscale)
- fix: use configured RayCluster service account when autoscaling (#259, @davidxia)
- suppress not found errors into regular logs (#222, @akanso)
- adding label check (#221, @akanso)
- Prioritize WorkersToDelete (#208, @sriram-anyscale)
- Simplify k8s client creation (#179, @chenk008)
- [ray-operator] Make log timestamp readable (#206, @chenk008)
- bump controller-runtime to 0.11.1 and Kubernetes to v1.23 (#180, @chenk008)
APIServer
- Enable DefaultHTTPErrorHandler and Upgrade grpc-gateway to v2 (#369, @Jeffwan)
- Validate namespace consistency in the request when creating the cluster and the compute template (#365, @daikeshi)
- Update compute template service url to include namespace path param (#363, @Jeffwan)
- fix missing metrics port on apiserver-created raycluster and check (#356, @scarlet25151)
- Support mounting volumes in API request (#346, @Jeffwan)
- add standard label for filtering clusters (#342, @scarlet25151)
- expose kubernetes events in apiserver (#343, @scarlet25151)
- Update ray-operator version in the apiserver (#340, @pingsutw)
- fix: typo worker_group_sepc -> worker_group_spec (#330, @davidxia)
- Fix gpu-accelerator in template (#296, @armandpicard)
- Add namespace scope to compute template operations (#244, @daikeshi)
- Add namespace scope to list operation (#237, @daikeshi)
- Add namespace scope for Ray cluster get and delete operations (#229, @daikeshi)
CLI
Deployment (kubernetes & helm)
- modify kuberay operator crds in kuberay operator chart and add apiserver chart (#354, @scarlet25151)
- Warn explicitly against using kubectl apply to create RayCluster CRD. (#302, @DmitriGekhtman)
- Sync crds to Helm chart (#280, @haoxins)
- [Feature] Run kuberay in a single namespace (#258, @wilsonwang371)
- fix duplicated port config and manager.yaml missing config (#250, @wilsonwang371)
- manifests: Add live/ready probes (#243, @haoxins)
- Helm: supports custom probe seconds (#239, @haoxins)
- Add CD for helm charts (#199, @ddelange)
Build and Testing
- fix flaky test issue (#370, @wilsonwang371)
- provide more detailed information in case of test failures (#352, @wilsonwang371)
- fix wrong kuberay image used by compatibility test (#327, @wilsonwang371)
- add cluster nodes info test (#299, @wilsonwang371)
- Fix the image name in deploy cmd (#293, @brucez-anyscale)
- [CI] enable ci test to check ctrl plane health state (#279, @wilsonwang371)
- [bugfix] update flaky test timeout (#254, @wilsonwang371)
- Update format by running gofumpt (#236, @wilsonwang371)
- Add unit tests for raycluster_controller reconcilePods function (#219, @Waynegates)
- Support ray 1.12 (#245, @wilsonwang371)
- add 1.11 to compatibility test and update comment (#217, @wilsonwang371)
- run compatibility in parallel using multiple workflows (#215, @wilsonwang371)
Monitoring
- Add state machine and expose port (#319, @scarlet25151)
- Install: Fix directory path for prometheus install.sh (#256, @Tomcli)
- Fix Ray Operator prometheus config (#253, @Tomcli)
- Emit prometheus metrics from kuberay control plane (#232, @Jeffwan)
- Enable metrics-export-port by default and configure prometheus monitoring (#230, @scarlet25151)