Documentation and example for running simple NLP service on kuberay #1340

gvspraveen · 2023-08-17T00:07:54Z

Why are these changes needed?

This is needed for KubeRay CUJ (critical user journey) testing.

Related issue number

Checks

Manually tested

I've made sure the tests are passing.
Testing Strategy
- Unit tests
- Manual tests
- This PR is not tested :(

docs/guidance/aws-eks-gpu-cluster.md

docs/guidance/text-summarizer-rayservice.md

kevin85421 · 2023-08-17T00:41:26Z

docs/guidance/text-summarizer-rayservice.md

+
+Note that the RayService's Kubernetes service will be created after the Serve applications are ready and running. This process may take approximately 1 minute after all Pods in the RayCluster are running.
+
+## Step 5: Send a request to the text-to-image model


text-to-image -> text summarization (?)

docs/guidance/text-summarizer-rayservice.md

ray-operator/config/samples/ray-service.text-sumarizer.yaml

Co-authored-by: Kai-Hsun Chen <[email protected]> Signed-off-by: Praveen <[email protected]>

kevin85421 · 2023-08-17T19:49:47Z

docs/guidance/text-summarizer-rayservice.md

+
+This RayService configuration contains some important settings:
+
+* Its `tolerations` for workers match the taints on the GPU node group (which has taints), so they can be scheduled on either GPU or CPU node. We don't add these to head nodes to head node from being allocated to GPU node.


The tolerations for workers allow them to be scheduled on nodes without any taints or on nodes with specific taints. However, workers will only be scheduled on GPU nodes because we set `nvidia.com/gpu: 1` in the Pod's resource configurations.
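
To illustrate the point in this comment, here is a rough sketch of a worker group fragment. It is not copied from the YAML in this PR: the group name, container name, and the taint key/value (`ray.io/node-type=worker`) are assumptions chosen for illustration, and required fields such as the image are omitted.

```yaml
# Hypothetical fragment of a RayService worker group (illustrative only).
workerGroupSpecs:
  - groupName: gpu-group            # assumed name
    replicas: 1
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-worker        # assumed name; image and other fields omitted
            resources:
              limits:
                # The GPU limit is what restricts this Pod to GPU nodes:
                # only nodes advertising nvidia.com/gpu can satisfy it.
                nvidia.com/gpu: 1
        tolerations:
          # A toleration only permits scheduling onto nodes with a matching
          # taint; it does not exclude untainted nodes.
          - key: "ray.io/node-type" # assumed taint key on the GPU node group
            operator: "Equal"
            value: "worker"         # assumed taint value
            effect: "NoSchedule"
```

Under these assumptions, the toleration alone would let the worker land on either tainted or untainted nodes, and the `nvidia.com/gpu: 1` limit is what narrows placement to GPU nodes. The head Pod, lacking the toleration, stays off the tainted GPU node group.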

kevin85421 · 2023-08-17T19:50:15Z

docs/guidance/stable-diffusion-rayservice.md

@@ -21,7 +21,7 @@ kubectl apply -f ray-service.stable-diffusion.yaml

 This RayService configuration contains some important settings:

-* Its `tolerations` for workers match the taints on the GPU node group. Without the tolerations, worker Pods won't be scheduled on GPU nodes.
+* Its `tolerations` for workers match the taints on the GPU node group (which has taints), so they can be scheduled on either GPU or CPU node. We don't add these to `headGroupSpec` to make sure head Pod & KubeRay operator Pod are not allocated to GPU node group (which has taints).


The tolerations for workers allow them to be scheduled on nodes without any taints or on nodes with specific taints. However, workers will only be scheduled on GPU nodes because we set `nvidia.com/gpu: 1` in the Pod's resource configurations.

docs/guidance/stable-diffusion-rayservice.md

Signed-off-by: Kai-Hsun Chen <[email protected]>

kevin85421

LGTM

…ay-project#1340)

* add service yaml for nlp
* Documentation fixes
* Fix instructions
* Apply suggestions from code review

  Co-authored-by: Kai-Hsun Chen <[email protected]>
  Signed-off-by: Praveen <[email protected]>
* Fix tolerations comment
* review comments
* Update docs/guidance/stable-diffusion-rayservice.md

  Signed-off-by: Kai-Hsun Chen <[email protected]>

---------

Signed-off-by: Praveen <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>

gvspraveen added 3 commits August 16, 2023 16:31

add service yaml for nlp

cc30e52

Documentation fixes

646dc41

Fix instructions

da71665

kevin85421 reviewed Aug 17, 2023

gvspraveen and others added 2 commits August 16, 2023 19:24

Apply suggestions from code review

b3b6044

Co-authored-by: Kai-Hsun Chen <[email protected]> Signed-off-by: Praveen <[email protected]>

Fix tolerations comment

727c922

kevin85421 reviewed Aug 17, 2023

review comments

596b28a

kevin85421 reviewed Aug 17, 2023

docs/guidance/stable-diffusion-rayservice.md (outdated)

Update docs/guidance/stable-diffusion-rayservice.md

d077c71

Signed-off-by: Kai-Hsun Chen <[email protected]>

kevin85421 approved these changes Aug 17, 2023

gvspraveen merged commit 1cbac51 into ray-project:master Aug 17, 2023
