KubeRay Operator add-on #849

Merged
merged 19 commits into from
Feb 29, 2024

Conversation

freschri
Collaborator

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@freschri freschri changed the title from "Kuberay operator add-on" to "KubeRay Operator add-on" Sep 29, 2023
@freschri freschri linked an issue Sep 29, 2023 that may be closed by this pull request
1 task
Collaborator

@elamaran11 elamaran11 left a comment

@freschri Nice work, but I have a few questions:

  1. Where is the ask from the partner side to build this addon? That said, it will be an AWSome addition to Blueprints.
  2. Are you planning to build a complete pattern that shows usage of KubeRay with a complete workload? We could also build a pattern with observability for this. If so, please create an issue in the patterns and observability repo.
  3. The update to the doc index and mkdocs is missing too.

docs/addons/kuberay-operator.md (resolved)
lib/addons/index.ts (resolved)
@freschri
Collaborator Author

The idea is to build the CDK equivalent of DoEKS's JARK stack, also presented here: https://aws.amazon.com/blogs/containers/deploy-generative-ai-models-on-amazon-eks/

Collaborator

@elamaran11 elamaran11 left a comment

@freschri The doc index is still not updated, so this won't appear on the list-of-addons page. Also, can you make sure you test the addon with blueprint-construction, running it whole, and let us know if it works? We can then run e2e.

elamaran11 previously approved these changes Sep 30, 2023
Collaborator

@elamaran11 elamaran11 left a comment

LGTM. @shapirov103 Please check from your end.

Collaborator

@shapirov103 shapirov103 left a comment

@freschri looks great, a couple of comments.

import { Construct } from 'constructs';
import * as blueprints from '@aws-quickstart/eks-blueprints';

export class DatadogConstruct extends Construct {
Collaborator

Why are we using DatadogConstruct here?

Collaborator Author

Sorry, copy-paste error. Done.

/**
* User provided options for the Helm Chart
*/
export interface KubeRayAddOnProps extends HelmAddOnUserProps {
Collaborator

Are there any common values that customers should configure when deploying this addon, that would make sense to promote to this struct and make explicit for the customers?

Collaborator Author

There are no config values; please see here: https://ray-project.github.io/kuberay/deploy/helm/
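
For readers following this exchange, here is a minimal sketch of what an add-on with no promoted fields means for callers: overrides are limited to the generic Helm settings plus free-form chart values. The names `HelmUserPropsSketch`, `KubeRayPropsSketch`, and `resolveProps` are hypothetical stand-ins and not the Blueprints API; the `ray-system` namespace and `exampleKey` value are placeholders. The default name, chart, and namespace come from the snippet in this PR, and the version assumes the GA 1.0.0 release requested in review.

```typescript
// Hypothetical stand-in for the generic Helm user props; the real type in
// @aws-quickstart/eks-blueprints may have different fields.
interface HelmUserPropsSketch {
    name?: string;
    namespace?: string;
    chart?: string;
    version?: string;
    values?: Record<string, unknown>;
}

// Mirrors the struct under review: nothing KubeRay-specific is promoted.
type KubeRayPropsSketch = HelmUserPropsSketch;

// Defaults from the PR snippet, assuming the GA version requested in review.
const defaultProps = {
    name: "kuberay-operator",
    chart: "kuberay-operator",
    namespace: "default",
    version: "1.0.0",
};

// Shallow-merge user overrides onto the defaults; chart values are copied as-is.
function resolveProps(userProps: KubeRayPropsSketch = {}) {
    return {
        ...defaultProps,
        ...userProps,
        values: { ...(userProps.values ?? {}) },
    };
}

// Placeholder override: only the namespace and one illustrative value change.
const resolved = resolveProps({
    namespace: "ray-system",
    values: { exampleKey: "exampleValue" },
});
console.log(resolved.namespace, resolved.chart, resolved.version);
```

With this shape, anything a customer needs to tune goes through `values`, which is the trade-off the reviewer's question is probing.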

Collaborator

@shapirov103 shapirov103 left a comment

@freschri please see my comment. We are targeting the 1.13 release this week; there is a chance this addon can make it in.

name: "kuberay-operator",
chart: "kuberay-operator",
namespace: "default",
version: "1.0.0-rc.0",
Collaborator

Why use an RC version for the release? I see they have 1.0.0, so let's use that. We generally avoid using non-GA versions as the default for the addon (with rare exceptions).
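
The GA-only convention in this comment could be checked mechanically. The following is a minimal sketch, assuming chart versions follow semantic versioning; `isPreRelease` is a hypothetical helper and not part of Blueprints or Helm.

```typescript
// Returns true when a semver-style version carries a pre-release tag such as "-rc.0".
function isPreRelease(version: string): boolean {
    // Drop build metadata ("+...") so a hyphen inside it is not misread.
    const [core = ""] = version.split("+");
    // A pre-release is MAJOR.MINOR.PATCH followed by "-" and an identifier.
    return /^v?\d+\.\d+\.\d+-.+$/.test(core);
}

// The two versions discussed in this review.
for (const v of ["1.0.0-rc.0", "1.0.0"]) {
    console.log(`${v}: ${isPreRelease(v) ? "pre-release" : "GA"}`);
}
```

A check like this could run in a unit test over each add-on's default version, with an allow-list for the rare exceptions mentioned above.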

@shapirov103
Collaborator

@freschri please ignore the broken Markdown links in the above check. We are addressing them in a separate PR.

@elamaran11
Collaborator

@freschri Are you planning to close out the comments on this PR?

@shapirov103
Collaborator

@freschri let's address the minor feedback, and please push something to retrigger the GH actions. I suppose the MD check that is failing should succeed now.

@freschri
Collaborator Author

freschri commented Feb 5, 2024

@shapirov103 done

@shapirov103
Collaborator

/do-e2e-tests

@aws-ia-ci aws-ia-ci left a comment

end to end tests failed. A maintainer can provide more details.

@shapirov103
Collaborator

/do-e2e-tests

@aws-ia-ci aws-ia-ci left a comment

end to end tests failed. A maintainer can provide more details.

@shapirov103
Collaborator

@freschri I am consistently getting a failure on stack destroy, so it cannot pass e2e. Have you tried dropping (deleting) the blueprint stack with this addon in place?

@shapirov103
Collaborator

/do-e2e-tests

@aws-ia-ci aws-ia-ci left a comment

end to end tests failed. A maintainer can provide more details.

failing after conflict resolution
@shapirov103
Collaborator

/do-e2e-tests

@aws-ia-ci aws-ia-ci left a comment

end to end tests failed. A maintainer can provide more details.

@shapirov103
Collaborator

The stack destroy consistently leaves one ENI behind in a secondary subnet. This will have to be resolved before we include it in the release. @freschri

@shapirov103
Collaborator

/do-e2e-tests

@aws-ia-ci aws-ia-ci left a comment

end to end tests passed

Collaborator

@shapirov103 shapirov103 left a comment

LGTM

@shapirov103 shapirov103 merged commit 187fec4 into main Feb 29, 2024
2 checks passed
@shapirov103 shapirov103 deleted the kuberay-operator-addon branch February 29, 2024 18:24
Successfully merging this pull request may close these issues.

KubeRay Operator add-on
4 participants