Add NIM flag logic #312

Merged

Conversation

trujillm
Contributor

Add NIM flag logic

Description

This change reads the NIM flag state (removed or managed) from params.env and adds it to the environment variables for odh-model-controller to use.

How Has This Been Tested?

Unit tests

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Contributor

openshift-ci bot commented Nov 26, 2024

Hi @trujillm. Thanks for your PR.

I'm waiting for an opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

config/base/params.env (outdated)
@trujillm
Contributor Author

@spolti do you know if I need to change something for the image tag failure?

Contributor

@israel-hdez israel-hdez left a comment

Please, change the target branch to incubating.
The main branch is now a stable/downstream branch.

@Jooho Jooho changed the base branch from main to incubating November 26, 2024 17:33
@trujillm trujillm requested a review from israel-hdez November 26, 2024 18:01
tgis-image=quay.io/opendatahub/text-generation-inference:stable-eba83ba
ovms-image=quay.io/opendatahub/openvino_model_server:2024.3-release-4c8c52c
vllm-image=quay.io/opendatahub/vllm:stable-849f0f5
nim-state=removed
Member

In opendatahub-io/opendatahub-operator#1330, if nim-state is not specified in the DSC, the Operator passes down "managed" as the default.
But here, if no value comes from the Operator, the default is "removed".
Isn't that inconsistent?
If the UX requirement is that the state is always "managed" unless the user explicitly sets "removed" in the DSC, then this should be nim-state=managed here, right?
In practice, it does not matter whether params.env says managed, removed, or "", since the Operator will always overwrite it. But when odh-model-controller runs without the Operator, setting "removed" here differs from the "managed" default there, which also causes confusion in the code.

Contributor Author

@zdtsw I think the intent here was to add removed explicitly as a safeguard, in case someone ever changes params.env without taking modelmesh into consideration.

@Jooho
Contributor

Jooho commented Nov 29, 2024

/retest

Contributor

@israel-hdez israel-hdez left a comment

@trujillm I reviewed the commit and it looks OK to me. Please just rebase on top of incubating so the PR brings in only your changes.

@@ -1 +1 @@
vllm-gaudi-image=quay.io/opendatahub/vllm:fast-gaudi
Contributor

@trujillm could you rebase your PR on incubating?
It looks like you made your changes on top of the main branch, which is no longer our default branch, so the PR shows a lot of unrelated changes because of the different base branch.

Contributor Author

@israel-hdez I believe I have corrected this. Please verify and let me know if you need anything else from my side.

@israel-hdez
Contributor

/ok-to-test

Signed-off-by: mtrujillo <[email protected]>
Contributor

@israel-hdez israel-hdez left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Dec 2, 2024
Contributor

openshift-ci bot commented Dec 2, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, spolti, trujillm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 9d6de07 into opendatahub-io:incubating Dec 2, 2024
6 checks passed
openshift-merge-bot bot pushed a commit that referenced this pull request Jan 16, 2025
* update global ca bundle logic and storage-config logic to follow up odh operator pr(1339) (#308)

Signed-off-by: jooho lee <[email protected]>

* disable dashboard and fix servingruntime display name

Signed-off-by: jooho lee <[email protected]>

* Use the main branch to build stable image tags, incubating for latest image tags (#316)

Signed-off-by: Hannah DeFazio <[email protected]>

* [RHOAIENG-13638] - Do not allow isvc creation in protected isvc (#311)

* [RHOAIENG-13638] - Do not allow isvc creation in protected namespace

chore: Fixes [RHOAIENG-13638] - Kserve model is not Ready after a kserve model is created and deleted from istio-system namespace

Signed-off-by: Spolti <[email protected]>

* review suggestions

Signed-off-by: Spolti <[email protected]>

* Update controllers/webhook/isvc_validator.go

Co-authored-by: Edgar Hernández <[email protected]>
Signed-off-by: Spolti <[email protected]>

---------

Signed-off-by: Spolti <[email protected]>
Co-authored-by: Edgar Hernández <[email protected]>

* update gitaction based on branch strategy change (#322)

Signed-off-by: jooho lee <[email protected]>

* feat: added performance metric grpahs config for nvidia nim (#320)

* feat: added performance metric grpahs config for nvidia nim

Signed-off-by: Tomer Figenblat <[email protected]>

* chore: modifyed the runtime id annotation

Co-authored-by: Edgar Hernández <[email protected]>
Signed-off-by: Tomer Figenblat <[email protected]>

---------

Signed-off-by: Tomer Figenblat <[email protected]>
Co-authored-by: Edgar Hernández <[email protected]>

* Add NIM flag logic (#312)

Signed-off-by: mtrujillo <[email protected]>

* Grab the old release tag based on creation date

Signed-off-by: Hannah DeFazio <[email protected]>

* Updated the checkout code command

Signed-off-by: Mariah Holder <[email protected]>

* Updated the checkout code command (#329)

Signed-off-by: Mariah Holder <[email protected]>
Co-authored-by: Mariah Holder <[email protected]>

* Add reconciliation for Kserve Raw (#274)

Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>

* chore: added pagination support for nim catalog response (#332)

Signed-off-by: Tomer Figenblat <[email protected]>

* feat(mr): enable model registry inference reconcile (#326)

Signed-off-by: Alessio Pragliola <[email protected]>

* add upstream release metadata (#333)

Signed-off-by: heyselbi <[email protected]>

* Migration to kubebuilder v4 (#324)

* Migration to kubebuilder v4

Signed-off-by: Edgar Hernández <[email protected]>

* Restore MR E2Es

Signed-off-by: Edgar Hernández <[email protected]>

* Restore top-level files

Signed-off-by: Edgar Hernández <[email protected]>

* Cleaning

Signed-off-by: Edgar Hernández <[email protected]>

* Fixing Makefile and Containerfile

Signed-off-by: Edgar Hernández <[email protected]>

* Linter fixes

Signed-off-by: Edgar Hernández <[email protected]>

* Initial rework of manifests

Signed-off-by: Edgar Hernández <[email protected]>

* Fix manifests

Signed-off-by: Edgar Hernández <[email protected]>

* Fix lint issues

Signed-off-by: Edgar Hernández <[email protected]>

* Deactivate E2Es

Because setup is not automated, yet.

Signed-off-by: Edgar Hernández <[email protected]>

* Feedback: Filippe

Signed-off-by: Edgar Hernández <[email protected]>

* Feedback: Filippe

Test differences after `go mod tidy`

Signed-off-by: Edgar Hernández <[email protected]>

* Apply suggestions from code review: Filippe

Co-authored-by: Filippe Spolti <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>

* Feedback: Filippe

* Pin go-toolset base image in Containerfile.
* Add `gosec` linter

Signed-off-by: Edgar Hernández <[email protected]>

* Update config/prometheus/monitor.yaml

Co-authored-by: Filippe Spolti <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>

* Feedback: Filippe

* Small change to comments in Makefile, to make the text clearer.
* Remove (again) `gosec` linter

Signed-off-by: Edgar Hernández <[email protected]>

* Fix panic on controller startup

Signed-off-by: Edgar Hernández <[email protected]>

---------

Signed-off-by: Edgar Hernández <[email protected]>
Co-authored-by: Filippe Spolti <[email protected]>

* chore: use naming convention for resources created by nim (#340)

* chore: use naming convention for resources created by nim

Signed-off-by: Tomer Figenblat <[email protected]>

* test: added assertions for dyamic nim resources name

Signed-off-by: Tomer Figenblat <[email protected]>

---------

Signed-off-by: Tomer Figenblat <[email protected]>

* chore: set nim runtime api call page size to 1000 (#344)

Signed-off-by: Tomer Figenblat <[email protected]>

* Nim enablement change default to managed and add clean up job (#342)

* initial commit for clean up of nim and managed set as default

Signed-off-by: mtrujillo <[email protected]>

* remove space

Signed-off-by: mtrujillo <[email protected]>

* fix code length for linting

Signed-off-by: mtrujillo <[email protected]>

* fixed comments / adjusted import

Signed-off-by: mtrujillo <[email protected]>

---------

Signed-off-by: mtrujillo <[email protected]>

* chore: added new graph object for nim runtimes (#334)

* chore: added new graph object for nim runtimes

Signed-off-by: Tomer Figenblat <[email protected]>

* chore: added REQUEST_OUTCOMES nim graph

Signed-off-by: Tomer Figenblat <[email protected]>

* chore: added fixed typo in nim query object

Signed-off-by: Tomer Figenblat <[email protected]>

* chore: fixed typo in nim query object

Signed-off-by: Tomer Figenblat <[email protected]>

* chore: added initial query for nim gpu cache usage

Signed-off-by: Tomer Figenblat <[email protected]>

* chore: rewrite queries for nim new graphs

Signed-off-by: Tomer Figenblat <[email protected]>

---------

Signed-off-by: Tomer Figenblat <[email protected]>

* Update ovms to current build (#343)

Signed-off-by: Steve Grubb <[email protected]>
Co-authored-by: Steve Grubb <[email protected]>

* Automatically inject expected ODH annotations to InferenceGraph and InferenceServices (#339)

* Implementation of ODH defaulters for InferenceGraph and InferenceService

On creation of InferenceGraph or InferenceService resources, the following default annotations will be added:
* `serving.knative.openshift.io/enablePassthrough: true`
* `sidecar.istio.io/inject: true`
* `sidecar.istio.io/rewriteAppHTTPProbers: true`

The annotations are added only for Serverless mode, and only if they are missing.
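
A minimal sketch of the "add only if missing" behaviour described above, assuming the annotations are handled as a plain `map[string]string`. The helper name `applyDefaultAnnotations` and the `serverless` flag are hypothetical; only the three annotation keys and their values come from the commit message.

```go
package main

import "fmt"

// Annotation keys and values as listed in the commit message.
var defaultAnnotations = map[string]string{
	"serving.knative.openshift.io/enablePassthrough": "true",
	"sidecar.istio.io/inject":                        "true",
	"sidecar.istio.io/rewriteAppHTTPProbers":         "true",
}

// applyDefaultAnnotations is a hypothetical helper. It adds each default
// annotation only when the key is absent, and does nothing outside
// Serverless mode. Values already set by the user are never overwritten.
func applyDefaultAnnotations(annotations map[string]string, serverless bool) map[string]string {
	if !serverless {
		return annotations
	}
	if annotations == nil {
		annotations = make(map[string]string, len(defaultAnnotations))
	}
	for key, value := range defaultAnnotations {
		if _, exists := annotations[key]; !exists {
			annotations[key] = value
		}
	}
	return annotations
}

func main() {
	// The user explicitly disabled sidecar injection; that choice is kept.
	got := applyDefaultAnnotations(map[string]string{"sidecar.istio.io/inject": "false"}, true)
	fmt.Println(got["sidecar.istio.io/inject"], len(got))
}
```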

Signed-off-by: Edgar Hernández <[email protected]>

* Feedback: Filippe

Extract "ENABLE_WEBHOOKS" string to constant

Signed-off-by: Edgar Hernández <[email protected]>

---------

Signed-off-by: Edgar Hernández <[email protected]>

* Authorization for InferenceGraph (Serverless) (#345)

* Authorization for InferenceGraph (Serverless)

This adds a new controller for KServe InferenceGraph resources. This new controller will have the responsibility of creating Authorino AuthConfig resources (similarly to InferenceServices case), when authorization is available in ODH platform.

InferenceGraphs can now be annotated with `security.opendatahub.io/enable-auth: "true"` to secure InferenceGraphs and only serve requests that are authorized.
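
A minimal sketch of how the opt-in annotation above could be checked, assuming annotations are available as a `map[string]string`. The helper name `isAuthRequested` is hypothetical, and the exact matching rules used by the controller are not stated in this commit message; the sketch only treats the literal string "true" as enabling authorization.

```go
package main

import "fmt"

// Annotation key taken from the commit message.
const enableAuthAnnotation = "security.opendatahub.io/enable-auth"

// isAuthRequested is a hypothetical helper. It reports whether the
// annotation is present and set to the string "true". Reading from a
// nil map is safe in Go and yields the empty string.
func isAuthRequested(annotations map[string]string) bool {
	return annotations[enableAuthAnnotation] == "true"
}

func main() {
	fmt.Println(isAuthRequested(map[string]string{enableAuthAnnotation: "true"}))
	fmt.Println(isAuthRequested(nil))
}
```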

Signed-off-by: Edgar Hernández <[email protected]>

* Feedback: Filippe - Event when auth is not available

Signed-off-by: Edgar Hernández <[email protected]>

---------

Signed-off-by: Edgar Hernández <[email protected]>

* [RHOAIENG-10293] add metrics resources for rawdeployment (#347)

* [RHOAIENG-10293] add metrics resources for rawdeployment

Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>

* [RHOAIENG-10293] address feedback

Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>

---------

Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>

* [RHOAIENG-16851] rawdeployment route bug fixes (#341)

Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>

* fix null pointer error (RHOAIENG-18228) (#349)

Signed-off-by: jooho lee <[email protected]>

* remove old file

Signed-off-by: jooho lee <[email protected]>

update go.mod

Signed-off-by: jooho lee <[email protected]>

---------

Signed-off-by: jooho lee <[email protected]>
Signed-off-by: Hannah DeFazio <[email protected]>
Signed-off-by: Spolti <[email protected]>
Signed-off-by: Tomer Figenblat <[email protected]>
Signed-off-by: mtrujillo <[email protected]>
Signed-off-by: Mariah Holder <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Alessio Pragliola <[email protected]>
Signed-off-by: heyselbi <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Steve Grubb <[email protected]>
Co-authored-by: Hannah DeFazio <[email protected]>
Co-authored-by: Filippe Spolti <[email protected]>
Co-authored-by: Edgar Hernández <[email protected]>
Co-authored-by: Tomer Figenblat <[email protected]>
Co-authored-by: Marcus Trujillo <[email protected]>
Co-authored-by: Mariah Holder <[email protected]>
Co-authored-by: Mariah Holder <[email protected]>
Co-authored-by: Vedant Mahabaleshwarkar <[email protected]>
Co-authored-by: Tomer Figenblat <[email protected]>
Co-authored-by: Alessio Pragliola <[email protected]>
Co-authored-by: Selbi Nuryyeva <[email protected]>
Co-authored-by: Steven Grubb <[email protected]>
Co-authored-by: Steve Grubb <[email protected]>

6 participants