chore: added new graph object for nim runtimes #334

Open
wants to merge 6 commits into
base: incubating

@TomerFi TomerFi commented Dec 11, 2024

Description

Added new graph objects for NIM models.

Work doc
Jira: NVPE-51

How Has This Been Tested?

A custom image was deployed to our dev cluster. After deploying a NIM model to the cluster, I checked the relevant x-metrics-dashboard configmap for the graph objects. I then sent invocations to the model to generate data and executed the instantiated queries against it. See the doc for more details.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that they work.

openshift-ci bot commented Dec 11, 2024

Hi @TomerFi. Thanks for your PR.

I'm waiting for an opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

"queries": [
{
"title": "GPU cache usage over time",
"query": "TODO"
Member

Will this be added for now?

Contributor Author

Yes, we're looking for some clarifications about this graph. But this will be included in this PR ASAP.

openshift-ci bot commented Dec 18, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: TomerFi
Once this PR has been reviewed and has the lgtm label, please assign mwaykole for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -277,7 +277,7 @@ const (
"queries": [
{
"title": "GPU cache usage over time",
-"query": "TODO"
+"query": "round(sum(increase(gpu_cache_usage_perc{namespace='${NAMESPACE}', pod=~'${MODEL_NAME}-predictor-.*'}[${RATE_INTERVAL}])))"
Member

@spolti spolti Dec 20, 2024

I think this rate interval needs to be hardcoded; IIRC, the dashboard does not set it.

Contributor Author

We set it to 1m here: https://github.com/opendatahub-io/odh-model-controller/blob/main/controllers/utils/utils.go#L404.

That's a good catch. I wanted to use the REQUEST_RATE_INTERVAL, which is 5m, but I mistakenly used the RATE_INTERVAL.
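
To illustrate the mix-up described above, here is a minimal sketch of how the ${...} placeholders in a query template could be substituted. The interval values (1m and 5m) come from this comment; the renderQuery helper, the constant names, and the sample namespace and model name are hypothetical and are not the controller's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// Interval values as stated in the comment above (RATE_INTERVAL = 1m,
// REQUEST_RATE_INTERVAL = 5m). The names below are assumptions made for
// this sketch, not identifiers from odh-model-controller.
const (
	rateInterval        = "1m"
	requestRateInterval = "5m"
)

// renderQuery fills the ${...} placeholders of a query template.
// Hypothetical helper: the real substitution logic may differ.
func renderQuery(template, namespace, modelName string) string {
	r := strings.NewReplacer(
		"${NAMESPACE}", namespace,
		"${MODEL_NAME}", modelName,
		"${RATE_INTERVAL}", rateInterval,
		"${REQUEST_RATE_INTERVAL}", requestRateInterval,
	)
	return r.Replace(template)
}

func main() {
	// Query shape taken from the diff; %s marks where the interval placeholder goes.
	const tmpl = "round(sum(increase(gpu_cache_usage_perc{namespace='${NAMESPACE}', pod=~'${MODEL_NAME}-predictor-.*'}[%s])))"

	// Both placeholders resolve without error, so using the wrong one
	// changes the window (1m instead of 5m) without any visible failure.
	fmt.Println(renderQuery(fmt.Sprintf(tmpl, "${RATE_INTERVAL}"), "demo", "my-model"))
	fmt.Println(renderQuery(fmt.Sprintf(tmpl, "${REQUEST_RATE_INTERVAL}"), "demo", "my-model"))
}
```

Because both placeholders are valid, a mistake like this one can only be caught by review or by inspecting the rendered query in the configmap.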

Contributor Author

I have more changes coming after yesterday's meeting. Pushing soon.

Contributor Author

In 7785941, I added a constant template replacement, "KV_CACHE_SAMPLING_RATE", with the value 24h for the KV cache sampling. I'm not sure we'll stick with that value, but I don't want to block the frontend work on our development environment.
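
As a rough sketch of what a constant replacement like this could look like, the snippet below substitutes ${KV_CACHE_SAMPLING_RATE} with 24h and reports any placeholders still left unresolved. The placeholder name and the 24h value come from this comment; applyConstants, the regular expression, and the sample template are assumptions made for illustration and are not taken from commit 7785941.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Value taken from the comment above; the Go constant name is an assumption.
const kvCacheSamplingRate = "24h"

// placeholder matches ${UPPER_CASE} tokens that have not been substituted yet.
var placeholder = regexp.MustCompile(`\$\{[A-Z_]+\}`)

// applyConstants replaces the constant placeholder and returns the result
// together with any placeholders that remain for later substitution.
// Hypothetical helper, not the controller's actual code.
func applyConstants(template string) (string, []string) {
	out := strings.ReplaceAll(template, "${KV_CACHE_SAMPLING_RATE}", kvCacheSamplingRate)
	return out, placeholder.FindAllString(out, -1)
}

func main() {
	// Illustrative template only; the real query in the commit may differ.
	q, remaining := applyConstants("gpu_cache_usage_perc{namespace='${NAMESPACE}'}[${KV_CACHE_SAMPLING_RATE}]")
	fmt.Println(q)
	fmt.Println(remaining)
}
```

Keeping the value in a single constant means a later change to the sampling window would touch one place rather than every query that uses it.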

3 participants