
Populate NodeResourceTopology API #1

Closed

Conversation


@swatisehgal commented Oct 13, 2023

What this PR does / why we need it:
This API is needed for the topology-aware scheduling work in Kubernetes. It was suggested that a CRD-based API in staging is not ideal and that a more suitable home should be found under the Kubernetes umbrella (kubernetes-sigs). The repo was created under kubernetes-sigs (kubernetes/org#4224), and this PR populates it.

Background information/ Additional context
NFD Topology Updater in Node Feature Discovery and the topology-aware scheduler plugin in scheduler-plugins are the two components that enable topology-aware scheduling in Kubernetes (additional details about these components can be found below). Both projects live in the kubernetes-sigs organisation but depend on https://github.com/k8stopologyawareschedwg/noderesourcetopology-api, which is not operated by the Kubernetes GitHub management team.

Additional information about NFD Topology Updater and scheduler plugin

  1. NFD Topology Updater, a daemon that uses the NodeResourceTopology API to expose hardware resource topology on a per-NUMA basis via CRs corresponding to each node in the cluster.
  2. Topology-aware scheduler plugin, an out-of-tree scheduler plugin that consumes those CRs (created by NFD Topology Updater) to make NUMA-aware scheduling decisions; a rough sketch of the API shape is shown below.
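
To make the shape of the data concrete, here is a minimal, hypothetical sketch of the kind of Go types involved: a per-node object listing NUMA zones, each reporting capacity/allocatable/available per resource. The field names below are illustrative only and are not the exact definitions added by this PR.

```
// Hypothetical, simplified sketch of the kind of types this repo provides;
// field names are illustrative, not the exact definitions in this PR.
package v1alpha1

import (
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NodeResourceTopology describes the NUMA topology of a single node.
type NodeResourceTopology struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// TopologyPolicies reports the Topology Manager policy active on the node.
	TopologyPolicies []string `json:"topologyPolicies,omitempty"`
	// Zones lists the NUMA zones of the node and their resource accounting.
	Zones []Zone `json:"zones"`
}

// Zone is one NUMA zone on the host.
type Zone struct {
	Name      string         `json:"name"`
	Type      string         `json:"type"`
	Resources []ResourceInfo `json:"resources,omitempty"`
}

// ResourceInfo reports capacity, allocatable and available amounts
// for a single resource within a zone.
type ResourceInfo struct {
	Name        string            `json:"name"`
	Capacity    resource.Quantity `json:"capacity"`
	Allocatable resource.Quantity `json:"allocatable"`
	Available   resource.Quantity `json:"available"`
}
```

Roughly speaking, NFD Topology Updater fills objects like these from the node, and the scheduler plugin reads the per-zone available figures instead of relying on node-level totals.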

Why do we need Topology aware scheduling in Kubernetes?
With the introduction of the Topology Manager in the kubelet, a pod can be scheduled on a node where the total amount of available resources satisfies its request, yet the distribution of those resources across NUMA zones cannot satisfy the configured Topology Manager policy, leading to pod admission failure. The topology-aware scheduler aims to let the scheduler make topology-aware placement decisions, optimizing cluster-wide workload performance. The sketch below illustrates the failure mode.
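
A minimal sketch, assuming a simplified two-NUMA-zone node and CPU counts only (numbers hypothetical):

```
// Sketch (simplified, hypothetical numbers): a request can fit the node-level
// total while no single NUMA zone can satisfy it, which is the case a
// topology-aware scheduler needs to detect before the kubelet rejects the pod.
package main

import "fmt"

func main() {
	zoneFreeCPUs := []int{26, 26} // free CPUs per NUMA zone; 52 free in total
	requestedCPUs := 30           // pod request

	total := 0
	fitsSingleZone := false
	for _, free := range zoneFreeCPUs {
		total += free
		if free >= requestedCPUs {
			fitsSingleZone = true
		}
	}

	fmt.Println("fits node total:        ", requestedCPUs <= total) // true
	fmt.Println("fits a single NUMA zone:", fitsSingleZone)         // false

	// The default scheduler only performs the first check; with a restrictive
	// Topology Manager policy (e.g. single-numa-node) the kubelet rejects the
	// pod at admission because the second check fails.
}
```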

Which issue(s) this PR fixes:
kubernetes/community#6308
kubernetes/kubernetes#84869

Additional notes/documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

Signed-off-by: Swati Sehgal <[email protected]>
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: swatisehgal

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the following labels on Oct 13, 2023: cncf-cla: yes (the PR's author has signed the CNCF CLA), approved (the PR has been approved by an approver from all required OWNERS files), size/XXL (the PR changes 1000+ lines, ignoring generated files), do-not-merge/invalid-owners-file (the PR should not merge because it has an invalid OWNERS file in it).
@swatisehgal force-pushed the k8s-sigs-nrt-api branch 2 times, most recently from 7cdce58 to 8404673 on October 13, 2023 13:29
@k8s-ci-robot removed the do-not-merge/invalid-owners-file label on Oct 13, 2023
@swatisehgal
Author

/hold
to prevent inadvertent merge

@k8s-ci-robot added the do-not-merge/hold label (a PR should not merge because someone has issued a /hold command) on Oct 13, 2023
@ffromani left a comment

non-binding LGTM, want to have another pass later
question inside

go.mod Outdated
Comment on lines 7 to 9
k8s.io/apimachinery v0.22.3
k8s.io/client-go v0.22.3
k8s.io/code-generator v0.22.3


why so old versions?

Author

I will bump the deps to a newer version. Thanks for pointing this out.
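
For example (illustrative only; the exact pins will depend on when the bump lands), moving to the then-current v0.28.x line would look roughly like this in go.mod:

```
require (
	// Illustrative pin to a newer release line; not necessarily the exact
	// versions used in the final change.
	k8s.io/apimachinery v0.28.2
	k8s.io/client-go v0.28.2
	k8s.io/code-generator v0.28.2
)
```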

Author

I added additional commits but if squashing of commits is preferred, I can do that.

@swatisehgal force-pushed the k8s-sigs-nrt-api branch 2 times, most recently from ae94654 to 6dd5f63 on October 18, 2023 17:48
@marquiz left a comment

Thanks @swatisehgal for the PR. And apologies for the slow response time 🙄

Looks generally good, just a few small comments below.

Resolved review threads: README.md (outdated), go.mod, pkg/apis/topology/v1alpha1/register.go (outdated), manifests/cr-instance-example.yaml
Signed-off-by: Swati Sehgal <[email protected]>
Signed-off-by: Swati Sehgal <[email protected]>
Signed-off-by: Swati Sehgal <[email protected]>
Signed-off-by: Swati Sehgal <[email protected]>
@swatisehgal
Author

/assign @liggitt @thockin
for API reviews based on the suggestion here.

@vsoch

vsoch commented Dec 19, 2023

hey folks! My team is interested in this work (and I could help if needed). Is there a status update / what are next steps?

@swatisehgal
Author

swatisehgal commented Dec 19, 2023

> hey folks! My team is interested in this work (and I could help if needed). Is there a status update / what are next steps?

Hi @vsoch, the current status is that we are waiting for this PR to be reviewed by the Kubernetes API reviewers. I am conscious that we are close to the holiday season and a lot of people are already out, so I will actively solicit reviews from the API reviewers at the beginning of the new year.

If you have the bandwidth and would like to, you are more than welcome to review the PR and let us know your feedback.

@vsoch left a comment

A few comments - still reviewing!

Resolved review threads: API.md (5 threads, one on an outdated diff)
We get the following warnings if we continue using generate-groups.sh, so we are updating to kube_codegen.sh instead, as per the recommendation:

```
WARNING: generate-groups.sh is deprecated.
WARNING: Please use k8s.io/code-generator/kube_codegen.sh instead.
```

Signed-off-by: Swati Sehgal <[email protected]>
@swatisehgal force-pushed the k8s-sigs-nrt-api branch 2 times, most recently from a39ed45 to 20b997d on January 25, 2024 12:50
@swatisehgal
Author

@liggitt A gentle ping for API reviews.

@liggitt

liggitt commented Mar 14, 2024

will route this one to @thockin, it is much more aligned with the DRA reviews he's been doing and he has the topology stuff much more in his head

Present in https://github.com/orgs/kubernetes/projects/169/views/3?sliceBy%5Bvalue%5D=thockin

@liggitt removed their assignment Mar 14, 2024
@swatisehgal
Author

> will route this one to @thockin, it is much more aligned with the DRA reviews he's been doing and he has the topology stuff much more in his head
>
> Present in https://github.com/orgs/kubernetes/projects/169/views/3?sliceBy%5Bvalue%5D=thockin

Thanks!

@thockin

thockin commented Mar 15, 2024 via email

@vsoch

vsoch commented Mar 15, 2024

He's leaaaavin' on a jet plane... ✈️

@swatisehgal
Author

> Hi Swati, I am about to get on a plane for KubeCon - is this OK to wait until after?

Hi Tim, sure no problem. Have a great KubeCon!

@swatisehgal
Author

@thockin A gentle ping to review this PR. Thanks!

allocatable: 51
available: 51
capacity: 52
name: node-1
Member

-  name:     node-1

Same applies to line 12.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) on Jul 24, 2024
@SergeyKanzhelev
Member

/assign @johnbelamaric

@SergeyKanzhelev
Member

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label on Jul 26, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Oct 24, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label (denotes an issue or PR that has aged beyond stale and will be auto-closed) and removed the lifecycle/stale label on Nov 23, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Labels

  • approved (Indicates a PR has been approved by an approver from all required OWNERS files)
  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA)
  • do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command)
  • lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed)
  • size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files)