Node classification mechanism based on workload profiles #2461

digambar15 · 2020-02-26T23:57:10Z

User Story

As a user/operator, I would like to introduce a mechanism that classifies nodes based on workload profiles, so that I can find the right matching node before deploying any workloads/applications.

Detailed Description

Cluster API can create multiple clusters on any infrastructure, be it VMware, GCP, AWS, bare metal, etc.
Suppose a user is looking for a worker node on which to run their workloads in pods, and has very specific requirements: the node must have a minimum amount of memory, storage, or CPUs, or the workload needs a very specific hardware configuration. Today the user has to go through every cluster to find the right match, which is a difficult job.

So I am proposing a classification framework in which the user defines the hardware configuration required for their workload, and a CRD/mechanism filters the worker nodes to find the ones that match that workload profile.

This approach would not only help with filtering hosts; it would also make it possible to implement many more classification filters/algorithms, which will make users' lives easier in the future.

/kind feature

detiber · 2020-02-27T02:57:32Z

@digambar15 I'm trying to understand the requirements/UX you are looking for here.

At a high level, I would say that this feels very much analogous to node feature discovery, however that would only be helpful in the context of a single workload cluster.

We could potentially replicate the labels that node-feature-discovery adds to Nodes to the Machines/MachinePools that we manage in Cluster API, but that still feels like an awkward UX to me for the end user (query all Machines in Clusters that I have access to and filter on these labels), just to find the right Cluster to target and then replicate those labels as a Node Selector on the deployment that I create.
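To make the query-and-filter flow described above concrete, here is a minimal sketch. It assumes the Node labels have already been copied onto Machines and collected into a plain dictionary; the `inventory` layout and the function names are hypothetical and are not part of Cluster API or node-feature-discovery.

```python
from typing import Dict, List

# Hypothetical inventory: cluster name -> machine name -> labels copied from the Node.
Labels = Dict[str, str]
Inventory = Dict[str, Dict[str, Labels]]


def matches(labels: Labels, selector: Labels) -> bool:
    """Equality-based match: every selector key must be present with the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def clusters_with_match(inventory: Inventory, selector: Labels) -> List[str]:
    """Return the sorted names of clusters that have at least one matching Machine."""
    return sorted(
        cluster
        for cluster, machines in inventory.items()
        if any(matches(labels, selector) for labels in machines.values())
    )
```

The user would then still have to repeat the same selector as a node selector on the workload they deploy to the chosen cluster, which is the awkward part of this UX.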

I feel like to achieve a good user experience around this, we need to target at least a portion of the functionality towards either developer tooling or an application management layer. With that being the case, it is not too much of a stretch to also have that layer do the Machine -> Node mapping for the node feature detection labels rather than relying on Cluster API to ensure that the labels are in sync on a Machine resource.

vincepri · 2020-03-02T17:38:29Z

/milestone Next
/kind design
/priority awaiting-more-evidence

digambar15 · 2020-03-12T12:53:07Z

Hey @detiber, @vincepri,

Sorry for the late reply.

As an operator or user, I have workloads/applications that need a very specific hardware configuration in my cluster. Finding the right matching host in the cluster is a headache, and on top of that we have to go and check whether the matching hardware configuration is actually available on that host.
Agreed, this idea relies on Cluster API, but it does not need any modification to Cluster API directly.
Looking at future requirements, though, we need some kind of classification framework to filter hosts for this kind of scenario. I am thinking that in these cases we should fetch only the unowned hosts under a MachineSet and label them for a specific hardware/workload profile.
In the future, we can add many classification filters, such as:
Minimum, RAMFilter, DiskFilter, CPUFilter, GPUFilter, etc.
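
A rough sketch of how filters like the ones listed above could compose. Everything here is an assumption for illustration: the `Host` fields, the filter function names, and the reading of each filter as a minimum threshold are hypothetical and do not come from an existing API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Host:
    """Hypothetical hardware summary for one unowned host."""
    name: str
    ram_gib: int
    disk_gib: int
    cpus: int
    gpus: int = 0


# A filter is any predicate over a Host.
HostFilter = Callable[[Host], bool]


def ram_filter(min_gib: int) -> HostFilter:
    return lambda host: host.ram_gib >= min_gib


def disk_filter(min_gib: int) -> HostFilter:
    return lambda host: host.disk_gib >= min_gib


def cpu_filter(min_cpus: int) -> HostFilter:
    return lambda host: host.cpus >= min_cpus


def gpu_filter(min_gpus: int) -> HostFilter:
    return lambda host: host.gpus >= min_gpus


def classify(hosts: Iterable[Host], profile: Iterable[HostFilter]) -> List[str]:
    """Return the names of hosts that pass every filter in the workload profile."""
    filters = list(profile)
    return [host.name for host in hosts if all(f(host) for f in filters)]
```

A workload profile is then just a list of filters, for example `[ram_filter(64), gpu_filter(1)]`, and the hosts returned by `classify` are the ones that would receive the profile's label.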

fejta-bot · 2020-06-10T13:20:00Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

randomvariable · 2020-06-17T12:10:52Z

/lifecycle frozen

vincepri · 2020-10-22T16:25:31Z

Closing this in favor of #3150

This information might be exposed in the future, and folks should be able to use it outside of Cluster API

vincepri · 2020-10-22T16:25:35Z

/close

k8s-ci-robot · 2020-10-22T16:25:44Z

@vincepri: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 26, 2020

k8s-ci-robot added this to the Next milestone Mar 2, 2020

k8s-ci-robot added kind/design Categorizes issue or PR as related to design. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Mar 2, 2020

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 10, 2020

k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 17, 2020

seh mentioned this issue Jun 17, 2020

Cluster Autoscaler CAPI provider should support scaling to and from zero nodes kubernetes/autoscaler#3150

Closed

vincepri removed kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. labels Oct 22, 2020

k8s-ci-robot closed this as completed Oct 22, 2020
