Node classification mechanism based on workload profiles #2461

Closed
digambar15 opened this issue Feb 26, 2020 · 8 comments
Labels
priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done.

Comments

@digambar15

digambar15 commented Feb 26, 2020

User Story

As a user/operator, I would like to introduce a mechanism that classifies nodes based on workload profiles, so that I can find the right node before deploying any workloads/applications.

Detailed Description

Cluster-api will create multiple clusters on any cloud, be it VMware, GCP, AWS, bare metal, etc.
Suppose a user is looking for a worker node on which to run their workloads in pods, but has very specific requirements. For example, the node must have a minimum amount of memory, storage, or CPUs, or the workload requires a very specific hardware configuration on the node. In that case, the user has to go through every cluster to find the right match, which is a difficult job.

So I am proposing a classification framework in which the user defines what hardware configuration their workload requires; this CRD/mechanism will then filter all the worker nodes to find the ones that match the workload profile.

This approach will not only help with filtering hosts, but will also make it possible to implement many more classification filters/algorithms, which will make users' lives easier in the future.
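To make the idea concrete, here is a minimal sketch of the matching step, not a proposed implementation. The `NodeResources` and `WorkloadProfile` shapes, their field names, and the sample clusters are all hypothetical; a real version would presumably read this data from the clusters rather than from an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResources:
    """Hardware summary for one worker node (hypothetical shape)."""
    cluster: str
    name: str
    cpus: int
    memory_gib: int
    storage_gib: int


@dataclass(frozen=True)
class WorkloadProfile:
    """Minimum hardware a workload needs (hypothetical shape)."""
    name: str
    min_cpus: int = 0
    min_memory_gib: int = 0
    min_storage_gib: int = 0

    def matches(self, node: NodeResources) -> bool:
        # A node matches only if it meets every minimum in the profile.
        return (
            node.cpus >= self.min_cpus
            and node.memory_gib >= self.min_memory_gib
            and node.storage_gib >= self.min_storage_gib
        )


def classify(profile, nodes):
    """Return the nodes, across all clusters, that satisfy the profile."""
    return [n for n in nodes if profile.matches(n)]


# Sample inventory spanning two made-up clusters.
nodes = [
    NodeResources("aws-prod", "worker-0", cpus=4, memory_gib=16, storage_gib=100),
    NodeResources("aws-prod", "worker-1", cpus=16, memory_gib=64, storage_gib=500),
    NodeResources("vsphere-dev", "worker-0", cpus=32, memory_gib=128, storage_gib=200),
]
profile = WorkloadProfile("db-heavy", min_cpus=8, min_memory_gib=64, min_storage_gib=400)

matched = classify(profile, nodes)
for n in matched:
    print(f"{n.cluster}/{n.name}")
```

With this sample data, only the node that meets all three minimums is returned, so the user gets a cluster/node pair instead of walking every cluster by hand.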

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 26, 2020
@detiber
Member

detiber commented Feb 27, 2020

@digambar15 I'm trying to understand the requirements/UX you are looking for here.

At a high level, I would say that this feels very much analogous to node feature discovery, however that would only be helpful in the context of a single workload cluster.

We could potentially replicate the labels that node-feature-discovery adds to Nodes to the Machines/MachinePools that we manage in Cluster API, but that still feels like an awkward UX to me for the end user (query all Machines in Clusters that I have access to and filter on these labels), just to find the right Cluster to target and then replicate those labels as a Node Selector on the deployment that I create.

I feel like to achieve a good user experience around this, we need to target at least a portion of the functionality towards either developer tooling or an application management layer. With that being the case, it is not too much of a stretch to also have that layer do the Machine -> Node mapping for the node feature detection labels rather than relying on Cluster API to ensure that the labels are in sync on a Machine resource.
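A rough sketch of the flow described above, under the assumption that Machine labels have already been collected per cluster into plain dictionaries. The function names, cluster names, and machine names are hypothetical, and the label key is only written in the style that node-feature-discovery uses; nothing here calls a real Cluster API or Kubernetes client.

```python
def matches_selector(labels, selector):
    """True if every key/value pair in selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def clusters_matching(machines_by_cluster, selector):
    """Map cluster name -> names of Machines whose labels satisfy selector."""
    result = {}
    for cluster, machines in machines_by_cluster.items():
        hits = [name for name, labels in machines.items()
                if matches_selector(labels, selector)]
        if hits:
            result[cluster] = hits
    return result


def deployment_node_selector(selector):
    """Build the fragment to merge into a Deployment so pods land on matching Nodes."""
    return {"spec": {"template": {"spec": {"nodeSelector": dict(selector)}}}}


# Label key in the node-feature-discovery style; treat the exact key as illustrative.
AVX2 = "feature.node.kubernetes.io/cpu-cpuid.AVX2"

# Hypothetical snapshot of Machine labels, grouped by workload cluster.
machines_by_cluster = {
    "aws-prod": {
        "machine-a": {AVX2: "true"},
        "machine-b": {},
    },
    "gcp-dev": {
        "machine-c": {},
    },
}

selector = {AVX2: "true"}
targets = clusters_matching(machines_by_cluster, selector)
patch = deployment_node_selector(selector)
print(targets)
print(patch)
```

The two-step shape (find the cluster first, then repeat the same labels as a node selector) is what makes this feel better suited to tooling above Cluster API than to the Machine resource itself.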

@vincepri
Member

vincepri commented Mar 2, 2020

/milestone Next
/kind design
/priority awaiting-more-evidence

@k8s-ci-robot k8s-ci-robot added this to the Next milestone Mar 2, 2020
@k8s-ci-robot k8s-ci-robot added kind/design Categorizes issue or PR as related to design. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Mar 2, 2020
@digambar15
Author

digambar15 commented Mar 12, 2020

Hey @detiber, @vincepri,

Sorry for the late reply.

As an operator or user, I have workloads/applications that need a very specific hardware configuration on my cluster. Finding the right matching host in a cluster is a headache, and on top of that, we have to go and check whether the matching hardware configuration is actually available on that host.
Agreed, this idea relies on cluster-api, but it does not need any modification in cluster-api directly.
However, looking at future requirements, we need some kind of classification framework to filter hosts for this kind of scenario. I am thinking that in these cases, we should fetch only unowned hosts under a MachineSet and label them for a specific hardware/workload profile.
In the future, we can add many classification filters, such as:
Minimum, RAMFilter, DiskFilter, CPUFilter, GPUFilter, etc.
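As an illustration of how such filters might compose, here is a small sketch. Everything in it is an assumption made for the example: the `Host` shape, treating "unowned" as a host with no owner recorded, the filter helper names, and the `example.com/workload-profile` label key.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Host:
    """Hypothetical view of a host under a MachineSet."""
    name: str
    ram_gib: int
    disk_gib: int
    cpus: int
    gpus: int = 0
    owner: Optional[str] = None  # None is taken to mean "unowned" (assumption)
    labels: Dict[str, str] = field(default_factory=dict)


HostFilter = Callable[[Host], bool]


def ram_filter(min_gib: int) -> HostFilter:
    return lambda host: host.ram_gib >= min_gib


def disk_filter(min_gib: int) -> HostFilter:
    return lambda host: host.disk_gib >= min_gib


def cpu_filter(min_cpus: int) -> HostFilter:
    return lambda host: host.cpus >= min_cpus


def gpu_filter(min_gpus: int) -> HostFilter:
    return lambda host: host.gpus >= min_gpus


def label_matching_hosts(hosts: List[Host], filters: List[HostFilter],
                         profile_name: str,
                         label_key: str = "example.com/workload-profile") -> List[str]:
    """Label every unowned host that passes all filters; return their names."""
    labelled = []
    for host in hosts:
        if host.owner is not None:
            continue  # only unowned hosts are considered
        if all(f(host) for f in filters):
            host.labels[label_key] = profile_name
            labelled.append(host.name)
    return labelled


hosts = [
    Host("host-a", ram_gib=256, disk_gib=1000, cpus=32, gpus=2),
    Host("host-b", ram_gib=256, disk_gib=1000, cpus=32, gpus=2, owner="team-x"),
    Host("host-c", ram_gib=32, disk_gib=200, cpus=8),
]
ml_training = [ram_filter(128), cpu_filter(16), gpu_filter(1)]
labelled = label_matching_hosts(hosts, ml_training, "ml-training")
print(labelled)
```

Keeping each filter as an independent predicate means a new one can be added to the list without touching the labelling loop.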

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 10, 2020
@randomvariable
Member

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 17, 2020
@vincepri vincepri removed kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. labels Oct 22, 2020
@vincepri
Member

Closing this in favor of #3150

This information might be exposed in the future, and folks should be able to use it outside of Cluster API

@vincepri
Member

/close

@k8s-ci-robot
Contributor

@vincepri: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


6 participants