Nodegroup Interface for per-nodegroup configuration #3583
I've put some more thoughts on why I prefer a cloudprovider-based implementation over a configmap in #2989 (comment).
I'd like to pick this up and implement it for 1.21. Here is my proposal:
Note that the design allows customization at the NodeGroup level, not at the level of individual nodes. I think this is a good thing for the reasons mentioned in #2989 (comment).

Appendix: per-nodegroup options

This is a tentative list of options that I think make sense to specify at the NodeGroup level. The list is subject to change during implementation (for example, if it turns out that the logic of a particular option can't easily be changed to apply per-nodegroup).
There are a few other settings that would be fairly easy to make per-nodegroup (e.g. IgnoreDaemonSetsUtilization), but I'm not yet sure they make conceptual sense at the NodeGroup level. I will look at them and extend this list as needed. Another set of fields to potentially migrate are the ones related to how often a node is checked for scale-down (UnremovableNodeRecheckTimeout, etc.). This is quite easy to set per-nodegroup; however, setting the value too low for one NodeGroup can starve scale-down in other groups, so I'm not sure we want to allow that.
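To make the shape of the proposal more concrete, here is a minimal sketch of what the per-NodeGroup options and the new cloudprovider method could look like. The type name, the field set, and the exact signature are assumptions made for illustration, not the final API; the real list of fields is the subject of the appendix above.

```go
// Sketch only: type names, fields and the method signature are assumptions.
package sketch

import "time"

// NodeGroupAutoscalingOptions holds the subset of autoscaling options that a
// NodeGroup may override. The two fields below are just examples drawn from
// this discussion (scale-down-utilization-threshold, scale-down timings).
type NodeGroupAutoscalingOptions struct {
	ScaleDownUtilizationThreshold float64
	ScaleDownUnneededTime         time.Duration
}

// NodeGroup shows only the proposed addition; the existing NodeGroup methods
// (Id, TargetSize, IncreaseSize, ...) are omitted from this sketch.
type NodeGroup interface {
	// GetOptions returns the options to use for this group. It receives the
	// cluster-wide defaults so a provider can return a selectively modified
	// copy; a nil result means "no overrides, keep the defaults".
	GetOptions(defaults NodeGroupAutoscalingOptions) (*NodeGroupAutoscalingOptions, error)
}
```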
This generally makes sense to me. I think the processor interface and the split between the autoscaling options and the per-node-group options seem reasonable. The main question I am left with is: how will users be able to associate a set of options with a specific node group?
The idea is to leave this part up to each cloudprovider. This proposal is basically just a framework that allows providers to expose the feature to users in whatever way makes the most sense for a given provider. The rationale is that the way you configure CA is already very provider-specific.

Each provider can instead choose to add these additional settings to its existing way of configuring NodeGroups. For example, in the case of AWS it could be done via ASG tags, similarly to how the various scale-from-0 settings are configured (or not :), though that decision is with the AWS owners; with GCE it could be included in the autodiscovery specs, and so on. I'm not sufficiently familiar with other providers to make good suggestions. I tried to make the processor flexible enough to hopefully fit each provider's use case.
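As an illustration of the "provider chooses its own mechanism" idea, a provider that wanted to use ASG-style tags could implement GetOptions roughly as below. The awsNodeGroup type, its tags field, and the tag key are all invented for this sketch (it reuses the NodeGroupAutoscalingOptions type from the earlier sketch); whether AWS or any other provider exposes the feature this way is entirely up to that provider's owners.

```go
package sketch

import (
	"fmt"
	"strconv"
)

// awsNodeGroup is a stand-in for a provider's NodeGroup implementation; the
// type and its tags field are invented for this illustration.
type awsNodeGroup struct {
	tags map[string]string // e.g. tags copied from the backing ASG
}

// GetOptions reads per-group overrides from hypothetical ASG-style tags. The
// tag key below is made up; the real mechanism (if any) would be chosen by
// the provider's owners.
func (ng *awsNodeGroup) GetOptions(defaults NodeGroupAutoscalingOptions) (*NodeGroupAutoscalingOptions, error) {
	opts := defaults // start from the cluster-wide defaults, override selectively
	if v, ok := ng.tags["k8s.io/cluster-autoscaler/scale-down-utilization-threshold"]; ok {
		threshold, err := strconv.ParseFloat(v, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid scale-down-utilization-threshold %q: %v", v, err)
		}
		opts.ScaleDownUtilizationThreshold = threshold
	}
	return &opts, nil
}
```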
Thanks @MaciekPytel, that makes good sense to me. I am also wondering how users will interact with this mechanism: would each provider be responsible for exposing flags, or were you thinking about storing the configuration in some other way?
Each provider would be responsible for exposing a way for users to configure the feature and for implementing NodeGroup.GetOptions() (or a completely different implementation of the processor if needed). All the default implementation will do is poll NodeGroup.GetOptions() each loop and apply any provided overrides. For a given provider, doing nothing results in no change in behavior: all NodeGroups will use the default settings. At any point, any provider interested in supporting the feature can add their own implementation (this is the same pattern as scale-from-0).
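A rough sketch of that default behavior, again reusing the illustrative types from above: the processor asks each NodeGroup for its options once per loop and falls back to the cluster-wide defaults whenever the provider returns nothing. The function name is made up for this sketch.

```go
package sketch

// effectiveOptions resolves the options to use for one NodeGroup in one loop
// iteration: per-group overrides if the provider supplies them, otherwise the
// cluster-wide defaults. A provider that does nothing changes no behavior.
func effectiveOptions(ng NodeGroup, defaults NodeGroupAutoscalingOptions) (NodeGroupAutoscalingOptions, error) {
	opts, err := ng.GetOptions(defaults)
	if err != nil {
		return defaults, err
	}
	if opts == nil {
		// No overrides provided: keep the defaults, behavior is unchanged.
		return defaults, nil
	}
	return *opts, nil
}
```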
I think it would be useful if we could have a standard methodology for providers to reference their configuration data, for example a command-line flag pointing to a YAML file or something, just to help contain the possible matrix of options that providers could choose from. Aside from that, I don't have an objection to what you are describing. Thanks for the extra detail =)
Specifically for Cluster API: wouldn't something like an annotation on an MD or MS be the most convenient way of configuring those settings? I'm not saying it's the right way to do it, I'm just looking for your opinion. Anyway, the 'framework' changes above are completely independent from any mechanism for actually setting the values, so I think it's safe to implement the mechanics first and add a flag later if we decide we want one.
This is the first step of implementing kubernetes#3583 (comment). A new method was added to the cloudprovider interface. All existing providers were updated with a no-op stub implementation that results in no behavior change. The config values specified per NodeGroup are not yet applied.
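For reference, the no-op stub described above could be as simple as returning nil so that the cluster-wide defaults keep applying. This is a sketch using the illustrative types from earlier; the exact return convention in the real code may differ.

```go
package sketch

// stubNodeGroup stands in for any provider that has not adopted the feature.
type stubNodeGroup struct{}

// GetOptions returns nil, meaning "no per-NodeGroup overrides": the
// cluster-wide defaults keep applying and behavior is unchanged.
func (ng *stubNodeGroup) GetOptions(defaults NodeGroupAutoscalingOptions) (*NodeGroupAutoscalingOptions, error) {
	return nil, nil
}
```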
This is the implementation of kubernetes#3583 (comment).
I've opened #3789 with the implementation.
Yeah, adding an annotation to an MD or MS would be a very straightforward way to do this from the capi perspective, and it would give users an easy way to interact as well.

Agreed, makes sense to me!
Forking this off a thread in Slack.
The motivation behind this is to have some of the CA parameters operate at the nodegroup level, for example to have a different scale-down-utilization-threshold for every nodegroup instead of a cluster-wide setting.

From @marwanad:
From @MaciekPytel: