-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[cluster-autoscaler][clusterapi] Add support for node autodiscovery to clusterapi provider #3314
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also need to update README for node autodiscovery configuration
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_controller.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_controller.go
Outdated
Show resolved
Hide resolved
d2b26da
to
9c2f0a4
Compare
nice Jason! i know we talked about the node discovery stuff, but i'm a little unfamiliar with the process. i'll give this a test drive. |
@elmiko The main goal was to try to leverage the existing functionality of node autodiscovery to be able to limit which MachineSets/MachineDeployments are allowed to be used for NodeGroups. If they are not specified this falls back to the previous behavior of allowing all discovered MachineSets and MachineDeployments to be used, but we may want to reconsider that at somepoint in the future. |
thanks @detiber, i appreciate the explanation =) |
Close + reopen to trigger travis rerun after fixing tests at HEAD. |
Does this need to be rebased to drop the the commits from 3312? |
Rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jason, this is looking great so far. i have a couple questions inline, but i really appreciate the thoroughness of the tests, it makes understanding this change much easier for me.
i am still doing some tests locally, but i broke my test stand when upgrading clusterapi so i probably won't finish my tests till monday.
@@ -80,7 +86,7 @@ func indexMachineByProviderID(obj interface{}) ([]string, error) { | |||
return nil, nil | |||
} | |||
|
|||
providerID, found, err := unstructured.NestedString(u.Object, "spec", "providerID") | |||
providerID, found, err := unstructured.NestedString(u.UnstructuredContent(), "spec", "providerID") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a question, is this change related to the autodiscovery?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not directly, I think I originally hit an issue while building out the tests where u.Object
was nil, and it seemed like it would be good to guard against that in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to break this change out separately if you prefer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, i'm torn. on one hand it makes it easier to follow the unstructured stuff if we keep it separate, and on the other this is a small change to pick up a few more references.
honestly, my question was just to increase my understanding of the code here. i guess i'm ok to leave it here if no one else has issues.
|
||
if found && replicas > 0 { | ||
nodegroups = append(nodegroups, ng) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm curious about the removal of this check. i know we have other checks to ensure that we don't go over the node group size, but does this change mean we will be returning all node groups even if they are full?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The size check has been pushed down into newNodeGroupFromScalableResource()
, which will return nil
if it does not have capacity to scale or has 0 replicas.
var results []*unstructured.Unstructured | ||
tracker := map[string]bool{} | ||
for _, spec := range c.autoDiscoverySpecs { | ||
resources, err := listResources(lister.ByNamespace(spec.namespace), spec.clusterName, spec.labelSelector) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not overly familiar with the client-go lister stuff, but i'm having trouble understanding something about the namespace here. if the user doesn't provide a namespace in the autodiscovery clause will this pass nil
to the lister.ByNameSpace
call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spec.namespace is a string, so it will default to "" when not set, which is equivalent to all namespaces when passed to a lister. For additional context, if one uses metav1.NamespaceAll
, it is defined as the empty string: https://godoc.org/k8s.io/apimachinery/pkg/apis/meta/v1#pkg-constants
return fmt.Errorf("size increase too large - desired:%d max:%d", nreplicas, r.maxSize) | ||
case nreplicas < r.minSize: | ||
return fmt.Errorf("size decrease too large - desired:%d min:%d", nreplicas, r.minSize) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like moving the size check here, ++
hey @detiber , i finally got my test stand working again and gave this PR a drive today. everything seemed to work as expected. i tested the namespace and label options, attempting both inclusion and exclusion. my one comment is that the node group autodiscovery stuff took me a few to understand that the constraints its looking for are on the MachineSet/MachineDeployment resources. i think you mention scalable resources in the readme, but i wonder if we should be more specific? /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: elmiko The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
[cluster-autoscaler][clusterapi] Add support for node autodiscovery to clusterapi provider
Builds on #3312