
implement GetOptions for GCE #4236

Merged
merged 1 commit into from
Aug 23, 2021

Conversation

bpineau
Contributor

@bpineau bpineau commented Aug 3, 2021

Support per-MIG (scaledown) settings as permitted by the
cloudprovider's interface GetOptions() method.

@k8s-ci-robot added the "cncf-cla: yes" and "size/L" labels Aug 3, 2021
@bpineau
Contributor Author

bpineau commented Aug 3, 2021

Maciej, would you mind having a look? This is entirely standing on your amazing "per Nodegroups configs" work!
/assign @MaciekPytel

return nil, err
}

if opt, ok := getFloat64Option(options, template.Name, "scaledownutilizationthreshold"); ok {
Contributor

These strings occur in other cloud provider PRs as well. Is it possible to make them global constants?

Contributor

Given that the way to pass the config differs between providers, do we really see value in shared string consts? I can easily see providers striving to make them consistent with the rest of the config for a given provider (e.g. using the same prefix), so we can't guarantee they would be portable anyway.

That being said I would also prefer if the strings were consts instead (whether defined at provider level or as something like DefaultScaleDownUtilizationThresholdKey next to NodeGroupAutoscalingOptions definition).

Contributor

Currently we have in-code constants that the author defined (I assume) that happen to be exactly the same as on other cloud providers. We could try to push it more towards a standard.
But if you prefer local constants then I'm fine with this solution as well. It will still be better than magical strings hidden in code :).

Contributor Author

Made those shared constants (we don't often have opportunities to enforce some consistency ;), and changed the AWS and Azure PRs to draft until this gets in (so I'll refresh them to pick up those symbols + feedback from this PR).
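
For readers following this thread, here is a minimal sketch of what shared option-key constants and a lookup helper could look like. Only the literal "scaledownutilizationthreshold" and the call shape of getFloat64Option come from the diff above; the constant name and the helper body are assumptions made for illustration, not the merged code.

```go
package main

import (
	"fmt"
	"strconv"
)

// ScaleDownUtilizationThresholdKey is a hypothetical name for a shared
// constant; only its string value appears in the diff under review.
const ScaleDownUtilizationThresholdKey = "scaledownutilizationthreshold"

// getFloat64Option looks up key in options and parses it as a float64.
// The signature mirrors the call in the diff; the body is an assumption.
// It reports false when the key is absent or the value does not parse.
func getFloat64Option(options map[string]string, templateName, key string) (float64, bool) {
	raw, found := options[key]
	if !found {
		return 0, false
	}
	value, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		fmt.Printf("ignoring invalid value %q for option %q in template %q: %v\n",
			raw, key, templateName, err)
		return 0, false
	}
	return value, true
}

func main() {
	options := map[string]string{ScaleDownUtilizationThresholdKey: "0.7"}
	if v, ok := getFloat64Option(options, "my-template", ScaleDownUtilizationThresholdKey); ok {
		fmt.Println("threshold:", v)
	}
}
```

With a named constant, a typo in the key becomes a compile error instead of a silently ignored option, which is the main benefit raised in the review.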

return make(map[string]string), nil
}

return parseKeyValueListToMap(optionsAsString)
Contributor

@MaciekPytel MaciekPytel Aug 16, 2021

I'd consider logging the values obtained from template (the actual parsed value). Knowing when the config changes and what the new value is seems useful for debugging.
However, we should only do this if we refresh rarely enough (see my other comment).

Contributor Author

Good point, added a log entry as part of the refactoring suggested in your other comment.
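
As an illustration of the suggestion in this thread, the sketch below parses a key-value list and logs the parsed options only when they differ from the previous refresh. The function names, the "k1=v1,k2=v2" input format, and the use of fmt for logging are assumptions; the real parseKeyValueListToMap and the log entry added in the PR may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parseKeyValueList is a hypothetical stand-in for parseKeyValueListToMap.
// It assumes a comma-separated "k1=v1,k2=v2" format.
func parseKeyValueList(s string) (map[string]string, error) {
	result := make(map[string]string)
	if strings.TrimSpace(s) == "" {
		return result, nil
	}
	for _, pair := range strings.Split(s, ",") {
		parts := strings.SplitN(pair, "=", 2)
		if len(parts) != 2 || strings.TrimSpace(parts[0]) == "" {
			return nil, fmt.Errorf("malformed key-value pair %q", pair)
		}
		result[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
	return result, nil
}

// optionsChanged reports whether two parsed option maps differ.
func optionsChanged(prev, cur map[string]string) bool {
	if len(prev) != len(cur) {
		return true
	}
	for k, v := range cur {
		if pv, ok := prev[k]; !ok || pv != v {
			return true
		}
	}
	return false
}

func main() {
	var previous map[string]string
	// Simulate three refreshes; only the first and third change the options.
	for _, raw := range []string{
		"scaledownutilizationthreshold=0.7",
		"scaledownutilizationthreshold=0.7",
		"scaledownutilizationthreshold=0.5",
	} {
		current, err := parseKeyValueList(raw)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		if optionsChanged(previous, current) {
			fmt.Printf("autoscaling options changed: %v\n", current)
		}
		previous = current
	}
}
```

Logging only on change keeps the log useful for debugging without emitting a line on every refresh.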

if err != nil {
return nil, err
}
return m.templates.BuildAutoscalingOptions(template, defaults)
Contributor

I don't think this should be called here directly:

  • GetOptions is similar to MinSize() and MaxSize(): CA code calls it at any point in the loop and we generally assume that:
    • The value returned should not change during CA loop.
    • The call is "free" from a performance point of view (i.e. no API calls should happen as a result).

The latter is probably already true due to caching in migInstanceTemplatesProvider, but I think the way to update any provider level settings should be to fetch it in Refresh() call and only return cached value here. In this case I propose to move extractAutoscalingOptionsFromKubeEnv and associated logic into forceRefresh() call and only apply the options to defaults in GetMigOptions.

WDYT?

Contributor Author

Updated as suggested
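
The pattern proposed in this thread (fetch provider-level settings during Refresh, return only cached values from the getter) can be sketched roughly as follows. All type and field names here are simplified, hypothetical stand-ins; this is not the code from the PR, and the real NodeGroupAutoscalingOptions has more fields.

```go
package main

import (
	"fmt"
	"sync"
)

// NodeGroupOptions is a simplified stand-in for NodeGroupAutoscalingOptions.
type NodeGroupOptions struct {
	ScaleDownUtilizationThreshold float64
}

// migOverrides holds per-MIG overrides; a nil field means "use the default".
type migOverrides struct {
	ScaleDownUtilizationThreshold *float64
}

// manager caches overrides so that reads never trigger an API call.
type manager struct {
	mu     sync.RWMutex
	fetch  func() (map[string]migOverrides, error) // hypothetical API hook
	cached map[string]migOverrides
}

// Refresh is the only place that calls fetch. On error the previous
// cache is kept, so values stay stable for the rest of the loop.
func (m *manager) Refresh() error {
	fresh, err := m.fetch()
	if err != nil {
		return err
	}
	m.mu.Lock()
	m.cached = fresh
	m.mu.Unlock()
	return nil
}

// GetMigOptions applies cached overrides on top of defaults without I/O.
func (m *manager) GetMigOptions(mig string, defaults NodeGroupOptions) NodeGroupOptions {
	m.mu.RLock()
	defer m.mu.RUnlock()
	opts := defaults
	if o, ok := m.cached[mig]; ok && o.ScaleDownUtilizationThreshold != nil {
		opts.ScaleDownUtilizationThreshold = *o.ScaleDownUtilizationThreshold
	}
	return opts
}

func main() {
	threshold := 0.7
	m := &manager{fetch: func() (map[string]migOverrides, error) {
		return map[string]migOverrides{
			"mig-a": {ScaleDownUtilizationThreshold: &threshold},
		}, nil
	}}
	defaults := NodeGroupOptions{ScaleDownUtilizationThreshold: 0.5}
	if err := m.Refresh(); err != nil {
		fmt.Println("refresh failed:", err)
	}
	fmt.Println(m.GetMigOptions("mig-a", defaults)) // override applied
	fmt.Println(m.GetMigOptions("mig-b", defaults)) // defaults kept
}
```

Because the getter only reads the cache, it satisfies both assumptions listed in the review: the value does not change between refreshes, and calling it costs no API requests.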

@bpineau bpineau force-pushed the autoscaling-options-gce branch 2 times, most recently from a5183ad to 84f5dcb Compare August 19, 2021 13:38
@bpineau
Contributor Author

bpineau commented Aug 19, 2021

@mwielgus and @MaciekPytel Thanks for the reviews; updated following the suggestion + re-tested, please PTAL.

}
kubeEnvValue, err := getKubeEnvValueFromTemplateMetadata(template)
if err != nil {
klog.Warningf("Failed to extract KubeEnv from %q instance template's metadata: %v", template.Name, err)
Contributor

nit: this is a very generic warning (we get a lot of stuff from kube env in template). I like the ones above more, as they make it clear where the error is coming from.

Contributor Author

Right, updated to make it clear this comes from extracting autoscaling options.
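
One common way to address this kind of nit is to wrap the low-level error with the name of the operation that failed, so the message says where it came from. The sketch below assumes a hypothetical helper and a plain metadata map with a "kube-env" key; it is not the wording or the code that was merged.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoKubeEnv is a hypothetical sentinel for a missing kube-env entry.
var errNoKubeEnv = errors.New("kube-env not found in template metadata")

// extractAutoscalingOptions is a hypothetical helper. It wraps the
// low-level error with the operation and the template name.
func extractAutoscalingOptions(templateName string, metadata map[string]string) (string, error) {
	kubeEnv, ok := metadata["kube-env"]
	if !ok {
		return "", fmt.Errorf("extracting autoscaling options from %q instance template: %w",
			templateName, errNoKubeEnv)
	}
	return kubeEnv, nil
}

func main() {
	_, err := extractAutoscalingOptions("my-template", map[string]string{})
	if err != nil {
		fmt.Println("warning:", err)
		// The wrapped cause is still available to callers.
		fmt.Println("missing kube-env:", errors.Is(err, errNoKubeEnv))
	}
}
```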

@MaciekPytel
Contributor

/lgtm
/approve
/hold
Left a hold for a minor nit. Please feel free to remove the hold if you disagree, or fix the nit.

@k8s-ci-robot added the "do-not-merge/hold" and "lgtm" labels Aug 20, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bpineau, MaciekPytel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the "approved" label Aug 20, 2021
Support per-MIG (scaledown) settings as permitted by the
cloudprovider's interface `GetOptions()` method.
@bpineau bpineau force-pushed the autoscaling-options-gce branch from 84f5dcb to d905ec2 Compare August 21, 2021 16:19
@k8s-ci-robot removed the "lgtm" label Aug 21, 2021
@MaciekPytel
Contributor

/lgtm
/hold cancel

@k8s-ci-robot added the "lgtm" label and removed the "do-not-merge/hold" label Aug 23, 2021
@k8s-ci-robot k8s-ci-robot merged commit d09b893 into kubernetes:master Aug 23, 2021
akirillov pushed a commit to airbnb/autoscaler that referenced this pull request Oct 27, 2022