Move MaxNodeProvisionTime to NodeGroupAutoscalingOptions #5649

Merged
merged 1 commit into from
Apr 19, 2023

morshielt
Contributor

Which component does this PR apply to?

cluster-autoscaler

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR enables setting MaxNodeProvisionTime per NodeGroup (it was previously set per cluster).

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Please take a look.

Does this PR introduce a user-facing change?

MaxNodeProvisionTime can now be defined per node group using the 'maxnodeprovisiontime' key. The value of the 'max-node-provision-time' flag is still used as the default for all node groups, so this change should have no visible impact on the behavior of Cluster Autoscaler.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NA

/assign @kisieland

@k8s-ci-robot
Contributor

@morshielt: GitHub didn't allow me to assign the following users: kisieland.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Apr 3, 2023
@linux-foundation-easycla

linux-foundation-easycla bot commented Apr 3, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: morshielt / name: Maria (38464b7)

@k8s-ci-robot
Contributor

Welcome @morshielt!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Apr 3, 2023
@morshielt morshielt marked this pull request as draft April 3, 2023 12:30
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 3, 2023
@morshielt morshielt changed the title Move MaxNodeProvisionTime to NodeGroupAutoscalingOptions [WIP] Move MaxNodeProvisionTime to NodeGroupAutoscalingOptions Apr 3, 2023
@morshielt morshielt force-pushed the csr branch 2 times, most recently from d3a7c8d to 53ec856 Compare April 3, 2023 15:40
@morshielt morshielt marked this pull request as ready for review April 3, 2023 16:37
@morshielt morshielt changed the title [WIP] Move MaxNodeProvisionTime to NodeGroupAutoscalingOptions Move MaxNodeProvisionTime to NodeGroupAutoscalingOptions Apr 3, 2023
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 3, 2023
@k8s-ci-robot
Contributor

@kisieland: changing LGTM is restricted to collaborators

In response to this:

/lgtm


@BigDarkClown
Contributor

/assign @BigDarkClown

@BigDarkClown
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 17, 2023
@morshielt
Contributor Author

/assign @towca

Member

@x13n x13n left a comment

I'm not going to block this PR on this, but leaving 2 comments to think about. If you decide to ignore them, just drop the hold.

/approve
/hold

cluster-autoscaler/main.go (outdated, resolved)
@@ -111,6 +115,7 @@ type ScaleUpFailure struct {
// ClusterStateRegistry is a structure to keep track the current state of the cluster.
type ClusterStateRegistry struct {
	sync.Mutex
	context *context.AutoscalingContext
Member

I'm a bit sad we need to keep the entire context as a field here. Could we have an object that is just concerned with saying what max node provision time is for a given node (or nodegroup)? The context would then be kept there, making ClusterStateRegistry easier (well, at least not harder) to reason about.

Contributor Author

Renamed the provider interface to maxNodeProvisionTimeProvider and introduced nodeRegistrationTimeLimitProvider struct containing the provider and context. PTAL :)

Member

Thanks, although my point was really that whether or not the context is used is an implementation detail and doesn't have to be a part of the interface. So, one could imagine a staticMaxNodeProvisionTimeProvider which just returns a constant duration, as well as more advanced implementations depending on a specific use case. Regardless of what implementation of maxNodeProvisionTimeProvider interface is used, CSR should not require context passed to the constructor (i.e. New... function).

Contributor Author

@morshielt morshielt Apr 18, 2023

Thank you for the explanation. I've introduced NewDefaultMaxNodeProvisionTimeProvider, which returns an instance of maxNodeProvisionTimeProvider. The provider is kept as a field in ClusterStateRegistry and already contains the context and nodeGroupConfigProcessor when it is assigned, so the NewClusterStateRegistry signature is mostly back to what it was: it only takes the maxNodeProvisionTimeProvider as an additional argument. Please take a look :)

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 17, 2023
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 17, 2023
@morshielt morshielt requested a review from x13n April 17, 2023 14:18
@morshielt morshielt force-pushed the csr branch 2 times, most recently from 291c7a4 to 14dc76e Compare April 18, 2023 11:02
Member

@x13n x13n left a comment

Thanks, looks good overall, just two comments to consider.

cluster-autoscaler/clusterstate/clusterstate.go (outdated, resolved)
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 18, 2023
@morshielt morshielt force-pushed the csr branch 2 times, most recently from d1c7f07 to ffbfc46 Compare April 19, 2023 08:28
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2023
@x13n
Member

x13n commented Apr 19, 2023

Looks good now, thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 19, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BigDarkClown, morshielt, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@morshielt
Contributor Author

/unhold

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
5 participants