Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: respect externally managed annotation on unmanaged MachinePools #2588

Conversation

mweibel
Copy link
Contributor

@mweibel mweibel commented Aug 22, 2022

What type of PR is this?
/kind feature

What this PR does / why we need it:
This adds support for marking MachinePool resources as externally managed. Similar to #2444 but for unmanaged clusters. It does not rely on a property in AzureMachinePool since, unlike in AzureManagedMachinePool`, this property does not exist.

Special notes for your reviewer:
Please note that this is just a Draft PR in order to get some feedback on the approach. I reused code and methodology from #2444 in the hope that it matches the existing code style.
I did not yet add tests since I just added the code and I'm trying it on a local cluster.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

adds the ability to annotate a MachinePool with `cluster.x-k8s.io/replicas-managed-by-autoscaler` to synchronize VMSS capacity with MachinePool replicas automatically.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 22, 2022
@k8s-ci-robot
Copy link
Contributor

Hi @mweibel. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mweibel mweibel force-pushed the feat-externally-managed-replicas-unmanaged branch from 6af0b84 to 46e860c Compare August 23, 2022 19:21
@CecileRobertMichon
Copy link
Contributor

#2444 got merged so you should be able to reuse the code from it now if you rebase

See also kubernetes-sigs/cluster-api#7107 which is relevant / in progress

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2022
@CecileRobertMichon
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 23, 2022
@mweibel mweibel force-pushed the feat-externally-managed-replicas-unmanaged branch from 46e860c to 9b325d8 Compare August 25, 2022 06:44
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 25, 2022
@mweibel mweibel marked this pull request as ready for review August 25, 2022 06:45
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 25, 2022
@k8s-ci-robot k8s-ci-robot requested a review from Jont828 August 25, 2022 06:45
@mweibel
Copy link
Contributor Author

mweibel commented Aug 25, 2022

See also kubernetes-sigs/cluster-api#7107 which is relevant / in progress

yes I saw that, thanks. Once that PR is merged and released, we can adjust CAPZ code to use the const from CAPI.

@mweibel mweibel force-pushed the feat-externally-managed-replicas-unmanaged branch 5 times, most recently from 959efc4 to f7ecf82 Compare August 25, 2022 12:28
@mweibel
Copy link
Contributor Author

mweibel commented Aug 25, 2022

I tested using a separate empty commit and the tests ran through. I squashed them together and now there's again a conflict in the e2e test.

What can I do about this?

@CecileRobertMichon
Copy link
Contributor

/retest

the failure above looks like a flake

@CecileRobertMichon
Copy link
Contributor

/cc @jackfrancis

@jackfrancis
Copy link
Contributor

@mweibel this is an interesting use-case. So the idea is that you might want to use cluster-autoscaler on your cluster, but not leverage the Cluster API provider, and instead use the Azure provider (which updates replicas directly via the Azure VMSS API)?

Or are there other use cases you're thinking of?

I'm wondering if there's a better, general workflow to indicate this rather than requiring the user to manually add a well-known annotation to the corresponding capi MachinePool resource?

@CecileRobertMichon
Copy link
Contributor

/lifecyle active

@mweibel
Copy link
Contributor Author

mweibel commented Nov 9, 2022

@mweibel this is an interesting use-case. So the idea is that you might want to use cluster-autoscaler on your cluster, but not leverage the Cluster API provider, and instead use the Azure provider (which updates replicas directly via the Azure VMSS API)?

yes exactly. I contributed the same to CAPA already. The use case we have currently is that the cluster-api autoscaler does not yet have enough functionality to replace the native azure one. Crucial features like taints/labels and separate cool down times etc. are not available yet and will probably not for a little while (especially the taints since that's onlyin a proposal how to manage it in CAPI itself).

I'm wondering if there's a better, general workflow to indicate this rather than requiring the user to manually add a well-known annotation to the corresponding capi MachinePool resource?

I settled for an annotation since there was discussion on CAPA about this (annotations vs. a separate field) and they preferred an annotation. It also kind of fits together with the changes in kubernetes-sigs/cluster-api#7107.

@CecileRobertMichon
Copy link
Contributor

/milestone v1.7

@k8s-ci-robot k8s-ci-robot added this to the v1.7 milestone Dec 1, 2022
@jackfrancis
Copy link
Contributor

/assign

@jackfrancis
Copy link
Contributor

/assign @dthorsen

@mweibel mweibel force-pushed the feat-externally-managed-replicas-unmanaged branch from d78f9ce to 66a3529 Compare December 2, 2022 08:55
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Dec 2, 2022

@mweibel: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-azure-e2e-exp 46e860c link false /test pull-cluster-api-provider-azure-e2e-exp

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jackfrancis
Copy link
Contributor

@mweibel this lgtm, I think we need to rebase this PR on top of latest main to overcome the E2E flake

Thanks again for this, let's try to land this before mid-next week so it can be included in the 1.7. release!


// HasReplicasExternallyManaged returns true if the externally managed annotation is set on the CAPI MachinePool resource.
func (m *MachinePoolScope) HasReplicasExternallyManaged(ctx context.Context) bool {
return m.MachinePool.Annotations[azure.ReplicasManagedByAutoscalerAnnotation] == "true"
Copy link
Contributor

@CecileRobertMichon CecileRobertMichon Jan 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jackfrancis do we need a PR to get rid of this CAPZ specific annotation in favor of the CAPI one that you added?

kubernetes-sigs/cluster-api#7107

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should even be able to get rid of this function altogether and directly use the ReplicasManagedByExternalAutoscaler() helper from CAPI in ReconcileReplicas below

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR, yes, replacing that func w/ the CAPI one should work. For AzureManaged, we'll need to incorporate some back-compat w/ the capz annotation (I can do that as a separate PR).

@CecileRobertMichon
Copy link
Contributor

I think we need to rebase this PR on top of latest main to overcome the E2E flake

Prow does the rebase when running tests

/retest

@CecileRobertMichon
Copy link
Contributor

CecileRobertMichon commented Jan 7, 2023

as discussed in https://kubernetes.slack.com/archives/CEX9HENG7/p1673034488923059, let's merge this and refactor to use the CAPI helper function in a separate PR: #2993

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 7, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 7, 2023
@k8s-ci-robot k8s-ci-robot merged commit 158ec48 into kubernetes-sigs:main Jan 7, 2023
@mweibel mweibel deleted the feat-externally-managed-replicas-unmanaged branch March 21, 2023 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants