-
Notifications
You must be signed in to change notification settings - Fork 579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CAPA applies AWS cloud-controller-manager tag to user-owned subnet, but does not delete the tag on cluster delete, leaving the subnet unusable by future clusters #3854
Comments
Fixing this bug means changing tagging behavior that's been present since v1.0. The possible solutions: (a) CAPA stops applying the CCM tag to user-owned subnets If we cannot do either (a) or (b) in a minor release, then I think we need to address this issue before the v2.0.0. |
Thanks for creating this for discussion @dlipovetsky. My thoughts are: (a) we can't stop applying the tag to user-owned subnets because this was a specific feature request (#2584, from @scottslowe). Reading the original request, this seems like a reasonable feature to have imho. I think that (b) is the way to go out of the 2 options.
If we go with either option, then we are not changing the API and so from that point of view changing in a minor version is OK. The question is then are we changing the behaviour in a non-backwards compatible way? I think as long as we have a way to determine that we added the CCM tag and only act on delete when we see we added the tag, then i think the behaviour stays the same for existing clusters but changes for newly created clusters. If we then start adding/deleting ccm tags on sg and route tables this is new behaviour which we add in minor versions. That's a long way of saying that I think we can do (b) in a minor version bump, so v2.1.0 What does everyone else think? |
And from the discussions on slack (https://kubernetes.slack.com/archives/CD6U2V71N/p1668469376275609 and https://kubernetes.slack.com/archives/CD6U2V71N/p1668587787725479):
If we reverted the behaviour of adding the tags to the subnets then this is a change in behaviour that was requested in #2584. |
If we did want to remove the tagging of user managed resources we could introduce this via a feature flag? |
Even I am not in favor of option (a) because of the same reason listed by @richardcase. So option (b) seems to be the way forward to me as well.
Since this is a supported behavior, IMO we should not change it at this point. And we have seen users benefitting from this, so -1 from me. |
I was unaware of the previous user request to support this. While I’d still prefer if user provided infra was wholly set up by the user, I can see it is beneficial to others already so I agree to go with option B. |
@MarcusNoble @Ankitasw - if we had an option (c) which was to gather details of usage of this current behaviour from people like @scottslowe and then consider option (a) in the future - if we give notice of deprecation of this behaviour to give time for users to change their usage and update any automation. Would this change your thoughts? |
Just catching up on the issue. Regarding the PR that added CCM tags to BYOI subnets, I implemented them per a user request and it was only done for subnets per doc, meaning there were no mention of security groups or other resources. Agree with @dlipovetsky - we should have deleted the tags CAPA added to the subnets - it was one of my early PRs in CAPA so haven't thought about it too carefully. Now, regarding options suggested
I am ok with either options as long as the pattern of applying/removing CCM tags are consistent across resources. The only thing is that I know some of our customers rely on (a) right now so we will need to give advance warning/time for them. |
catching up on the thread, I think (b) is reasonable, I don't think we can consider it as an API breaking change so I think 2.1.0 would make sense |
Also just catching up (I'm on holiday in Spain): Since my request seems to be the culprit that started a portion of this, I figured I should weigh in. 😄 The use case for me was (and is) that I use an infrastructure-as-code tool to stand up a set of shared infrastructure (VPC, subnets, route tables, gateways, etc.) that are then used/leveraged by multiple workload clusters. It's possible that I could modify my IaC code to better handle the tags, but it would be far better if CAPA could manage the tags as part of cluster lifecycle. For me, therefore, option B (removing tags on cluster delete) would be preferable/ideal. I'm happy to provide additional information if needed. |
Thanks all for your thoughtful comments! It sounds like we agree to go with
And we agree that the remaining work () can be addressed in a 2.x minor release.
I will create GH issues for these. Update: I created #3860 for route tables, and amended #3481 for security groups. |
@AverageMarcus I recognize your concern. But I'm not sure that's what users expect. For example, for every LoadBalancer-type Service, the CCM itself creates a rule in the user-owned security group. Users don't expect to create these rules on behalf of the CCM. Moreover, without the CCM tag applied to user-owned resources, the CCM will fail to function. Since the CCM tag is specific to the cluster, it should be applied on cluster create, and removed on cluster delete. While we can make the user responsible for these two actions, it seems like CAPA is in a good position to make them on behalf of the user. I would like to rely on the "principle of least surprise" here: Let's document why CAPA applies the CCM tag, namely, to ensure CCM can do its job with user-owned resources, and why CAPA removes this tag on cluster delete. |
/triage accepted |
This issue has not been updated in over 1 year, and should be re-triaged. You can:
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/ /remove-triage accepted |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
/remove-lifecycle rotten |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
/kind bug
What steps did you take and what happened:
The cloud-controller-manager (CCM) can only use tagged resources; if they're missing, it will not create the ELB to back the LoadBalancer-type Service.
The tag, which I'll refer to as the "CCM tag," is:
kubernetes.io/cluster/<name>: [owned|shared]
. Importantly, a resource may have exactly one such tag. Otherwise, the CCM will return an error.CAPA applies the CCM tag to various resources that it owns. That's fine. However, with some recent changes on the main branch, CAPA applies this tag to some resources that the user owns. These resources outlive the cluster, and CAPA does not remove the CCM tag on cluster delete. Until the CCM tag is removed from them, these resources can't be used for other clusters. That's a problem.
Part of the reason CAPA does not remove the CCM tag from these resources is because it does not know whether it applied the CCM tag, or whether the user applied it themselves. Without storing some state (e.g. using yet another tag), CAPA cannot know this.
What did you expect to happen:
CAPA should not apply the CCM tag to user-owned resources, or it should delete the CCM tag when reconciling the cluster delete.
Anything else you would like to add:
CAPA begin to apply the CCM tag to a user-owned subnet in #2715.
Environment:
v1.5.1
#3854 (comment)
The text was updated successfully, but these errors were encountered: