-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storage: Allow disabling of kOps's management of StorageClasses #10733
Storage: Allow disabling of kOps's management of StorageClasses #10733
Conversation
Note that running hack/update-expected.sh changed 109 files. I didn't want to do that, but several integration tests failed until I allowed it to do its thing. |
@@ -15,7 +15,7 @@ kind: StorageClass | |||
metadata: | |||
name: gp2 | |||
annotations: | |||
storageclass.beta.kubernetes.io/is-default-class: "false" | |||
storageclass.kubernetes.io/is-default-class: "false" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be mistaken but I think the reason I left the old annotations was because k8s was reading both still? As in, not setting storageclass.beta.kubernetes.io/is-default-class: "false"
on the gp2
class and trying to just set storageclass.kubernetes.io/is-default-class: "false"
caused the same issues with which one ended up being the default. Would have to test to confirm though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apply should remove all unset annotations.
} | ||
|
||
if c.ManageStorageClasses == nil { | ||
c.ManageStorageClasses = fi.Bool(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure you really need this file since you need to use WithDefaultBool
in the templates anyway. And you can set true
as the default value there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Since CloudConfig
isn't set, you need this. I ran into nil pointer issues in my PR as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change will break backwards compatbility with openstack. We cannot now update existing clusters because it says
spec.cloudConfig.manageStorageClasses: Forbidden: Management of storage classes and OpenStack block storage classes are both specified but disagree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
This doesn't really break BC per se. It only breaks when you have explicitly disabled storageclass on OpenStack.
If we don't set this, it will break for pretty much everyone else. The error message seems clear on what actions users need to take.
Alternative is to ignore this field for OpenStack and document as such, but that may be more confusing for users? Could potentially lead to misconfigurations.
But open to suggestions how to fix this. I suggest at minimum a clear "Actions needed" in the release notes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did migration script for our clusters which runs kops set cluster spec.cloudConfig.manageStorageClasses=false
after that it works fine. In my opinion it is pretty easy step? okay kops set is still behind featureflag? Anyways it works when executing that. So if people has disabled storageclass in openstack, they need to set manageStorageClasses=false, otherwise they cannot run kops update cluster or other commands
@@ -42,6 +42,7 @@ contents: | | |||
- --authorization-mode=AlwaysAllow | |||
- --bind-address=0.0.0.0 | |||
- --client-ca-file=/srv/kubernetes/ca.crt | |||
- --cloud-config=/etc/kubernetes/cloud.config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These do look very strange, but we make some funny assumptions about spec.cloudConfig
. I guess at some point the intention was that this flag only was for configuring the cloud.config
file, but then got (ab)used for other things later.
I think we always write cloud.config
though, so it should be safe to always reference this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That change came by way of running hack/update-expected.sh, so it wasn't a conscious decision of mine to add the flag.
/lgtm Leaving it unapproved in case someone else also wants to review. |
Another potential design we could consider here to allow more control over StorageClass objects in the future: introduce a "storageClasses" field with an "enabled" field within. It would look like this: spec:
cloudConfig:
storageClasses:
enabled: true If we had that, we could later add finer-grained control over whether and which class gets nominated as the cluster default. Do you think that would be worth it? Also, as noted in Slack, I found that when I populated my new field, all the ASG launch templates got updated, inducing machine replacement at the next kops rolling-update cluster invocation. That seems like too much churn, given that we shouldn't need to replace any machines to add a few StorageClass objects. Is that indicating that maybe "spec.cloudConfig" isn't the right place for this new field? |
The roll happens because the kube-apiserver flags change. And that happens because If you want to fix this, we need a better check around when to add the cloud-config flag. |
I saw all the launch templates change—even for worker machines that don't host the Kubernetes API server—because the "spec.cloudConfig" field's value makes it into the conf/cluster_spec.yaml file included in the user data. |
People don't change this too often though. So I am fine with leaving it. We have other options that triggers unecessary rolls, and we have changes that should trigger rolls that don't. Not ideal, but not easy to solve now. |
Yes, that's a good point. This isn't something we're likely to toggle more than once. |
ca04fd6
to
d44612c
Compare
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: olemarkus, seh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…igin-release-1.20 Automated cherry pick of #10733: Define "ManageStorageClasses" cloud config field
Introduce a new "manageStorageClasses" field in the cloud provider portion of the Cluster specification to allow operators to opt out of kOps creating and reconciling StorageClass objects. The default value for the new "spec.cloudConfig.manageStorageClasses" field is true, to maintain compatibility with the existing behavior. When set to false, kOps will no longer create any StorageClass objects. Any such objects that kOps created in the past are left as is, and kOps and will no longer reconcile them against future changes.
The existing "spec.cloudConfig.openstack.blockStorage.createStorageClass" field remains in place. However, if both that and the new "spec.cloudConfig.manageStorageClasses" field are populated, they must agree: It is invalid both to disable management of StorageClass objects globally but to enable them for OpenStack and, conversely, to enable management globally but disable it for OpenStack.
Please see this discussion in the "kops-users" channel of the "Kubernetes" Slack workspace for the negotiation that led to this change. @zetaab, take note.
This change is