Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StatefulSet's VolumeClaimTemplates ownerReference cannot be patched from apiVersion v1beta1 to v1 #2192

Closed
sebgl opened this issue Nov 29, 2019 · 4 comments · Fixed by #2217
Assignees
Labels
>bug Something isn't working v1.0.0

Comments

@sebgl
Copy link
Contributor

sebgl commented Nov 29, 2019

There's a problem with StatefulSets updates when upgrading from ECK using crd v1beta1 to ECK using crd v1 and existing clusters.

To reproduce:

  • use the Elasticsearch CRD v1beta1 (only)
  • deploy an Elasticsearch cluster
  • update to use CRD v1 by applying the new operator manifests
  • the reconciliation attempts to update the existing StatefulSets because their VolumeClaimTemplates ownerReference has changed from
 "volumeClaimTemplates": [
    {
      "metadata": {
        "creationTimestamp": null,
        "name": "elasticsearch-data",
        "ownerReferences": [
          {
            "apiVersion": "elasticsearch.k8s.elastic.co/v1beta1",
            "blockOwnerDeletion": false,
            "controller": true,
            "kind": "Elasticsearch",
            "name": "elasticsearch-sample",
            "uid": "7c589a2d-11f5-11ea-8528-42010a840041"
          }
        ]

to:

 "volumeClaimTemplates": [
    {
      "metadata": {
        "creationTimestamp": null,
        "name": "elasticsearch-data",
        "ownerReferences": [
          {
            "apiVersion": "elasticsearch.k8s.elastic.co/v1",   <--------- HERE
            "blockOwnerDeletion": false,
            "controller": true,
            "kind": "Elasticsearch",
            "name": "elasticsearch-sample",
            "uid": "7c589a2d-11f5-11ea-8528-42010a840041"
          }
        ]
  • it returns the following error, since the volumeClaimTemplates section of a StatefulSet cannot be updated:
"Failed to apply spec change: StatefulSet.apps 
\"elasticsearch-sample-es-default\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden"
@sebgl sebgl added >bug Something isn't working v1.0.0 labels Nov 29, 2019
@sebgl sebgl mentioned this issue Nov 29, 2019
@sebgl
Copy link
Contributor Author

sebgl commented Nov 29, 2019

Some ideas to move forward:

  1. Reconsider how we implemented automatic PVC deletion in Remove PersistentVolumeClaims when removing Elasticsearch nodes #1736 to not use ownerReferences.
  2. Keep using v1beta1 in the ownerReferences forever. Relates Existing v1beta1 owner references prevent upgrades to v1 #2191 (comment).
  3. Delete then recreate the StatefulSet while trying to preserve running Pods.

I don't really like any of them.

@pebrc
Copy link
Collaborator

pebrc commented Nov 29, 2019

  1. Reconsider how we implemented automatic PVC deletion in Remove PersistentVolumeClaims when removing Elasticsearch nodes #1736 to not use ownerReferences

If I remember correctly we only introduced this as an optimisation or convenience for users because we did run into issues with re-use of PVCs when recreating clusters with the same name, which is probably mostly a dev issue to begin with. So that sounds like a good option to consider?

  1. Keep using v1beta1 in the ownerReferences forever. Relates Existing v1beta1 owner references prevent upgrades to v1 #2191 (comment).

I am not sure what the implications are here. Does the GC still work if the owner reference is pointing to a version of the resource that does not exist anymore in the api server (because we have upserted it to v1?

@sebgl
Copy link
Contributor Author

sebgl commented Dec 2, 2019

If I remember correctly we only introduced this as an optimisation or convenience for users because we did run into issues with re-use of PVCs when recreating clusters with the same name, which is probably mostly a dev issue to begin with. So that sounds like a good option to consider?

To me the volume auto-removal is a real important feature not only for developers. Happy to re-discuss the need for having it if we think it's necessary (I'm strongly in favor of having it and making it the default). Using ownerReferences was an easy way to solve part of the problem but we can consider other ways of doing it.

I am not sure what the implications are here. Does the GC still work if the owner reference is pointing to a version of the resource that does not exist anymore in the api server (because we have upserted it to v1?

Yes.

@sebgl
Copy link
Contributor Author

sebgl commented Dec 3, 2019

#2191 (comment) seems to converge on ignoring apiVersions in SetControllerReference. Based on that, I feel like hardcoding the VolumeClaimTemplates ownerReference to v1beta1 for StatefulSets would not be that weird.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug Something isn't working v1.0.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants