PV.spec.ClaimRef.uid removed from PersistentVolume after reconciliation. #2250
Comments
I set up persistent volumes with Flux in a similar way, but I do not write any claimRef on the PersistentVolume itself. Instead, I set volumeName on the PersistentVolumeClaim, and I believe it is the cluster that fills out claimRef when it does the binding. I think it is usually considered an anti-pattern for a controller to write back to a resource's own spec, but as this is PersistentVolume in the canonical Kubernetes API, I wouldn't think myself able to put up a convincing argument for or against why it is this way. I think this is where I'm supposed to hold the phone up to my ear and accuse you of holding it wrong...
So you can see what I actually do and don't include, here is the example in context, as a source (the claimRef is notably absent from my persistent volume spec; compare to above, where the cluster has filled it in).
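The general shape is the following; every name, server address, and size here is a placeholder rather than anything from that repository, and NFS stands in only because that is what this issue involves:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""            # empty class keeps dynamic provisioners out of the way
  nfs:
    server: 192.168.1.10          # placeholder NFS server
    path: /exports/example        # placeholder export path
  # note: no claimRef here; the binding controller fills it in after the PVC binds
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-example-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ""
  volumeName: nfs-example-pv      # the PVC names the PV it should bind to
```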
I didn't find any manual for how to harvest this kind of manifest from an existing cluster; I just guessed at how, and if I remember correctly my first guess was exactly what you did. I remember running into a similar "Lost" behavior, although I am using local-path volumes, so I'm not sure it's exactly comparable. (Hope this helps!)
This thread may also help; it is from someone who specifically did use NFS, as I guess you are, and the thread asserts you can and should be able to set claimRef. I don't know for sure, and I'm not sure how helpful it is, but it is definitely relevant material: kubernetes/kubernetes#23615 (comment). It's not clear to me whether my way is correct or incorrect, but it hasn't failed me so far and I have been testing this approach for a while.
@kingdonb - Thank you! I'll give your approach a whirl. It seems simpler than what I'm doing, so if it works, I'll be happy to roll with it. FWIW, this was working flawlessly for months, so SOMETHING recently changed and caused the issue. I didn't notice it right away because the volumes initially show up as "Bound" and only become "Lost" after Flux reconciliation. Regardless, if I'm "holding it wrong" then I'm more than happy to try other approaches! I'm pretty new to all this and am always happy to learn more. I'll report back with my results.
@kingdonb - Thank you for the help. Your approach does in fact work, and I think it's cleaner than what I was doing. I'm happy. That said, I DO think that what I was doing is supposed to work, especially given the other issue you linked above (it sounds like it's basically about getting what I was trying to do working), and my volumes DO stay bound absent Flux reconciliation. This is true even if I … Thanks again!
I've observed the exact same behaviour on a setup similar to @a-mcf's. The fix above seems to work. I would venture that this behaviour might have been introduced in flux v0.24.1.
I'd like to reopen the issue so that someone sees it after the holiday, if it remains unexplained. However, due to a bug in GitHub right now, I can only close the issue (even though it is already closed!). Glad the issue is resolved for you 💭 👍
I opened an issue to follow this up. In writing up #2279 I got somewhat more ambitious: we should explore "what there is to know" about persistent volumes. I would be happy if we had a document that explained just what we covered in this issue.
If the behavior of Flux changed and it was a regression, let's address it with a separate issue. It can refer to this issue if the report here describes symptoms that are important to the regression, but I haven't seen any direct evidence of that yet, and this thread is unlikely to get much attention now; even if there is an issue, nobody appears to be blocked on it.
I'm seeing the same thing since I upgraded from Kubernetes 1.21 to 1.22 (without changing the Flux version), so I'm guessing this is something that is only present on 1.22+. One thing to note is that in my case the PVCs are created by Helm charts, and in most cases I haven't found a way to set the volumeName (I know that for some of the charts I can specify my own PVC, but in a lot of cases that's not possible). Until now it worked very well in my local env to have the Helm chart create a PVC that I would then just fill with my local PV.
@tarioch There are different behaviors with server-side apply from K8s 1.20.x to 1.23.x, and we've observed the variability between how Kubernetes behaves from one version to the next, and in turn how Flux behaves around it. There are a number of related issues that may (should, hopefully will) cause many of these edge cases to converge again when the Flux 0.26.0 release is out. Please stay tuned and keep us updated if you notice any difference when that release is out (it should be within a couple of days at most).

As for how to add volumeName to a volume that is created by a HelmRelease, that is a tricky issue for cases where the Helm chart did not expose any way to set it. The harbor chart is a good model (https://github.com/goharbor/harbor-helm/blob/73248da3f25b024d27d4f395aacd116fc6770cbe/templates/redis/statefulset.yaml#L86-L104 and https://github.com/goharbor/harbor-helm/blob/73248da3f25b024d27d4f395aacd116fc6770cbe/templates/redis/statefulset.yaml#L61-L70) because it exposes this through chart values.

For cases where the chart doesn't help you at all, YSK that you can still patch what the chart renders from the HelmRelease itself, for example with a post-renderer (see the sketch at the end of this comment). Here's the best example I could find of that case in particular: https://github.com/kingdonb/bootstrap-repo/blob/698d43dab8d8a1cc151ac16bc4a9e3c9a87139ae/apps/minio/stage/minio-helmrelease.yaml#L18-L36

In this case, the volume is an emptyDir in the base chart, and I wanted to make it a permanent/persistent volume to use for cluster storage (which I'm using to back my Harbor instance). I added a named persistent volume claim (you can create this however you want, use one that has been created by Helm, etc.) and ensured the PV reclaimPolicy was set to Retain. And that was it. I captured "cleaned up" versions of the manifests.

The details around that workflow in particular are what I was hoping to explain in #2279. Off the cuff, that's about what I think the document that closes that issue will have to cover in a bit more depth.
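A rough sketch of that post-renderer route, assuming a chart that renders a PVC whose volumeName you want to pin; the chart, repository, and resource names here are placeholders and not taken from the linked example:

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: example-app
  namespace: example
spec:
  interval: 10m
  chart:
    spec:
      chart: example-app            # placeholder chart name
      sourceRef:
        kind: HelmRepository
        name: example-repo          # placeholder repository
  postRenderers:
    - kustomize:
        patchesStrategicMerge:
          # Patch the PVC the chart renders so it binds to a pre-created static PV.
          - apiVersion: v1
            kind: PersistentVolumeClaim
            metadata:
              name: example-app-data      # must match the PVC name the chart produces
            spec:
              volumeName: example-app-pv  # the statically provisioned PV to bind to
```

The same mechanism can patch any other rendered field the chart does not expose as a value.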
Tried it out with 0.26.0; unfortunately it still fails. At the moment I'm no longer managing the PV with Flux (by adding kustomize.toolkit.fluxcd.io/reconcile=disabled); we'll see if I go the route with post renderers (it will be a big change, as I have to adjust lots and lots of configs).
Just ran into this same issue after upgrading to EKS 1.22, using Flux v0.29.3. Unfortunately, removing the claimRef as suggested above didn't help in our case. At the moment we're unable to manage these volumes with Flux, because after Flux reconciles them, Kubernetes decides there are multiple PVCs referencing the same PV, and all the PVCs change to status "Lost", rendering them unusable.

EDIT: To add a little more detail, we're doing what's described here under "A Simpler Approach", since we have a relatively small number of static persistent volumes and don't need to create them dynamically.
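For reference, static provisioning with the aws-efs-csi-driver in the volumeName style discussed above looks roughly like this; the filesystem ID, names, and storage class are placeholders, not our actual values:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-example-pv
spec:
  capacity:
    storage: 5Gi                          # required by the API; EFS itself is elastic
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc                # placeholder class
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-0123456789abcdef0    # placeholder EFS filesystem ID
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-example-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
  volumeName: efs-example-pv              # pre-binds the claim to the static PV
```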
Same issue here; it feels like a regression in Flux, because this worked without problems when the cluster was running 1.21, but I understand the complications.
I have tried a number of different CSI providers, and all of them seem to permit managing the PV and PVC statically without setting claimRef. In the PVC I set volumeName, and in the PV, which I captured from a running cluster and then cleaned up, I do not set claimRef at all.
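Roughly, the pair looks like this; all names are placeholders, csi.example.com stands in for whichever driver is in use, and the fields marked below are the ones the cluster populates on its own (the kind of thing to strip before committing a captured manifest):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: example-csi           # placeholder storage class
  resources:
    requests:
      storage: 20Gi
  volumeName: app-data-pv                 # the only binding hint set by hand
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-data-pv
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"   # cluster-populated: added by the binder
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: example-csi
  csi:
    driver: csi.example.com               # placeholder driver name
    volumeHandle: vol-0123456789          # placeholder volume handle
  claimRef:                               # cluster-populated: kube-controller-manager owns this
    kind: PersistentVolumeClaim
    namespace: default
    name: app-data
    uid: 00000000-0000-0000-0000-000000000000   # cluster-populated: the uid at the heart of this issue
status:
  phase: Bound                            # cluster-populated: status is never committed
```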
I have an example up here: https://github.com/kingdonb/github-actions-demo/tree/main/intermission/synology/persistence. In this example, Flux applies the PV and PVC alongside any pods being created that attempt to bind to the named PVC. It seems to work for me every time: no conflicts, as I'm not setting any fields that kube-controller-manager needs to own. If EFS doesn't support this process for attaching existing persistent volumes, then it must be different; I am not sure, because I don't have an EFS implementation handy that I can test with. I only have local-path storage, the Synology CSI, and Azure/GCP; I have tested all of these, and they seem to behave the same. Sorry you are experiencing an issue here and are apparently blocked on it. If you are not able to work past this on your own, could you open a new issue and describe the process you are running, so we can try to help you through it there?
Maybe the discussion can continue in #2456, as there seem to be others with similar issues (please check in there or open a new issue).
Thanks for the suggestions @kingdonb. I thought it was required to set the owner, but maybe I was wrong. I'll do some testing and report back in the other issue thread.
Describe the bug
I have several statically provisioned PV/PVCs. When the PV/PVCs are first applied, either by Flux or via kubectl, the PVCs show up as "Bound" and everything looks good. However, whenever Flux next reconciles the repository, the logs say that Flux detected a change, and the PVC is then reported as "Lost".
After digging around a bit, I figured out that spec.claimRef.uid is being removed from the PV. If I manually specify PV.spec.claimRef.uid with the value of the uid from the PersistentVolumeClaim, it survives reconciliation.
This behavior does NOT occur if I apply the configuration manually with kubectl.
Here is my configuration:
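Roughly, the shape is as follows (the PV name nfs-homer-pv matches the log excerpt further down; the namespace, claim name, server, path, and sizes are placeholders). The PV pre-binds to its claim via claimRef:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-homer-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.10              # placeholder NFS server
    path: /exports/homer              # placeholder export path
  claimRef:                           # statically pre-binds the PV to its claim
    kind: PersistentVolumeClaim
    namespace: default
    name: nfs-homer-pvc
    # uid is filled in by the cluster at bind time; Flux later strips it on reconcile
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-homer-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""                # empty class so the default provisioner doesn't claim this PVC
  resources:
    requests:
      storage: 10Gi
```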
Steps to reproduce
Run flux reconcile, or just wait for it to do it on its own.
Expected behavior
The dynamically added uid should not be removed from PV.spec.claimRef.uid when the repository is reconciled.
Screenshots and recordings
No response
OS / Distro
Ubuntu 20.04
Flux version
flux: v0.24.1
Flux check
► checking prerequisites
✔ Kubernetes 1.22.5+k3s1 >=1.19.0-0
► checking controllers
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.19.0
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.14.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.18.2
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.19.2
✔ all checks passed
Git provider
GitHub
Container Registry provider
n/a
Additional context
Here is a blurb from the logs. Most of the PVs here have already had the uid removed, and their associated PVCs are in a "Lost" state, which is why they are marked as unchanged. I was working with homer, so note the
"PersistentVolume/nfs-homer-pv":"configured"
in the JSON below.