Two PVC-s bound to the same PV #2456
Comments
It's not Flux that manages pods, but the Kubernetes Deployment and ReplicaSet controllers. PVCs should be used with StatefulSets; Kubernetes knows how to handle the rolling upgrade accordingly. If you want to use a PVC with a Kubernetes Deployment, set the strategy type to Recreate, which deletes the old pod before creating the new one.
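For illustration, here is a minimal sketch of the pattern described above: a Deployment with strategy type Recreate mounting a PVC-backed volume. The resource names, image tag, and claim name are illustrative placeholders, not taken from this issue.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo                                     # illustrative name
spec:
  replicas: 1
  strategy:
    type: Recreate                                  # the old pod is deleted before the new one is created
  selector:
    matchLabels:
      app: podinfo
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
        - name: podinfo
          image: ghcr.io/stefanprodan/podinfo:6.0.0 # illustrative image tag
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: podinfo-pvc                  # illustrative claim name

With RollingUpdate (the default), the new pod would be created before the old one is removed, so both pods briefly claim the same volume, which is the ordering problem this issue describes.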
Thank you Stefan. Sorry that I didn't do enough exploration before opening the issue.
Hello Stefan, sorry for bugging you again and re-opening the issue. In spite of the deployment strategy, a normal Kubernetes image upgrade works perfectly fine; I think that Flux is doing something unusual compared to the standard rollout. Could you comment? Sorry for bothering you again.
Same here: using a PVC with a deployment (the combination is unfortunately not under our control) leads to a lost PVC.
+1 from me for reports of this in the wild. Really happy with Flux generally, thanks for making a cool bit of software! As for this issue specifically: we are also experiencing something similar, immediately after having done a rolling update of our kubernetes cluster.
@kingdonb, RE: your suggestion in #2250, you were absolutely correct: thanks a bunch, this solves our issue completely. @IvanKuchin, if you remove the claimRef from your PV, the problem should go away for you too.
I am using local-path PVs and I want to prevent PVCs from namespaces other than the one I configured from being bound to them; that's why I was using claimRef. I don't really see an alternative for my problem though 😕
@Robbilie, would a selector not work for some reason?
I can pin the PVC to a PV (just by setting the volume name, it doesn't even need the selector), but I want to enforce that only this one PVC is allowed to be bound to the PV. The claimRef is on the PV side, so no other PVC is allowed to be bound to it. The volume name/selector on the PVC side doesn't prevent this. This way, if I allow someone to create PVCs, they can simply bind to the PV and write to the local-path volume they shouldn't have access to.
Kubernetes itself enforces that though. From the docs:
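For reference, a minimal sketch of the pre-binding pattern under discussion: the PV reserves itself for one specific claim via claimRef, and the PVC pins itself to the PV via volumeName. All names here are illustrative. Note that, per the rest of this thread, Flux's server-side apply may still conflict with claimRef fields written by kube-controller-manager unless merge behaviour is opted into.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: reserved-pv                  # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  claimRef:                          # reserves this PV for exactly one claim
    name: reserved-claim             # both the claim name and namespace are specified
    namespace: team-a
  hostPath:                          # illustrative backing store
    path: /mnt/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: reserved-claim
  namespace: team-a
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: reserved-pv            # pins this claim to the reserved PV
  resources:
    requests:
      storage: 10Gi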
Yeah, but what if there are two PVCs? Isn't that a race condition, so it's possible the "malicious" one binds first?
Well yeah I guess so, but if that's really the situation you've likely got bigger problems than PVC binding. Regardless, @kingdonb, after reviewing the docs, I think Flux's behavior here is still a bug of some sort. According to the docs:
This is essentially the situation that @Robbilie is talking about, and it suggests that the field is not for control-plane usage only, so it follows that Flux should be able to handle this situation. That said, removing the claimRef did work around the issue for us.
@jmriebold Thanks for finding that! I did not know; maybe this is an issue we need to handle somehow. You might try the ssa merge annotation.

In general the disposition of Flux has changed from "will merge with externally provided configuration" to "in general we override it, unless you specifically opt in to merging behavior, or if it is from a known cluster actor". Well, in this case it is a known cluster actor, so it seems possible that the bad behavior is outside of Flux, if you should indeed be able to set claimRef this way.

I think someone who is affected by this is going to have to try to figure it out and tell us what needs to change. I haven't found any situation in my own clusters where I need to set up volumes on the same cluster for tenants that do not trust each other but who all should be able to create volumes from the same pool. I would probably try to separate them a bit harder, with a storage class that has a separate configuration per user and admission controllers to block users from selecting PVs in storage classes that don't belong to them, but this might not be possible with every storage provider.
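For anyone wanting to try the opt-in merge behaviour mentioned above, here is a sketch of the annotation applied to the PV that appears later in this thread. The annotation value shown matches the kustomize-controller docs of roughly this Flux version, but check the docs for your release before relying on it.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: flower-pv
  annotations:
    # ask Flux's kustomize-controller to merge with, rather than override,
    # fields written by other cluster actors such as kube-controller-manager
    kustomize.toolkit.fluxcd.io/ssa: merge
spec:
  capacity:
    storage: 100G
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  gcePersistentDisk:
    fsType: ext4
    pdName: foyer-pd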
@kingdonb thank you for the workaround. I'm out of the office now; please give me a few days to test it out.
@jmriebold would you expect this to work if we were using the
@daogilvie are you using both
Thanks for replying so quickly! We are using only
We were using just
I'll report back if the selector solves the problem for us or not 👍
Just
Yes, we expected it to work too 😄
Using selectors doesn't work for us, but in a new and different way. When the following input is provided to flux in the form of a kustomization overlay (just the relevant PV/PVC shown here):
---
apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
app_version: 2022.4.2
labels:
app_name: foyer
generator: kustomize
vol: flower-pv
name: flower-pv
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 100G
claimRef:
namespace: default
gcePersistentDisk:
fsType: ext4
pdName: foyer-pd
storageClassName: ""
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
app_version: 2022.4.2
labels:
app_name: foyer
generator: kustomize
name: flower-pv-claim
namespace: default
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100G
selector:
matchLabels:
vol: flower-pv
storageClassName: ""
volumeName: flower-pv
---
...we now get the pods and PVC stuck in "Pending", where the pods aren't scheduled onto a node because the claim won't bind, and the claim won't bind because:
And when we go to look at this "already bound" volume itself, it has the status of
I had run a
Please let me know what information you might want/need here, but we are going to go back to our workaround, which is manually creating the PV/PVC with a direct apply, then having flux manage everything else.
Now I see what you're doing is not what I expected:
Is your intention here to enforce that the claim must come from the named namespace? I do not think you can count on this behavior to work the way you want; I've never seen it documented to work as you describe. If you were going to specify a claimRef, I would expect it to name a specific claim rather than just a namespace.
Hi @kingdonb — thank you for this! I had missed that, as it is actually added by a Kustomization config and is not included in the main base YAML that I've been editing. I'll try this whole thing again having taken that out — we found that we needed it for the claimRef method to work previously, when we were using that as the means of binding.
Thank you @kingdonb, @jmriebold. Now that I have removed that claimRef kustomization, using the selector works for us.
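To summarize the arrangement that ended up working here, a minimal sketch based on the manifest posted above: the binding is expressed entirely on the PVC side with volumeName (plus an optional label selector), and the PV carries no claimRef for Flux's server-side apply to fight over. Field values are copied from the earlier example and are illustrative.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: flower-pv
  labels:
    vol: flower-pv
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 100G
  storageClassName: ""
  gcePersistentDisk:
    fsType: ext4
    pdName: foyer-pd
  # no claimRef here: the binding is declared on the PVC below
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flower-pv-claim
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100G
  storageClassName: ""
  volumeName: flower-pv              # pins this claim to the PV above
  selector:
    matchLabels:
      vol: flower-pv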
@kingdonb I created a PV and PVC with claimRef on the PV and the ssa merge annotation, and it's lost too :/
If your claimRef does not match exactly what controller-manager decides to do with the PVC, then it's probably going to mark the PVC as lost.

@Robbilie We have a number of reports here from people who had this issue and were able to resolve it in Flux, but the behavior can be different from one Kubernetes CSI environment to another. Do you have more information about your issue so we can see what's gone wrong? It might be good to open a separate issue and provide the details there. Please give as much information as you can: what cloud provider, and what specific configuration (all YAMLs needed to trigger the issue in a minimal set) causes the volume to be lost.

I am skeptical that there is a Flux issue requiring any code change here, given how many people seem to have resolved it for themselves, but that does not mean we don't have a UX issue worth addressing with a doc that is super clear about how you should handle PVs and PVCs in Flux when working with stateful workloads. It's worth documenting at least (and if some CSIs provide a different behavior than others, we should try to capture the important parts for our users in a doc).
I don't think anyone got it to work while still keeping the claimRef, no? The solution was to drop the claimRef on the PV, so the original issue persists, I believe… I am using RKE2 on Ubuntu 20.04 nodes and these resources:
@Robbilie I don't think anyone else needed to set
I noticed you set
If so, it's not too surprising if your PVs go to the Lost phase. PVCs need to be fully defined except for:
... except for

There may be something in common between your issue and the original issue reported here, but if you want to follow it up, it's better if you create a new issue. I'm not sure if any CSI requires the user to set

I don't want to spam the original poster of this issue if we're now talking about a different issue. I'm going to close this one, but we can reopen it if the original poster is still dealing with it. In the meantime, I'd ask that anyone who is still struggling with this please open a separate issue; feel free to link back to this issue for context if applicable.
Describe the bug
Hello team,
The reconciliation process creates the new pod before deleting the old one. If the pod has a PVC in its volumes section, that ordering creates a double claim on the same PV.
IMO the order of operations should be the reverse: delete the old pod first, then create the new one.
Steps to reproduce
The easiest way to reproduce is to follow the "Automate image updates to Git" guide, with the following addition to podinfo-deployment.yaml.
Step 1) Add a PV/PVC and attach the volume to the pod.
If it is confusing, the full manifest is here.
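The reporter's full manifest is not reproduced here, but a minimal sketch of the kind of addition step 1 describes might look like the following; the hostPath backing and all names are placeholders, not the actual manifest from this report.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: podinfo-pv                   # illustrative name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  hostPath:
    path: /mnt/podinfo               # illustrative local path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: podinfo-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: podinfo-pv
  resources:
    requests:
      storage: 1Gi

The claim is then referenced from the volumes section of the pod template in podinfo-deployment.yaml, as in the Deployment sketch near the top of this thread.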
Change image version to trigger deployment reconciliation
Observe the problem.
The PVC will get to the Lost state.
The PV will get to the Available state.
The reason for that is the order of pod update operations.
Expected behavior
Successful image update even with PV/PVC attached to the pod
Screenshots and recordings
No response
OS / Distro
20.04.3 LTS (Focal Fossa)
Flux version
flux version 0.27.0
Flux check
$ flux check
► checking prerequisites
✔ Kubernetes 1.22.6-3+7ab10db7034594 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.17.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.22.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.16.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.20.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.21.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.21.2
✔ all checks passed
Git provider
No response
Container Registry provider
No response
Additional context
No response
Code of Conduct