Restart of kubelet leads to duplicated mount entries #1007
It seems like neither our NodeStage nor NodePublish is idempotent. NodePublish not being idempotent is a known issue (#955); for NodeStage I don't understand why yet, but I will try to reproduce it.
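For context, idempotency here means that a repeated NodeStageVolume call should detect an existing mount and return success instead of mounting again. Below is a minimal sketch of such a guard, assuming k8s.io/mount-utils; `stageVolume` and its parameters are purely illustrative, not the driver's actual implementation.

```go
// Hypothetical sketch of an idempotency guard for NodeStageVolume,
// assuming k8s.io/mount-utils. Not the driver's real code.
package main

import (
	"fmt"

	mount "k8s.io/mount-utils"
	utilexec "k8s.io/utils/exec"
)

// stageVolume formats and mounts devicePath at stagingPath, but first checks
// whether stagingPath is already a mount point so that a repeated
// NodeStageVolume call (e.g. after a kubelet restart) becomes a no-op
// instead of adding a second mount entry.
func stageVolume(devicePath, stagingPath, fsType string) error {
	mounter := mount.NewSafeFormatAndMount(mount.New(""), utilexec.New())

	notMnt, err := mounter.IsLikelyNotMountPoint(stagingPath)
	if err != nil {
		return err
	}
	if !notMnt {
		// Already staged: return success without mounting a second time.
		fmt.Printf("%s is already a mount point, skipping\n", stagingPath)
		return nil
	}
	return mounter.FormatAndMount(devicePath, stagingPath, fsType, nil)
}

func main() {
	// Illustrative paths only.
	err := stageVolume("/dev/nvme1n1",
		"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/example/globalmount", "ext4")
	if err != nil {
		fmt.Println("stage failed:", err)
	}
}
```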
/assign nirmalaagash
@dguendisch I am not able to reproduce the issue. Can you give me some more details?
I'm happy to give more detail, but from the comment I can't tell which part you are not able to reproduce. Did you apply all the steps from my reproduction details and not see a duplicate mount, or what exactly do you mean, @nirmalaagash?
@dguendisch I tried all the steps that you mentioned in the issue, but I am not able to see a duplicate mount.
Driver version:
Deployed the web.yaml with one replica and shelled into the node where the volume is attached.
Restarted the kubelet on the same node and tried running the command again. (After this step, I also restarted the kubelet on the master and tried running the command on the worker node.)
Let me know if I missed any of the steps you performed.
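In case it helps with reproduction, a standalone check like the following could be run on the node to spot duplicated entries; it simply counts how often each mount point appears in /proc/self/mountinfo. This is a hypothetical helper, not part of the driver.

```go
// Hypothetical helper: report any mount point that appears more than once
// in /proc/self/mountinfo on the current node.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	counts := map[string]int{}
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) > 4 {
			counts[fields[4]]++ // the fifth field is the mount point
		}
	}
	for mountPoint, n := range counts {
		if n > 1 {
			fmt.Printf("%s appears %d times\n", mountPoint, n)
		}
	}
}
```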
Are you sure that you're actually using CSI? Your mount location
@dguendisch it seems like the version v1.1.0 that you're using doesn't have this fix. Would you be able to re-check with a newer driver version, like v1.1.1?
I just tried with the newer version you suggested.
State before kubelet restart:
kubelet restart triggered:
State afterwards:
csi-driver logs with `v5`
Regarding ... As for ...
@dguendisch I am able to reproduce the issue now. Working on it in #1019.
/kind bug
What happened?
Having a pod (from a StatefulSet) with a PV running on some node, restarting the kubelet on that node suddenly duplicates the mount points.
I could reproduce this 100% of the time with aws-ebs-csi-driver 1.1.0 (independent of the k8s version).
No mount points are duplicated on e.g. GCP or Azure (with their respective CSI drivers).
What did you expect to happen?
No duplicated mountpoints upon kubelet restarts.
How to reproduce it (as minimally and precisely as possible)?
systemctl restart kubelet
Anything else we need to know?
These duplicated entries lead to all kinds of weird problems. For example, terminating the respective pod never finishes: kubelet triggers an unmount, the CSI driver unmounts successfully (but this removes only one of the duplicated entries; the other pair remains and therefore "leaks"), kubelet then triggers a detach, and subsequently kubelet's GetDeviceMountRefs check fails, so pod termination does not proceed. If the user now force-deletes the pod, the volume will next attach at nvme1n2 instead of nvme1n1, yet kubelet reuses the stale globalmount, which now points to a non-existing device.
Environment
Kubernetes version (use `kubectl version`): 1.20.9 (the issue is independent of this; also tried with 1.21.3 with the same effect)
cc @timuthy @vpnachev @ialidzhikov
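For illustration only, the sketch below approximates the kind of device-reference check described above using GetMountRefs from k8s.io/mount-utils; it is not kubelet's actual code, and the globalmount path is whatever you pass on the command line. Any leftover duplicate bind mount keeps the reference list non-empty, which is why unmount/detach stalls.

```go
// Rough illustration of a GetDeviceMountRefs-style check: if the staged
// device path still has other mount references, the device is considered
// in use and detach is blocked. Not kubelet's real implementation.
package main

import (
	"fmt"
	"os"

	mount "k8s.io/mount-utils"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checkrefs <device-mount-path (e.g. the PV's globalmount dir)>")
		return
	}
	deviceMountPath := os.Args[1]

	mounter := mount.New("")
	refs, err := mounter.GetMountRefs(deviceMountPath)
	if err != nil {
		panic(err)
	}
	if len(refs) > 0 {
		// A remaining reference (e.g. the leaked duplicate mount entry)
		// means kubelet will refuse to proceed with unmount/detach.
		fmt.Printf("device still referenced by: %v\n", refs)
		return
	}
	fmt.Println("no remaining references; safe to unmount the device")
}
```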