No easy way how to update CSI driver that uses fuse #70013
@jsafrane I think that is the responsibility of the driver, in my opinion. Although we could definitely provide guidelines, we shouldn't provide a solution, since we cannot control their release or update processes. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale |
I hope a common solution for this issue can be found. One workaround: https://github.com/AliyunContainerService/csi-plugin/blob/master/docs/oss-upgrade.md |
Available solution from the Alibaba Cloud OSS plugin: Another solution: buddy DaemonSet pods (two DaemonSet pods that work as buddies). |
I hit a similar issue. Making the fuse driver standalone is one solution. See kubernetes/pkg/volume/csi/csi_attacher.go, lines 231 to 234 in e4a5012.
Is it possible to return an error only when it's not kubernetes/pkg/volume/flexvolume/detacher.go, lines 60 to 61 in e4a5012?
In that case, even if the fuse driver is broken, a fuse CSI driver that has remount logic could recover once the fuse driver is back. |
I worked out a PR (#88569) to mitigate this issue when the fuse driver on the node is restarted (that is, when the mount point is corrupted). Could someone take a look? Thanks. |
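To make the "mount point is corrupted" case concrete, here is a minimal sketch of how such a state can be told apart from an ordinary error. It is not the code from the PR. It assumes that a FUSE mount whose daemon has died answers `stat` with `ENOTCONN` ("Transport endpoint is not connected"), and it treats `ESTALE` and `EIO` the same way. The function name `probe_mount` and the injectable `stat_fn` parameter are hypothetical, added only for illustration and testing.

```python
import errno
import os

# Errno values this sketch treats as "the mount is corrupted".
# ENOTCONN is what a FUSE mount point typically reports once its daemon is gone.
# The exact set is an assumption for illustration.
CORRUPTED_ERRNOS = {errno.ENOTCONN, errno.ESTALE, errno.EIO}


def probe_mount(path, stat_fn=os.stat):
    """Classify a mount point as 'healthy', 'missing', or 'corrupted'.

    stat_fn is injectable so the logic can be exercised without a real mount.
    Any other OSError is re-raised, because it is a genuine failure.
    """
    try:
        stat_fn(path)
    except FileNotFoundError:
        return "missing"
    except OSError as err:
        if err.errno in CORRUPTED_ERRNOS:
            return "corrupted"
        raise
    return "healthy"
```

A caller that gets `"corrupted"` could go on to unmount and remount instead of failing outright, which is the recovery path discussed in the comment above.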
/open |
/reopen |
@Ark-kun: You can't reopen an issue/PR unless you authored it or you are a collaborator. |
/reopen |
@andyzhangx: Reopened this issue. |
@jsafrane: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. |
We implemented a solution: transferring file descriptors (fds) between CSI pods during a surge rolling update. |
@jim3ma do you have the details of the solution? Thanks. |
Currently, this solution is in use at Ant Group; we will open source it in Dragonfly Image Service some time later. Another solution: the CSI container starts the fuse session in a separate container on the host, outside the pod, so we can update the CSI pod while keeping the fuse session container running. This is like the solution from the Alibaba Cloud OSS plugin, but within our control. |
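The fd-transfer idea can be illustrated with the standard Unix mechanism for handing an open file descriptor to another process: an `SCM_RIGHTS` message over a Unix domain socket. The sketch below is not the Ant Group implementation, which had not been published at the time of this thread. It uses a pipe as a stand-in for the FUSE device fd, keeps both "pods" in one process via a socket pair, and the helper names `send_fd`, `recv_fd`, and `demo` are hypothetical. It requires Python 3.9+ on a Unix system.

```python
import os
import socket


def send_fd(sock, fd):
    """Send one open file descriptor over a Unix domain socket (SCM_RIGHTS)."""
    socket.send_fds(sock, [b"fd"], [fd])


def recv_fd(sock):
    """Receive one file descriptor; the kernel installs it under a new number."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def demo():
    """Hand a pipe's read end from the 'old' side to the 'new' side."""
    old_side, new_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        send_fd(old_side, read_end)
        received = recv_fd(new_side)
        # The old holder drops its copy; the open file itself stays alive
        # because the receiver still holds a descriptor for it.
        os.close(read_end)
        os.write(write_end, b"still open")
        data = os.read(received, 64)
        os.close(received)
        return data
    finally:
        os.close(write_end)
        old_side.close()
        new_side.close()
```

The property that matters for a rolling update is the one the demo exercises: closing the sender's copy does not tear down the underlying open file, so a new pod that has received the fd could in principle keep serving the existing session.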
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community. |
Rotten issues close after 30d of inactivity. Send feedback to sig-contributor-experience at kubernetes/community. |
@fejta-bot: Closing this issue. |
Did you open source it already? |
@xhejtman Azure Blob CSI driver uses similar fuse proxy solution, and it's open source, check details here: https://github.com/kubernetes-sigs/blob-csi-driver/tree/master/deploy/blobfuse-proxy |
Thanks. I was also interested in a truly restartable fuse driver. At least I understand what @jim3ma is talking about: tracking everything fuse needs for a restart. |
We are trying to merge some code into the upstream fuse driver so that this solution also handles some corner cases. Someday we will open source our solution as a real project. |
@jim3ma Is it open source now? |
We recommend using a DaemonSet to run CSI drivers on nodes. If a driver runs a fuse daemon, it's almost impossible to update it: killing the driver pod kills the fuse daemons too, which kills all mounts and possibly corrupts application data.
We need a documented and supported way to update such CSI drivers. Note that the update process can be manual, or the code can live somewhere else; we just need it to be documented and supported so people don't lose data.
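As one illustration of what a manual update procedure might check first, the sketch below lists the FUSE mount points on a node by parsing text in the `/proc/mounts` format (device, mount point, filesystem type, options, and so on). It assumes that FUSE filesystems appear with type `fuse`, `fuseblk`, or `fuse.<name>`, and that the `fusectl` control filesystem should be ignored. The function name `fuse_mount_points` is hypothetical, and this is only a pre-flight check, not a supported update mechanism.

```python
def fuse_mount_points(mounts_text):
    """Return the mount points of FUSE filesystems found in /proc/mounts text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        fstype = fields[2]
        # "fusectl" is the kernel's control filesystem, not a user mount.
        if fstype in ("fuse", "fuseblk") or fstype.startswith("fuse."):
            points.append(fields[1])
    return points
```

On a node this could be called as `fuse_mount_points(open("/proc/mounts").read())`. An empty result would suggest that restarting the driver pod cannot break any application mount on that node.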
/sig storage
@msau42 @davidz627 @saad-ali @pohly @vladimirvivien @verult @lpabon @jingxu97 @gnufied