CSI: discard claims on GC'd nodes #13264

tgross · 2022-06-06T23:55:12Z

There's no way for the server to send a node unpublish command to the node plugin that's running on a down node. What's becoming clear here is that the CSI specification as written cannot guarantee correct behavior in the case of downed nodes, so what K8s apparently does is just give up and let you potentially corrupt your data in the multi-writer volume use case. User expectations seem to agree, so we may need to rework the design here to make it possible to discard claims for lost nodes and just assume that they're never coming back. This is why we built stop_after_client_disconnect, but maybe we'll need to enforce that setting for CSI-claiming tasks unless specifically disabled by the user.

We'll alter the claim Unpublish workflow to skip the Node Unpublish RPCs for any node that is marked as lost, disconnected, or nil (GC'd). Just log the fault and send the controller RPCs, if any.
If users want correct behavior on the clients, they'll include stop_after_client_disconnect. We should consider making this the default behavior for allocations that include CSI volume claims, but this might cause us backwards compatibility grief.
In later work as a performance optimization, we should also update the volumewatcher to watch for updates to the Node table. When a node is GC'd or marked as lost, emit a volume reap on any volumes that have claims on that node.
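
A rough sketch of the two decisions described above, in Go. Everything here is hypothetical and for illustration only: the types (`NodeStatus`, `Node`, `Claim`, `UnpublishPlan`), the status constants, and the functions `planUnpublish` and `volumesToReap` are not Nomad's actual code or state store API. It assumes a GC'd node shows up as a nil lookup result.

```go
package main

// NodeStatus is a hypothetical stand-in for the node states named above.
// These are not Nomad's real status constants.
type NodeStatus string

const (
	NodeReady        NodeStatus = "ready"
	NodeLost         NodeStatus = "lost"
	NodeDisconnected NodeStatus = "disconnected"
)

// Node is a minimal, hypothetical view of a client node.
type Node struct {
	ID     string
	Status NodeStatus
}

// Claim is a minimal, hypothetical view of a volume claim held by a node.
type Claim struct {
	VolumeID string
	NodeID   string
}

// UnpublishPlan records which RPCs the unpublish workflow should send.
type UnpublishPlan struct {
	SendNodeUnpublish       bool
	SendControllerUnpublish bool
	Reason                  string // logged when the node RPCs are skipped
}

// planUnpublish skips the node unpublish RPCs when the node is nil (GC'd),
// lost, or disconnected, and still sends controller RPCs if the plugin
// has a controller.
func planUnpublish(node *Node, hasController bool) UnpublishPlan {
	plan := UnpublishPlan{
		SendNodeUnpublish:       true,
		SendControllerUnpublish: hasController,
	}
	switch {
	case node == nil:
		plan.SendNodeUnpublish = false
		plan.Reason = "node was garbage collected"
	case node.Status == NodeLost:
		plan.SendNodeUnpublish = false
		plan.Reason = "node is lost"
	case node.Status == NodeDisconnected:
		plan.SendNodeUnpublish = false
		plan.Reason = "node is disconnected"
	}
	return plan
}

// volumesToReap returns the IDs of volumes with at least one claim on a node
// that is GC'd (missing from the map) or lost. IDs are deduplicated and
// returned in the order they first appear in claims.
func volumesToReap(claims []Claim, nodes map[string]*Node) []string {
	seen := map[string]bool{}
	var out []string
	for _, c := range claims {
		n := nodes[c.NodeID] // nil when the node no longer exists
		if n != nil && n.Status != NodeLost {
			continue
		}
		if !seen[c.VolumeID] {
			seen[c.VolumeID] = true
			out = append(out, c.VolumeID)
		}
	}
	return out
}
```

In this sketch a caller would log `Reason` whenever `SendNodeUnpublish` is false and then carry on to the controller RPCs, so the claim can still be released even though the node plugin is unreachable.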

tgross · 2022-06-08T15:59:25Z

Draft branch for the unpublish workflow changes is csi-discard-claims-on-gcd-nodes but this needs a lot of testing and probably some rework of whether we're snapshotting the state store for the entire CSIVolume.Unpublish RPC.

Update: I've bench-tested this out on a real cluster to my satisfaction. Going to follow up with some automated testing to probe out the edge cases and then open a PR.

github-actions · 2022-10-08T02:35:40Z

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

tgross added type/enhancement theme/storage labels Jun 6, 2022

tgross self-assigned this Jun 6, 2022

tgross mentioned this issue Jun 6, 2022

CSI error releasing volume claims #12346

Closed

tgross added the hcc/cst Admin - internal label Jun 6, 2022

tgross mentioned this issue Jun 8, 2022

CSI: skip node unpublish on GC'd or down nodes #13301

Merged

tgross closed this as completed in #13301 Jun 9, 2022

tgross mentioned this issue Jun 17, 2022

Allocs using CSI stuck in pending after terminating client node #13416

Closed

hc-github-team-nomad-core mentioned this issue Aug 23, 2022

Backport of CSI: skip node unpublish on GC'd or down nodes into release/1.3.x #14240

Merged

github-actions bot locked as resolved and limited conversation to collaborators Oct 8, 2022

tgross added this to Nomad - Community Issues Triage Jun 24, 2024

tgross moved this to Done in Nomad - Community Issues Triage Jun 24, 2024
