
Clarify staging and target paths for blocks #335

Conversation

@bswartz (Contributor) commented Nov 14, 2018

The vague wording around the staging and target paths left open too
many possible implementations for raw block volumes, and implementors
ended up doing non-obvious things. This change clarifies the contract
between the CO and the SP when staging and publishing raw block volumes
on the node.

@lpabon left a comment:

Awesome 👍

@bswartz force-pushed the clarify-target-staging-paths branch 2 times, most recently from d638e75 to af47f95, on November 14, 2018 19:05
@saad-ali (Member) left a comment:

LGTM
Approve

@jdef (Member) commented Nov 14, 2018

related to #287

@jdef (Member) commented Nov 14, 2018

I'm not crazy about the specificity of this PR, though I understand that it does help to clarify things for SP authors. I'm hopeful that we can move to something more generic (in a non-breaking way) once we support something like what's been proposed in #96, where the CO has a bit more control over mount operations.

@bswartz (Contributor, Author) commented Nov 14, 2018

@jdef I'm trying to understand the problems that the solution proposed in that issue is trying to solve. The discussion seemed to argue for what we have now, which is node plugins that do the mounts themselves. The only specific use case I could find (sshfs) strongly favors the current design. What would be gained by modifying the publish interface to offer another way?

@saad-ali (Member) commented:

@bswartz this is related to another discussion where someone proposed potentially allowing the CO to control the final mount. See #96 (comment)

That would be an additive change we can consider in the future (post v1). I believe @jdef is arguing that would be a way to introduce the flexibility that is potentially lost with this PR.

@jdef (Member) commented Nov 14, 2018

@bswartz AFAICT the proposal in #96 would push CSI in the direction of the CO executing the mount syscall vs. the plugin. Also suggested by the OP of the proposal was that the same mechanism could be extended to support more exotic "mounting" of volumes, such as what might be needed for VM disk images - something that #166 would like to see supported as well.

Current thinking is that we could probably make backwards-compatible changes to CSI, post-v1, such that the staging_target_path and target_path mechanism could be the default/fallback, and that we could add a new "capability" somewhere that lets the CO discover plugin support for more exotic/flexible volume "mounting".

The specificity of this PR, as it's currently written, cuts both ways: it probably simplifies things for most plugins that will be written against CSI v1.0.0; it also dictates bits of implementation and I think that's where the CO owners have expressed concern. Ideally, we'd avoid over-specifying implementation in the CSI spec as long as the interface between plugin and CO is clear (which is also the point of contention here, right?). From the community discussion today, I'm not 100% convinced that this PR is the best, long-term approach.

@bswartz (Contributor, Author) commented Nov 14, 2018

@saad-ali I'm just trying to figure out why anyone would want to do that. What's the advantage of allowing the CO to control the final mount vs the status quo?

Also, I don't see how the version of the spec prior to my change makes it any easier to do that. The existing spec provides NO mechanism for the SP to return anything to the CO on publish, so it's implied that something like what we currently do is what MUST be done.

In the future we can add an optional mechanism to return something that the CO could use to perform the mount, but SPs could never assume the CO would want to use that mechanism and therefore would have to support the current model of mounting inside the SP as the default. This language change doesn't make an addition like that any more difficult.

spec.md (outdated):
// serving the request has `read` and `write` permissions to the path.
// For volumes with an access type of block, the path MUST be a file.
// For volumes with an access type of mount, the path MUST be a
// directory. The SP SHALL bind mount the volume to the target path.
@jdef (Member):

do we really have to say "bind mount" here? Isn't it enough to say "mount"?

@bswartz (Contributor, Author) Nov 14, 2018:

@jdef The problem I think many people have is that the word "mount" traditionally implies what happens when you add a new filesystem into the VFS at some directory. For block volumes, we've opted to actually bind mount a single device file. There are lots of alternatives we could have chosen, including symlinks or returning a string. Presumably there is some advantage to the bind-mount approach (I wasn't around when this decision was made), or perhaps the spec was left so vague that Kubernetes just chose to use bind mounts rather than the other alternatives and that became the de facto way to do it. This kind of open-endedness is poison for specifications because it makes it very easy for two different implementations that both conform to the spec to not interoperate with each other.

I could change the wording to make it clear that the bind mount is only a requirement for block access types, and that an ordinary mount is fine for mount (filesystem) access types.
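To make the alternatives concrete, here is a minimal sketch (function and path names are illustrative, not from the spec) of the two ways an SP might expose a staged block device at the publish target path. The wording under review mandates the bind-mount approach for block access type volumes.

```python
def publish_block_commands(device_path: str, target_path: str,
                           use_bind_mount: bool = True) -> list:
    """Return the commands a node plugin might run to publish a block volume."""
    if use_bind_mount:
        # Bind mount the single device file onto the CO-created target file.
        return [["mount", "--bind", device_path, target_path]]
    # One alternative mentioned in the thread: a symlink at target_path
    # (not what Kubernetes ended up doing).
    return [["ln", "-s", device_path, target_path]]
```

Either sequence puts a usable block device node at target_path, which is exactly why the spec has to pick one: an SP and CO that silently assume different mechanisms will not interoperate.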

Member:

OK, I understand where you're coming from. Again, this is where the proposal gets a bit too specific with its requirements of a plugin. I'd like to see just a tiny bit of flexibility here.

Can you replace "SHALL bind mount" with "SHOULD bind mount"?

Member:

FWIW there are multiple ways to interpret "bind mount", so I'm not sure that there's much actually gained by keeping the word "bind".

Fun read: https://unix.stackexchange.com/questions/198590/what-is-a-bind-mount

Member:

I'm still having a hard time getting behind this change request for NodePublishVolume. If a plugin doesn't support node-stage and simply wants to mknod a block device at target_path then this proposal blocks that behavior. Forcing a plugin to mknod elsewhere and then to "bind mount" it to a file that the CO is creating here seems like a mistake.

I think we should keep more neutral wording here: the CO SHALL NOT create a file system object at target_path. Plugin implementation recommendations should be made elsewhere in the spec and should avoid requirement wording like SHALL and MUST.

@bswartz (Contributor, Author):

The word SHOULD implies to me that you can choose not to do something and there are still some scenarios where that can lead to a good result. SPs must do something at the target path to ensure a good result. We owe it to SP implementors to either enumerate all the valid things an SP can do or describe the end result in such a way that at least a few of the valid approaches are obvious.

@bswartz (Contributor, Author):

I'm with you on the fact that there are valid alternatives that can work. Doing a mknod at the target path makes perfect sense to me, but then we have to change the wording to say the CO MUST NOT create anything there, which is the opposite of what Kubernetes has done and will require changing Kubernetes again (not impossible, just annoying).

@bswartz (Contributor, Author):

I could go either way on this, @saad-ali do you feel strongly? The one outcome I don't want is to leave it unspecified which side should create the file/device because then a de-facto standard of "whatever works with kubernetes" will emerge.

Member:

I have a slight preference to leave the spec loose enough for alternate implementations.

We owe it to SP implementors to either enumerate all the valid things an SP can do or describe the end result in such a way that at least a few of the valid approaches are obvious.

I'd prefer the latter.

@jdef (Member) commented Nov 14, 2018

That would be an additive change we can consider in the future (post v1). I believe @jdef is arguing that would be a way to introduce the flexibility that is potentially lost with this PR.

To be clear, I don't think we lose flexibility to do something for #96 with the introduction of this PR. I was trying to say that we still have an escape hatch if we find out (as I think we will) that this PR is too restrictive for all the use cases that we want to support.

@saad-ali (Member) commented Nov 14, 2018

Let's stay focused on this PR. We can discuss the merits of #96 on that issue.

@bswartz (Contributor, Author) commented Nov 14, 2018

@jdef Okay I think I understand. As long as we agree that this PR doesn't make the problem you want to solve any harder. The staging path stuff shouldn't affect that future enhancement because it's really just a clarification of what SPs are already depending on, and doesn't impose a new burden on COs. The publish target path stuff could all be overridden by a hypothetical future enhancement to the node publish call.

@julian-hj (Contributor) left a comment:

I agree with @jdef that bind mount is too specific, and is not always correct. For example, in the case where the plugin is mounting a share from a remote NAS, and does not choose to implement NodeStageVolume, there is no requirement to use a bind mount. So I'd like to see bind mount replaced with mount or some additional qualification to make it apparent that we only require bind mounts for block access.

@bswartz (Contributor, Author) commented Nov 14, 2018

@julian-hj For filesystem (mount access type) volumes I absolutely agree, and will change the wording. For block access type volumes, I'm okay loosening the language as long as it's still clear what the CO should or should not create, so it's clear what's left for the SP to do.

@bswartz (Contributor, Author) commented Nov 14, 2018

@lpabon I'm also curious about your opinion on the CO not creating the "file" that gets bind mounted and instead requiring the SP to create a file before bind mounting the path, if it wants to bind mount rather than doing something like mknod. In this case we could simply require that the CO ensures the parent directory exists. I can't imagine any problems with that approach, but @mkimuram will need to modify his kubernetes patch to accommodate that.

@mkimura commented Nov 14, 2018 via email

@saad-ali (Member):

@mkimuram not mkimura

@bswartz (Contributor, Author) commented Nov 14, 2018

Apologies for misspelling the name

@mkimuram commented:
@bswartz @lpabon

I'm also curious about your opinion on the CO not creating the "file" that gets bind mounted and instead requiring the SP to create a file before bind mounting the path, if it wants to bind mount rather than doing something like mknod.

If you are talking about NodePublishVolume/NodeUnpublishVolume, won't this change affect the assumption that there is no difference in NodeUnpublishVolume logic between block and mount access types? I understand that we originally assumed that NodeUnpublishVolume requires just unmounting the target path in both cases. However, if NodeUnpublishVolume is required to do something other than unmount, it might need to know which access type was requested, which is not passed to NodeUnpublishVolume, to decide what to do with the path. (The decision to choose something other than mount/unmount is up to the SP, so it could be at their own risk, though.)

@bswartz (Contributor, Author) commented Nov 14, 2018

@mkimuram SPs only need to do something different if they create different things at the publish stage. If they just create a mounted directory or a bind mounted file, then blindly calling umount on the target path would be relatively safe.
If on the other hand the stuff that gets created can vary significantly, then the SP would need to detect what got created by calling stat() on the target path at unpublish time or use an external mechanism to remember what kind of volume it is.
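The stat()-based detection described above could look like the following sketch (a hypothetical helper, not from the spec or any plugin; a real SP would also consult the mount table before deciding to unmount):

```python
import os
import stat

def classify_target_path(target_path: str) -> str:
    """Guess what an SP previously created at target_path, for unpublish."""
    st = os.lstat(target_path)          # lstat: don't follow symlinks
    if stat.S_ISLNK(st.st_mode):
        return "symlink"                # an SP that chose symlinks: just unlink
    if stat.S_ISBLK(st.st_mode):
        return "block-device"           # mknod'd in place, or a bind-mounted
                                        # device file: check the mount table
    if stat.S_ISDIR(st.st_mode):
        return "directory"              # mount access type: unmount it
    return "file"                       # bind-mount target file: unmount it
```

This illustrates the cost of open-ended publish semantics: the more things an SP is allowed to create at target_path, the more cases its unpublish path has to distinguish.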

@jieyu (Member) left a comment:

LGTM! Thanks @bswartz!

@saad-ali (Member) left a comment:

/lgtm
/approve

The vague wording around the staging and target paths left open too
many possible implementations for raw block volumes, and implementors
ended up doing non-obvious things. This change clarifies the contract
between the CO and the SP when staging and publishing raw block volumes
on the node.

Specifically, we now require that the CO creates the staging_target_path
and that it is always a directory. And we require that the SP creates
the target_path: it should be a device file for block volumes and a
mount point for filesystem volumes.
@bswartz force-pushed the clarify-target-staging-paths branch from 9fd37a6 to 6620cb6 on November 15, 2018 00:56
@ddebroy (Contributor) left a comment:

LGTM

@saad-ali saad-ali merged commit ed0bb0e into container-storage-interface:master Nov 15, 2018
// block device at target_path.
// For volumes with an access type of mount, the SP SHALL place the
// mounted directory at target_path.
// Creation of target_path is the responsibility of the SP.
Contributor:

Nothing needs to be changed here, just leaving a note.

Since the spec never discusses the relationship between volume states (NODE_READY, VOL_READY and PUBLISHED) and node reboots, it could be argued that a PUBLISHED volume could be in either of these states after a reboot:

  1. PUBLISHED since no NodeUnpublishVolume has been called. In this case, the CO might call NodeUnpublishVolume -> NodeUnstageVolume -> NodeStageVolume -> NodePublishVolume to recover the volume. As a result, the plugin must be able to handle the first two calls when the mount point is missing in the mount table.
  2. NODE_READY since the mount table has been reset after reboot. In this case, the CO might directly call NodeStageVolume and NodePublishVolume to recover the volume. As a result, the plugin must be able to deal with the fact that target_path might have been created by itself in a previous run.

cc @jdef
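Case 1 above implies an idempotent unstage path. A hypothetical sketch (the function name and recovery policy are assumptions, not spec requirements): after a reboot the mount table is empty, so the staging path may still exist without anything mounted on it, and the plugin must not fail when asked to unstage it.

```python
import os
import subprocess

def node_unstage(staging_target_path: str) -> None:
    """Idempotent unstage: tolerate a missing mount after a node reboot."""
    if os.path.ismount(staging_target_path):
        subprocess.run(["umount", staging_target_path], check=True)
    # Otherwise nothing is mounted there (e.g. post-reboot recovery):
    # treat the call as already satisfied instead of returning an error.
```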

Member:

Thanks for leaving this comment. At some point I think we should consider adding a "Notes for plugin writers" section to capture things along these lines. If people aren't comfortable adding to the spec.md file then we could add a separate "notes-for-plugin-writers.md" file or something that gives additional context, examples, and other things to consider when writing plugins.

10 participants