-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify staging and target paths for blocks #335
Clarify staging and target paths for blocks #335
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome 👍
d638e75
to
af47f95
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Approve
related to #287 |
I'm not crazy about the specificity of this PR. Though I understand that it does help to clarify things for SP authors. I'm hopeful that we can move to something more generic (in a non-breaking way) once we support something like what's been proposed in #96, where the CO has a bit more control over mount operations. |
@jdef I'm trying to understand the problems that the solution proposed in that issue is trying to solve. The discussion seemed to argue for what we have now, which is node plugins that do the mounts themselves. The only specific use case I could find (sshfs) strongly favors the current design. What would be gained by modifying the publish interface to offer another way? |
@bswartz this is related to another discussion where some one proposed potentially allowing CO to control the final mount. See #96 (comment) That would be an additive change we can consider in the future (post v1). I believe @jdef is arguing that would be a way to introduce the flexibility that is potentially lost with this PR. |
@bswartz AFAICT the proposal in #96 would push CSI in the direction of the CO executing the Current thinking is that we could probably make backwards compatible changes to CSI, post-v1, such that the The specificity of this PR, as it's currently written, cuts both ways: it probably simplifies things for most plugins that will be written against CSI v1.0.0; it also dictates bits of implementation and I think that's where the CO owners have expressed concern. Ideally, we'd avoid over-specifying implementation in the CSI spec as long as the interface between plugin and CO is clear (which is also the point of contention here, right?). From the community discussion today, I'm not 100% convinced that this PR is the best, long-term approach. |
@saad-ali I'm just trying to figure out why anyone would want to do that. What's the advantage of allowing the CO to control the final mount vs the status quo? Also I don't see how the version of the spec prior to my change makes it any easier to do that. The existing spec provides NO mechanism for the SP to return anything to the CO on publish, so it's implied that something like what we current do is what MUST be done. In the future we can add an optional mechanism to return something that the CO could use to perform the mount, but SPs could never assume the CO would want to use that mechanism and therefore would have to support the current model of mounting inside the SP as the default. This language change doesn't make an addition like that any more difficult. |
spec.md
Outdated
// serving the request has `read` and `write` permissions to the path. | ||
// For volumes with an access type of block, the path MUST be a file. | ||
// For volumes with an access type of mount, the path MUST be a | ||
// directory. The SP SHALL bind mount the volume to the target path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really have to say "bind mount" here? Isn't it enough to say "mount"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdef The problem I think many people have is that the word "mount" traditionally implies what happens when you add a new filesystem into the VFS at some directory. For block volumes, we've opted to actually bind mount a single device file. There are lots of alternatives we could have chosen, including symlinks or returning a string. Presumably there is some advantage to the bind mount approach (I wasn't around when this decision was made) or perhaps the spec was left so vague that Kubernetes just chose to use bind mounts rather than the other alternatives and that became the de-facto way to do it. This kind of open-endedness is poison for specifications because it makes it very easy for 2 different implementation that both conform to the spec to not interoperate with eachother.
I could change the wording to make it clear that the bind mount is only a requirement for block access types, and that an ordinary mount is fine for mount (filesystem) access types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I understand where you're coming from. Again, this is where the proposal gets a bit too specific w/ it's requirements of a plugin. I'd like to see just a tiny bit of flexibility here.
Can you replace "SHALL bind mount" with "SHOULD bind mount"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW there's multiple ways to interpret "bind mount" so I'm not sure that there's too much actually gained by keeping the word "bind".
Fun read: https://unix.stackexchange.com/questions/198590/what-is-a-bind-mount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still having a hard time getting behind this change request for NodePublishVolume
. If a plugin doesn't support node-stage and simply wants to mknod
a block device at target_path
then this proposal blocks that behavior. Forcing a plugin to mknod
elsewhere and then to "bind mount" it to a file that the CO is creating here seems like a mistake.
I think we should keep more neutral wording here. The CO SHALL NOT create a file system object at target_path
. Plugin implementation recommendations should be made elsewhere in the spec and should avoid requirement wording like SHALL, and MUST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word SHOULD implies to me that you can not do something and then there are still some scenarios where that can lead to a good result. SPs must do something at the target path to ensure a good result. We owe it to SP implementors to either enumerate all the valid things an SP can do or describe the end result in such a way that at least a few of the valid approaches are obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm with you on the fact that there are valid alternatives that can work. Doing a mknod at the target path makes perfect sense to me, but then we have to change the wording to say the CO MUST NOT create anything there, which is is the opposite of what Kubernetes has done and will require changing Kubernetes again (not impossible, just annoying).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could go either way on this, @saad-ali do you feel strongly? The one outcome I don't want is to leave it unspecified which side should create the file/device because then a de-facto standard of "whatever works with kubernetes" will emerge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a slight preference to leave the spec lose enough for alternate implementations.
We owe it to SP implementors to either enumerate all the valid things an SP can do or describe the end result in such a way that at least a few of the valid approaches are obvious.
I'd prefer the later.
To be clear, I don't think we lose flexibility to do something for #96 with the introduction of this PR. I was trying to say that we still have an escape hatch if we find out (as I think we will) that this PR is too restrictive for all the use cases that we want to support. |
Let's stay focused on this PR. We can discuss the merits of #96 on that issue. |
@jdef Okay I think I understand. As long as we agree that this PR doesn't make the problem you want to solve any harder. The staging path stuff shouldn't affect that future enhancement because it's really just a clarification of what SPs are already depending on, and doesn't impose a new burden on COs. The publish target path stuff could all be overridden by a hypothetical future enhancement to the node publish call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @jdef that bind mount
is too specific, and is not always correct. For example, in the case where the plugin is mounting a share from a remote NAS, and does not choose to implement NodeStageVolume, there is no requirement to use a bind mount. So I'd like to see bind mount
replaced with mount
or some additional qualification to make it apparent that we only require bind mounts for block access.
@julian-hj For filesystem (mount access type) volumes I absolutely agree, and will change the wording. For block access type volumes, I'm okay loosening the language as long as it's still clear what the CO should or should not create, so it's clear what's left for the SP to do. |
@lpabon I'm also curious about your opinion on the CO not creating the "file" that gets bind mounted and instead requiring the SP to create a file before bind mounting the path, if it wants to bind mount rather than doing something like mknod. In this case we could simply require that the CO ensures the parent directory exists. I can't imagine any problems with that approach, but @mkimuram will need to modify his kubernetes patch to accomodate that. |
You’re using the wrong tag
… On Nov 14, 2018, at 5:50 PM, Ben Swartzlander ***@***.***> wrote:
@lpabon I'm also curious about your opinion on the CO not creating the "file" that gets bind mounted and instead requiring the SP to create a file before bind mounting the path, if it wants to bind mount rather than doing something like mknod. In this case we could simply require that the CO ensures the parent directory exists. I can't imagine any problems with that approach, but @mkimura will need to modify his kubernetes patch to accomodate that.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
@mkimuram not |
Apologies for misspelling the name |
If you are talking about |
@mkimuram SPs only need to do something different if they create different things at the publish stage. If they just create a mounted directory or a bind mounted file, then blindly calling umount on the target path would be relatively safe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @bswartz!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
The vague wording around the staging and target paths left open too many possible implementations for raw block volumes, and implementors ended up doing non-obvious things. This change clarifies the contract between the CO and the SP when staging and publishing raw block volumes on the node. Specifically we now require that the CO creates the staging_target_path and that it is always a directory. And we require that the SP create the target_path and it should be a device file for blocks and a mount point for filesystems.
9fd37a6
to
6620cb6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// block device at target_path. | ||
// For volumes with an access type of mount, the SP SHALL place the | ||
// mounted directory at target_path. | ||
// Creation of target_path is the responsibility of the SP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing needs to be changed here, just leaving a note.
Since the spec never discuss the relationship between volume states (NODE_READY
, VOL_READY
and PUBLISHED
) and node reboots, it could be argued that a PUBLISHED
volume could be in one of either states after a reboot:
PUBLISHED
since noNodeUnpublishVolume
has been called. In this case, the CO might callNodeUnpublishVolume
->NodeUnstageVolume
->NodeStageVolume
->NodePublishVolume
to recover the volume. As a result, the plugin must be able to handle the first two calls when the mount point is missing in the mount table.NODE_READY
since the mount table has been reset after reboot. In this case, the CO might directly callNodeStageVolume
andNodePublishVolume
to recover the volume. As a result, the plugin must be able to deal with the fact thattarget_path
might have been created by itself in a previous run.
cc @jdef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for leaving this comment. At some point I think we should consider adding a "Notes for plugin writers" section to capture things along these lines. If people aren't comfortable adding to the spec.md file then we could add a separate "notes-for-plugin-writers.md" file or something that gives additional context, examples, and other things to consider when writing plugins.
The vague wording around the staging and target paths left open too
many possible implementations for raw block volumes, and implementors
ended up doing non-obvious things. This change clarifies the contract
between the CO and the SP when staging and publishing raw block volumes
on the node.