NodePublishDevice #122

akutz · 2017-10-18T17:53:10Z

This patch handles issue #119 by adding two new RPCs, NodePublishDevice and NodeUnpublishDevice. These RPCs MUST be called by the CO if the Node Plugin advertises the PUBLISH_UNPUBLISH_DEVICE capability. Plugins that advertise this capability SHOULD defer volume reference counting to the CO.

cc @saad-ali @jieyu

jdef · 2017-10-18T18:13:19Z

Thanks for the PR!

As we add more optional calls we're going to get to the point where different CO's are going to require different subsets of optional calls in order to run plugins. For example, maybe Mesos requires optional call ListVolumes for some types of plugins, or k8s requires NodePublishDevice for all plugins.

CSI is supposed to ease the pain (for vendors) of developing plugins that are cross-CO compatible. I'm concerned that as we add more optional calls, we'll end up working against the compatibility promise that we started with.

Is there a good reason that this shouldn't be a REQUIRED API?

akutz · 2017-10-18T18:19:39Z

Hi @jdef,

Is there a good reason that this shouldn't be a REQUIRED API?

Because we agreed that plug-ins may want to manage their own reference counting and that this should be an optional RPC.

As for optional or required -- that's the purpose of the capabilities. If the API should also define a required workflow, then the spec should call that out. I was not sold on the value of this RPC due to concerns of fragmentation. However, now that it's decided I strongly disagree that there should be any debate regarding the use of Capabilities to indicate what RPCs are supported. This undermines the entire purpose of capability advertisement.

saad-ali

Thanks for the quick turn around. A few comments.

saad-ali · 2017-10-18T18:23:21Z

spec.md

+This RPC is called by the CO when a workload that wants to use the specified volume is placed (scheduled) on a node.
+The Plugin SHALL assume that this RPC will be executed on the node where the volume will be used.
+This RPC MUST be called by the CO once per node, per volume.
+If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability and the Node Plugin has `PUBLISH_UNPUBLISH_DEVICE`, then the CO MUST guarantee that this RPC is called after `ControllerPublishVolume` is called for the given volume on the given node and returns a success.


Also:
The CO MUST guarantee that this RPC is called and returns a success before any NodePublishVolume is called for the given volume on the given node.

Hi @saad-ali,

Done.

saad-ali · 2017-10-18T18:26:06Z

spec.md

+  // absolute path in the root filesystem of the process serving this
+  // request. The CO SHALL ensure uniqueness of mount_path per volume.
+  // This is a REQUIRED field.
+  string mount_path = 4;


Maybe call this global_mount_path to mentally disambiguate it from the workload mount_path. And possibly renmae the mount_path in NodePublishVolumeRequest to workload_mount_path? Also global_mount_path should be passed in on NodePublishVolumeRequest as well.

Maybe call this global_mount_path

I find this problematic as it conveys more than a single volume. How about changing them to mount_path and bind_mount_path?

Also global_mount_path should be passed in on NodePublishVolumeRequest as well.

Absolutely. That was a miss on my part. This is the largest change I've made to the spec up to this point and figured I'd miss something. Thank you for catching this!

I find this problematic as it conveys more than a single volume. How about changing them to mount_path and bind_mount_path?

Are you proposing

NodePublishDeviceRequest mount_path -> mount_path (no change)

NodePublishVolumeRequest mount_path -> bind_mount_path

That would make sense if all volume plugins implement volume NodePublishDevice, but for volumes that don't, bind_mount_path won't make sense.

I'd prefer either

NodePublishDeviceRequest(global_mount_path)

NodePublishVolumeRequest(global_mount_path, mount_path)

Or:

NodePublishDeviceRequest(device_mount_path)

NodePublishVolumeRequest(device_mount_path, mount_path)

Or:

NodePublishDeviceRequest(device_mount_path)

NodePublishVolumeRequest(device_mount_path, workload_mount_path)

Verbosity is ugly, I agree, but in this case I think it's worth 1) differentiating between the workload path and the device path, and 2) having an obvious link between the "global/device" mount_path passed to NodePublishDeviceRequest and NodePublishVolumeRequest.

Hi @saad-ali,

FWIW, I already updated to global_target_path and left the existing target_path alone and was about to push the change when your comment notification arrived. Are you okay with this? I prefer minimal changes where possible, and this seems sufficient.

Hi @saad-ali,

I just pushed global_target_path. Obviously it can be changed again as this discussion progresses.

saad-ali · 2017-10-18T18:29:23Z

spec.md

+
+  // End user credentials used to authenticate/authorize node publish
+  // request. This field is OPTIONAL.
+  Credentials user_credentials = 6;


I would argue against having Credentials in the NodePublishDeviceRequest. This call is called once per volume per node, not once per volume per workload. Imagine the case where two workloads belonging to two different users where one user has permission and the other does not. Since NodePublishDeviceRequest is only called once per volume per node, it should not be checking permissions, and instead the NodePublishVolume should check user credentials.

Hi @saad-ali,

This part confuses me. To me the credentials are more useful for doing the initial device mount in NodePubishDevice than the bind mount in NodePublishVolume, but maybe I misunderstand the purpose of the credentials?

Think of it this way, a single device can be referenced by multiple workloads, but not all workloads may have equal permissions to access the volume.

The credentials passed should authenticate/authorize a given workload (or it's user) access to that volume. However, since NodePubishDevice is only called once per node per volume, it will be triggered for the first workload referencing that volume on that node, but not for subsequent workloads. Therefore, the credentials passed to NodePubishDevice can't be workload or user specific credentials, because they won't apply to every workload/user.

To prevent inconsistent usage, I think we should eliminate Credentials from NodePubishDevice and only have it in the call that is guaranteed to be called for every workload, which is NodePublishVolume.

That said, taking a step back, the same argument can be applied to ControllerPublishVolume, since that is only called once per volume per plugin as well. We should fix that too.

Alternative, if we really want to keep credentials in every call, we should change the language, in the case of ControllerPublishVolume and NodePubishDevice, to make it clear that in those two cases the credentials cannot be End user credentials. But if we do that, we should clarify what the purpose of the credentials is in those cases.

To prevent inconsistent usage, I think we should eliminate Credentials from NodePubishDevice and only have it in the call that is guaranteed to be called for every workload, which is NodePublishVolume.

I agree w/ @saad-ali on this point. Credentials were intended to be workload specific. Unclear how they fit here.

Yep. I just need to rebase them out. I'm working on something else ATM but will remove them in the next hour or so.

Any thoughts on this?

saad-ali · 2017-10-18T18:36:29Z

Is there a good reason that this shouldn't be a REQUIRED API?

I don't have a strong preference either way. But, if you want to simplify, we could collapse the controller PUBLISH_UNPUBLISH_VOLUME and node PUBLISH_UNPUBLISH_DEVICE capabilities into a single REMOTE_PUBLISH_UNPUBLISH capability in the Identity service that can control both ControllerPublishVolume and NodePublishDevice? Doing so would require introducing a GetCapabilities in the Identity service, however.

akutz · 2017-10-18T18:53:38Z

Hi @saad-ali,

Maybe I'm missing something, but how does that work?. You can have a scenario where a Controller Plugin supports PUBLISH_UNPUBLISH_VOLUME and a Node plugin doesn't support PUBLISH_UNPUBLISH_DEVICE. In fact, that's the scenario for all plug-ins written up to this point.

saad-ali · 2017-10-18T19:20:07Z

Maybe I'm missing something, but how does that work?. You can have a scenario where a Controller Plugin supports PUBLISH_UNPUBLISH_VOLUME and a Node plugin doesn't support PUBLISH_UNPUBLISH_DEVICE. In fact, that's the scenario for all plug-ins written up to this point.

The argument here would be that plugins that support PUBLISH_UNPUBLISH_VOLUME must also support PUBLISH_UNPUBLISH_DEVICE. Doing so would achieve the simplification @jdef was suggesting would be lost. Again, I don't feel strongly about this either way.

akutz · 2017-10-18T19:22:20Z

Hi @saad-ali,

The argument here would be that plugins that support PUBLISH_UNPUBLISH_VOLUME must also support PUBLISH_UNPUBLISH_DEVICE. Doing so would achieve the simplification @jdef was suggesting would be lost. Again, I don't feel strongly about this either way.

In this case I think simple isn't better since the two RPCs are not bound together -- it's easy to imagine a case for one without the other. I will leave it as it is.

saad-ali · 2017-10-20T16:55:09Z

In this case I think simple isn't better since the two RPCs are not bound together -- it's easy to imagine a case for one without the other. I will leave it as it is.

I'm fine with that. @jdef?

jdef · 2017-10-20T18:01:26Z

To be clear, I wasn't suggesting that we somehow merge ControllerPublishVolume and NodePublishDevice to achieve simplicity. Nor was I suggesting that we push non-Identity things into the Identity service.

I was thinking along the lines of simply making NodePublishDevice a required call (just like NodePublishVolume). The goal for publishing a device seems more related to the requirement of a CO (k8s and how it wants to manage volume lifecycle) than a capability that a plugin would/would-not offer. We don't have a way to represent CO capabilities/requirements. Making NodePublishDevice required seems simpler to me, because it makes consistent the expectations placed upon a CO.

For example, if a plugin doesn't offer a PUBLISH_UNPUBLISH_DEVICE capability, how is k8s (or some other CO w/ similar goals) supposed to interact with that plugin? Vendors are basically REQUIRED to implement that capability for compatibility with k8s. Mesos has similar issues because the current design requires GET_CAPACITY for plugins that will offer volumes that feed into the Resource cycle. Mesos will (eventually) offer ways to integrate plugins that don't support GET_CAPACITY. Are there any k8s plans to support plugins that don't implement PUBLISH_UNPUBLISH_DEVICE?

If a plugin DOES offer the PUBLISH_UNPUBLISH_DEVICE capability but a CO doesn't need it -- it basically forces the CO down the "required" implementation path anyway because it becomes a call that the CO is REQUIRED (by the spec) to invoke prior to NodePublishVolume. Furthermore it forces such a CO to cook up unique global_target_path fields (in order to remain spec compliant).

I can't imagine that many vendors will want to ignore k8s. So NodePublishVolume basically becomes de facto REQUIRED. Why not model it that way from the start?

The concept of publishing a device prior to publishing a volume for a workload feels like it has significant impact to volume lifecycle and CO implementation. I dislike the idea of crafting an option that is, in effect, required anyway: such an approach both unnecessarily complicates the spec and creates confusion.

Am I misunderstanding something fundamental to the discussion?

akutz · 2017-10-20T18:07:04Z

Hi @jdef,

CSI isn't beholden to K8s more than any other CO. The fact is some COs may not wish to own reference counting. This is a case of adding a feature, not taking one away. Why remove a capability when adding one? I don't see the necessity in doing so.

Use case wise I can see where a CSI proxy may wish to advertise that it owns reference counting in order to perform some additional actions as part of the workflow.

jdef · 2017-10-20T18:11:18Z

OK, I hadn't thought of the use case of a CSI proxy. How does that change if NodePublishDevice becomes required vs. optional?

…

On Fri, Oct 20, 2017 at 2:07 PM, Schley Andrew Kutz < ***@***.***> wrote: Hi @jdef <https://github.com/jdef>, CSI isn't beholden to K8s more than any other CO. The fact is some COs may not wish to own reference counting. This is a case of adding a feature, not taking one away. Why remove a capability when adding one? I don't see the necessity in doing so. Use case wise I can see where a CSI proxy may wish to advertise that it owns reference counting in order to perform some additional actions as part of the workflow. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#122 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ACPVLJve5dt47zVr20zC9OkBrxWWUKhiks5suOFJgaJpZM4P-G4s> .

akutz · 2017-10-20T18:17:41Z

The proxy may not wish the CO to participate in any reference counting and thus indicates to the CO that NodePublishDevice is not supported, even if the plug-ins to which the middleware is communicates do support ref counting.

Ultimately though I'm concerned that such a significant change like making this a required part of CSI's primary workflow would occur after the item was discussed at the CSI community meeting. This is like when riders are added to bills that have already been reviewed, but before the floor votes. It simply smells wrong.

@jdef If you want to make this a required RPC then I would like the rest of the community to have a chance to participate in this discussion. Your proposed addendum alters this PR in such a substantial manner that it would no longer be the same change approved by the community during this week's meeting.

saad-ali · 2017-10-20T20:54:36Z

PUBLISH_UNPUBLISH_DEVICE is not required by Kubernetes. Kubernetes would work fine if a volume plugin did not implement it. Consider the NFS plugin case, a PublishDevice step makes no sense in that case. So I would be against making this a requirement instead of optional.

saad-ali · 2017-10-20T16:58:38Z

spec.md

@@ -945,23 +1076,29 @@ message NodePublishVolumeRequest {
  // this capability. This is an OPTIONAL field.
  PublishVolumeInfo publish_volume_info = 3;

+  // The path to which the device was mounted by `NodePublishDevice`.


It MUST be passed in if the node plugin implements the PUBLISH_UNPUBLISH_DEVICE node capability.
It MUST not be left empty if the the node plugin does not implement the PUBLISH_UNPUBLISH_DEVICE node capability.

I agree credentials are not required in here nor do I think they should be included in ControllerPublish. IMO commands that are to be called once and not once per workload do not need to include the user/workload credentials. Otherwise it opens up consistency issues.

saad-ali · 2017-10-20T16:59:03Z

spec.md

+This RPC is called by the CO when a workload that wants to use the specified volume is placed (scheduled) on a node.
+The Plugin SHALL assume that this RPC will be executed on the node where the volume will be used.
+This RPC MUST be called by the CO once per node, per volume.
+If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability and the Node Plugin has `PUBLISH_UNPUBLISH_DEVICE`, then the CO MUST guarantee that this RPC is called after `ControllerPublishVolume` is called for the given volume on the given node and returns a success.


saad-ali · 2017-10-20T20:55:51Z

spec.md

+
+  // End user credentials used to authenticate/authorize node publish
+  // request. This field is OPTIONAL.
+  Credentials user_credentials = 6;


Any thoughts on this?

jdef · 2017-10-20T22:30:53Z

Thanks for clarifying @saad-ali, sounds like I made a bad assumption. I'll withdraw my objection: we can keep it as optional.

I wonder if it's worth considering wrapping the global_target_path in a DevicePublishInfo message that we could easily expand later as needed, for passing along information from NodePublishDeviceResponse to NodePublishVolumeRequest. Thoughts?

akutz · 2017-10-20T23:33:18Z

I like that idea James. Then again, I also liked keeping the volume handle the way it was :)

jieyu · 2017-10-23T20:24:36Z

Thanks for the PR @akutz.

I am wondering what's the relationship between PUBLISH_UNPUBLISH_DEVICE capability and AccessMode? In fact, we should clarify that when we allow NodePublishVolume to be called multiple times on a node. For instance, does SINGLE_NODE_WRITER mean that NodePublishVolume can only be called once? or NodePublishDevice can only be called once? or ControllerPublishVolume can only be called once?

akutz · 2017-10-23T20:35:46Z

Hi @jieyu,

NodePublishDevice should be called once per volume, per node, regardless of AccessMode. The bind mount plug-ins will perform in NodePublishVolume is what ultimately determines how the AccessMode is handled. COs should not be accessing the filesystem at the global_target_path that the volume is mounted during NodePublishDevice.

This patch handles issue container-storage-interface#119 by adding two new RPCs, "NodePublishDevice" and "NodeUnpublishDevice". These RPCs MUST be called by the CO if the Node Plugin advertises the "PUBLISH_UNPUBLISH_DEVICE" capability. Plugins that advertise this capability SHOULD defer volume reference counting to the CO.

akutz · 2017-10-24T18:12:21Z

Hi All,

I believe I've rebased this PR to account for everyone's notes. @jdef, I'm opting to leave global_target_path its own field for now as planning ahead with the spec has often proven to be a fool's errand given the number of recent modifications. And if things do change, that's the purpose of the API version.

@jieyu, I am wondering if your concern regarding AccessMode shouldn't result in this change to NodePublishDeviceRequest:

message NodePublishDeviceRequest {
  // The API version assumed by the CO. This is a REQUIRED field.
  Version version = 1;

  // The ID of the volume to publish. This field is REQUIRED.
  string volume_id = 2;

  // The CO SHALL set this field to the value returned by
  // `ControllerPublishVolume` if the corresponding Controller Plugin
  // has `PUBLISH_UNPUBLISH_VOLUME` controller capability, and SHALL be
  // left unset if the corresponding Controller Plugin does not have
  // this capability. This is an OPTIONAL field.
  PublishVolumeInfo publish_volume_info = 3;

  // The path to which the volume will be published. It MUST be an
  // absolute path in the root filesystem of the process serving this
  // request. The CO SHALL ensure uniqueness of global_target_path per
  // volume.
  // This is a REQUIRED field.
  string global_target_path = 4;

  // The capability of the volume the CO expects the volume to have.
  // This is a REQUIRED field.
  // Specifies what API the volume will be accessed using. One of the
  // following fields MUST be specified.
  oneof access_type {
    BlockVolume block = 5;
    MountVolume mount = 6;
  }
}

As you indicated, NodePublishDevicehas no relationship to the requested AccessMode, and therefore the only real information that is required with respect to the VolumeCapability is the access_type.

jdef · 2017-10-26T22:47:13Z

Whichever PR lands first, we should probably consider adding volume_attributes to NodePublishDevice at some point #123

jdef · 2018-02-22T17:07:04Z

I think this was superseded by #169, right? If so we can probably close this out

jdef · 2018-02-26T15:36:10Z

Closing this out as per my previous comment. If I'm wrong, please re-open.

saad-ali reviewed Oct 18, 2017

View reviewed changes

akutz force-pushed the feature/node-publish-device branch from 59b4f8d to e5b4dbc Compare October 18, 2017 19:11

saad-ali reviewed Oct 20, 2017

View reviewed changes

akutz force-pushed the feature/node-publish-device branch from e5b4dbc to 4df353c Compare October 23, 2017 17:49

akutz force-pushed the feature/node-publish-device branch from 4df353c to 9fcfd8c Compare October 24, 2017 18:07

jdef mentioned this pull request Oct 30, 2017

Consider changing NodePublish to NodePrepare #96

Open

jdef added this to the v0.1 milestone Oct 31, 2017

jieyu removed this from the v0.1 milestone Nov 15, 2017

akutz mentioned this pull request Dec 1, 2017

spec: Clarify calling multiple NodePublishVolume #150

Merged

This was referenced Jan 12, 2018

Consider a "MountDevice" equivalent step #119

Closed

NodeStageVolume/NodeUnstageVolume #169

Merged

jdef closed this Feb 26, 2018

NodePublishDevice #122

NodePublishDevice #122

Conversation

akutz commented Oct 18, 2017

jdef commented Oct 18, 2017

akutz commented Oct 18, 2017 • edited Loading

saad-ali left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

saad-ali commented Oct 18, 2017 • edited Loading

akutz commented Oct 18, 2017

saad-ali commented Oct 18, 2017

akutz commented Oct 18, 2017

saad-ali commented Oct 20, 2017

jdef commented Oct 20, 2017

akutz commented Oct 20, 2017

jdef commented Oct 20, 2017 via email

akutz commented Oct 20, 2017

saad-ali commented Oct 20, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jdef commented Oct 20, 2017

akutz commented Oct 20, 2017

jieyu commented Oct 23, 2017

akutz commented Oct 23, 2017

akutz commented Oct 24, 2017

jdef commented Oct 26, 2017

jdef commented Feb 22, 2018

jdef commented Feb 26, 2018

akutz commented Oct 18, 2017 •

edited

Loading

saad-ali commented Oct 18, 2017 •

edited

Loading