Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NodePublishDevice #122

Conversation

akutz
Copy link
Contributor

@akutz akutz commented Oct 18, 2017

This patch handles issue #119 by adding two new RPCs, NodePublishDevice and NodeUnpublishDevice. These RPCs MUST be called by the CO if the Node Plugin advertises the PUBLISH_UNPUBLISH_DEVICE capability. Plugins that advertise this capability SHOULD defer volume reference counting to the CO.

cc @saad-ali @jieyu

@jdef
Copy link
Member

jdef commented Oct 18, 2017

Thanks for the PR!

As we add more optional calls we're going to get to the point where different CO's are going to require different subsets of optional calls in order to run plugins. For example, maybe Mesos requires optional call ListVolumes for some types of plugins, or k8s requires NodePublishDevice for all plugins.

CSI is supposed to ease the pain (for vendors) of developing plugins that are cross-CO compatible. I'm concerned that as we add more optional calls, we'll end up working against the compatibility promise that we started with.

Is there a good reason that this shouldn't be a REQUIRED API?

@akutz
Copy link
Contributor Author

akutz commented Oct 18, 2017

Hi @jdef,

Is there a good reason that this shouldn't be a REQUIRED API?

Because we agreed that plug-ins may want to manage their own reference counting and that this should be an optional RPC.

As for optional or required -- that's the purpose of the capabilities. If the API should also define a required workflow, then the spec should call that out. I was not sold on the value of this RPC due to concerns of fragmentation. However, now that it's decided I strongly disagree that there should be any debate regarding the use of Capabilities to indicate what RPCs are supported. This undermines the entire purpose of capability advertisement.

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turn around. A few comments.

This RPC is called by the CO when a workload that wants to use the specified volume is placed (scheduled) on a node.
The Plugin SHALL assume that this RPC will be executed on the node where the volume will be used.
This RPC MUST be called by the CO once per node, per volume.
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability and the Node Plugin has `PUBLISH_UNPUBLISH_DEVICE`, then the CO MUST guarantee that this RPC is called after `ControllerPublishVolume` is called for the given volume on the given node and returns a success.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also:
The CO MUST guarantee that this RPC is called and returns a success before any NodePublishVolume is called for the given volume on the given node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @saad-ali,

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

spec.md Outdated
// absolute path in the root filesystem of the process serving this
// request. The CO SHALL ensure uniqueness of mount_path per volume.
// This is a REQUIRED field.
string mount_path = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call this global_mount_path to mentally disambiguate it from the workload mount_path. And possibly renmae the mount_path in NodePublishVolumeRequest to workload_mount_path? Also global_mount_path should be passed in on NodePublishVolumeRequest as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call this global_mount_path

I find this problematic as it conveys more than a single volume. How about changing them to mount_path and bind_mount_path?

Also global_mount_path should be passed in on NodePublishVolumeRequest as well.

Absolutely. That was a miss on my part. This is the largest change I've made to the spec up to this point and figured I'd miss something. Thank you for catching this!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this problematic as it conveys more than a single volume. How about changing them to mount_path and bind_mount_path?

Are you proposing

  • NodePublishDeviceRequest mount_path -> mount_path (no change)
  • NodePublishVolumeRequest mount_path -> bind_mount_path

That would make sense if all volume plugins implement volume NodePublishDevice, but for volumes that don't, bind_mount_path won't make sense.

I'd prefer either

  • NodePublishDeviceRequest(global_mount_path)
  • NodePublishVolumeRequest(global_mount_path, mount_path)

Or:

  • NodePublishDeviceRequest(device_mount_path)
  • NodePublishVolumeRequest(device_mount_path, mount_path)

Or:

  • NodePublishDeviceRequest(device_mount_path)
  • NodePublishVolumeRequest(device_mount_path, workload_mount_path)

Verbosity is ugly, I agree, but in this case I think it's worth 1) differentiating between the workload path and the device path, and 2) having an obvious link between the "global/device" mount_path passed to NodePublishDeviceRequest and NodePublishVolumeRequest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @saad-ali,

FWIW, I already updated to global_target_path and left the existing target_path alone and was about to push the change when your comment notification arrived. Are you okay with this? I prefer minimal changes where possible, and this seems sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @saad-ali,

I just pushed global_target_path. Obviously it can be changed again as this discussion progresses.

spec.md Outdated

// End user credentials used to authenticate/authorize node publish
// request. This field is OPTIONAL.
Credentials user_credentials = 6;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue against having Credentials in the NodePublishDeviceRequest. This call is called once per volume per node, not once per volume per workload. Imagine the case where two workloads belonging to two different users where one user has permission and the other does not. Since NodePublishDeviceRequest is only called once per volume per node, it should not be checking permissions, and instead the NodePublishVolume should check user credentials.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @saad-ali,

This part confuses me. To me the credentials are more useful for doing the initial device mount in NodePubishDevice than the bind mount in NodePublishVolume, but maybe I misunderstand the purpose of the credentials?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think of it this way, a single device can be referenced by multiple workloads, but not all workloads may have equal permissions to access the volume.

The credentials passed should authenticate/authorize a given workload (or it's user) access to that volume. However, since NodePubishDevice is only called once per node per volume, it will be triggered for the first workload referencing that volume on that node, but not for subsequent workloads. Therefore, the credentials passed to NodePubishDevice can't be workload or user specific credentials, because they won't apply to every workload/user.

To prevent inconsistent usage, I think we should eliminate Credentials from NodePubishDevice and only have it in the call that is guaranteed to be called for every workload, which is NodePublishVolume.

That said, taking a step back, the same argument can be applied to ControllerPublishVolume, since that is only called once per volume per plugin as well. We should fix that too.

Alternative, if we really want to keep credentials in every call, we should change the language, in the case of ControllerPublishVolume and NodePubishDevice, to make it clear that in those two cases the credentials cannot be End user credentials. But if we do that, we should clarify what the purpose of the credentials is in those cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent inconsistent usage, I think we should eliminate Credentials from NodePubishDevice and only have it in the call that is guaranteed to be called for every workload, which is NodePublishVolume.

I agree w/ @saad-ali on this point. Credentials were intended to be workload specific. Unclear how they fit here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I just need to rebase them out. I'm working on something else ATM but will remove them in the next hour or so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on this?

@saad-ali
Copy link
Member

saad-ali commented Oct 18, 2017

Is there a good reason that this shouldn't be a REQUIRED API?

I don't have a strong preference either way. But, if you want to simplify, we could collapse the controller PUBLISH_UNPUBLISH_VOLUME and node PUBLISH_UNPUBLISH_DEVICE capabilities into a single REMOTE_PUBLISH_UNPUBLISH capability in the Identity service that can control both ControllerPublishVolume and NodePublishDevice? Doing so would require introducing a GetCapabilities in the Identity service, however.

@akutz
Copy link
Contributor Author

akutz commented Oct 18, 2017

Hi @saad-ali,

Maybe I'm missing something, but how does that work?. You can have a scenario where a Controller Plugin supports PUBLISH_UNPUBLISH_VOLUME and a Node plugin doesn't support PUBLISH_UNPUBLISH_DEVICE. In fact, that's the scenario for all plug-ins written up to this point.

@akutz akutz force-pushed the feature/node-publish-device branch from 59b4f8d to e5b4dbc Compare October 18, 2017 19:11
@saad-ali
Copy link
Member

Maybe I'm missing something, but how does that work?. You can have a scenario where a Controller Plugin supports PUBLISH_UNPUBLISH_VOLUME and a Node plugin doesn't support PUBLISH_UNPUBLISH_DEVICE. In fact, that's the scenario for all plug-ins written up to this point.

The argument here would be that plugins that support PUBLISH_UNPUBLISH_VOLUME must also support PUBLISH_UNPUBLISH_DEVICE. Doing so would achieve the simplification @jdef was suggesting would be lost. Again, I don't feel strongly about this either way.

@akutz
Copy link
Contributor Author

akutz commented Oct 18, 2017

Hi @saad-ali,

The argument here would be that plugins that support PUBLISH_UNPUBLISH_VOLUME must also support PUBLISH_UNPUBLISH_DEVICE. Doing so would achieve the simplification @jdef was suggesting would be lost. Again, I don't feel strongly about this either way.

In this case I think simple isn't better since the two RPCs are not bound together -- it's easy to imagine a case for one without the other. I will leave it as it is.

@saad-ali
Copy link
Member

In this case I think simple isn't better since the two RPCs are not bound together -- it's easy to imagine a case for one without the other. I will leave it as it is.

I'm fine with that. @jdef?

@jdef
Copy link
Member

jdef commented Oct 20, 2017

To be clear, I wasn't suggesting that we somehow merge ControllerPublishVolume and NodePublishDevice to achieve simplicity. Nor was I suggesting that we push non-Identity things into the Identity service.

I was thinking along the lines of simply making NodePublishDevice a required call (just like NodePublishVolume). The goal for publishing a device seems more related to the requirement of a CO (k8s and how it wants to manage volume lifecycle) than a capability that a plugin would/would-not offer. We don't have a way to represent CO capabilities/requirements. Making NodePublishDevice required seems simpler to me, because it makes consistent the expectations placed upon a CO.

For example, if a plugin doesn't offer a PUBLISH_UNPUBLISH_DEVICE capability, how is k8s (or some other CO w/ similar goals) supposed to interact with that plugin? Vendors are basically REQUIRED to implement that capability for compatibility with k8s. Mesos has similar issues because the current design requires GET_CAPACITY for plugins that will offer volumes that feed into the Resource cycle. Mesos will (eventually) offer ways to integrate plugins that don't support GET_CAPACITY. Are there any k8s plans to support plugins that don't implement PUBLISH_UNPUBLISH_DEVICE?

If a plugin DOES offer the PUBLISH_UNPUBLISH_DEVICE capability but a CO doesn't need it -- it basically forces the CO down the "required" implementation path anyway because it becomes a call that the CO is REQUIRED (by the spec) to invoke prior to NodePublishVolume. Furthermore it forces such a CO to cook up unique global_target_path fields (in order to remain spec compliant).

I can't imagine that many vendors will want to ignore k8s. So NodePublishVolume basically becomes de facto REQUIRED. Why not model it that way from the start?

The concept of publishing a device prior to publishing a volume for a workload feels like it has significant impact to volume lifecycle and CO implementation. I dislike the idea of crafting an option that is, in effect, required anyway: such an approach both unnecessarily complicates the spec and creates confusion.

Am I misunderstanding something fundamental to the discussion?

@akutz
Copy link
Contributor Author

akutz commented Oct 20, 2017

Hi @jdef,

CSI isn't beholden to K8s more than any other CO. The fact is some COs may not wish to own reference counting. This is a case of adding a feature, not taking one away. Why remove a capability when adding one? I don't see the necessity in doing so.

Use case wise I can see where a CSI proxy may wish to advertise that it owns reference counting in order to perform some additional actions as part of the workflow.

@jdef
Copy link
Member

jdef commented Oct 20, 2017 via email

@akutz
Copy link
Contributor Author

akutz commented Oct 20, 2017

The proxy may not wish the CO to participate in any reference counting and thus indicates to the CO that NodePublishDevice is not supported, even if the plug-ins to which the middleware is communicates do support ref counting.

Ultimately though I'm concerned that such a significant change like making this a required part of CSI's primary workflow would occur after the item was discussed at the CSI community meeting. This is like when riders are added to bills that have already been reviewed, but before the floor votes. It simply smells wrong.

@jdef If you want to make this a required RPC then I would like the rest of the community to have a chance to participate in this discussion. Your proposed addendum alters this PR in such a substantial manner that it would no longer be the same change approved by the community during this week's meeting.

@saad-ali
Copy link
Member

PUBLISH_UNPUBLISH_DEVICE is not required by Kubernetes. Kubernetes would work fine if a volume plugin did not implement it. Consider the NFS plugin case, a PublishDevice step makes no sense in that case. So I would be against making this a requirement instead of optional.

@@ -945,23 +1076,29 @@ message NodePublishVolumeRequest {
// this capability. This is an OPTIONAL field.
PublishVolumeInfo publish_volume_info = 3;

// The path to which the device was mounted by `NodePublishDevice`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It MUST be passed in if the node plugin implements the PUBLISH_UNPUBLISH_DEVICE node capability.
It MUST not be left empty if the the node plugin does not implement the PUBLISH_UNPUBLISH_DEVICE node capability.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree credentials are not required in here nor do I think they should be included in ControllerPublish. IMO commands that are to be called once and not once per workload do not need to include the user/workload credentials. Otherwise it opens up consistency issues.

This RPC is called by the CO when a workload that wants to use the specified volume is placed (scheduled) on a node.
The Plugin SHALL assume that this RPC will be executed on the node where the volume will be used.
This RPC MUST be called by the CO once per node, per volume.
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability and the Node Plugin has `PUBLISH_UNPUBLISH_DEVICE`, then the CO MUST guarantee that this RPC is called after `ControllerPublishVolume` is called for the given volume on the given node and returns a success.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

spec.md Outdated

// End user credentials used to authenticate/authorize node publish
// request. This field is OPTIONAL.
Credentials user_credentials = 6;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on this?

@jdef
Copy link
Member

jdef commented Oct 20, 2017

Thanks for clarifying @saad-ali, sounds like I made a bad assumption. I'll withdraw my objection: we can keep it as optional.

I wonder if it's worth considering wrapping the global_target_path in a DevicePublishInfo message that we could easily expand later as needed, for passing along information from NodePublishDeviceResponse to NodePublishVolumeRequest. Thoughts?

@akutz
Copy link
Contributor Author

akutz commented Oct 20, 2017

I like that idea James. Then again, I also liked keeping the volume handle the way it was :)

@akutz akutz force-pushed the feature/node-publish-device branch from e5b4dbc to 4df353c Compare October 23, 2017 17:49
@jieyu
Copy link
Member

jieyu commented Oct 23, 2017

Thanks for the PR @akutz.

I am wondering what's the relationship between PUBLISH_UNPUBLISH_DEVICE capability and AccessMode? In fact, we should clarify that when we allow NodePublishVolume to be called multiple times on a node. For instance, does SINGLE_NODE_WRITER mean that NodePublishVolume can only be called once? or NodePublishDevice can only be called once? or ControllerPublishVolume can only be called once?

@akutz
Copy link
Contributor Author

akutz commented Oct 23, 2017

Hi @jieyu,

NodePublishDevice should be called once per volume, per node, regardless of AccessMode. The bind mount plug-ins will perform in NodePublishVolume is what ultimately determines how the AccessMode is handled. COs should not be accessing the filesystem at the global_target_path that the volume is mounted during NodePublishDevice.

This patch handles issue container-storage-interface#119 by adding two new RPCs,
"NodePublishDevice" and "NodeUnpublishDevice". These RPCs MUST be called
by the CO if the Node Plugin advertises the "PUBLISH_UNPUBLISH_DEVICE"
capability. Plugins that advertise this capability SHOULD defer volume
reference counting to the CO.
@akutz akutz force-pushed the feature/node-publish-device branch from 4df353c to 9fcfd8c Compare October 24, 2017 18:07
@akutz
Copy link
Contributor Author

akutz commented Oct 24, 2017

Hi All,

I believe I've rebased this PR to account for everyone's notes. @jdef, I'm opting to leave global_target_path its own field for now as planning ahead with the spec has often proven to be a fool's errand given the number of recent modifications. And if things do change, that's the purpose of the API version.

@jieyu, I am wondering if your concern regarding AccessMode shouldn't result in this change to NodePublishDeviceRequest:

message NodePublishDeviceRequest {
  // The API version assumed by the CO. This is a REQUIRED field.
  Version version = 1;

  // The ID of the volume to publish. This field is REQUIRED.
  string volume_id = 2;

  // The CO SHALL set this field to the value returned by
  // `ControllerPublishVolume` if the corresponding Controller Plugin
  // has `PUBLISH_UNPUBLISH_VOLUME` controller capability, and SHALL be
  // left unset if the corresponding Controller Plugin does not have
  // this capability. This is an OPTIONAL field.
  PublishVolumeInfo publish_volume_info = 3;

  // The path to which the volume will be published. It MUST be an
  // absolute path in the root filesystem of the process serving this
  // request. The CO SHALL ensure uniqueness of global_target_path per
  // volume.
  // This is a REQUIRED field.
  string global_target_path = 4;

  // The capability of the volume the CO expects the volume to have.
  // This is a REQUIRED field.
  // Specifies what API the volume will be accessed using. One of the
  // following fields MUST be specified.
  oneof access_type {
    BlockVolume block = 5;
    MountVolume mount = 6;
  }
}

As you indicated, NodePublishDevicehas no relationship to the requested AccessMode, and therefore the only real information that is required with respect to the VolumeCapability is the access_type.

@jdef
Copy link
Member

jdef commented Oct 26, 2017

Whichever PR lands first, we should probably consider adding volume_attributes to NodePublishDevice at some point #123

@jdef
Copy link
Member

jdef commented Feb 22, 2018

I think this was superseded by #169, right? If so we can probably close this out

@jdef
Copy link
Member

jdef commented Feb 26, 2018

Closing this out as per my previous comment. If I'm wrong, please re-open.

@jdef jdef closed this Feb 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants