Add mount option for Cephfs #576

Merged
merged 2 commits into from
Sep 6, 2019

Conversation

@poornimag poornimag commented Aug 28, 2019

The storage class already takes MountOptions (MountFlags); these are the
bind mount options. Some of these options may not be recognised by the
cephfs mount. Hence this adds a new Storage Class parameter for cephfs
kernel mount options. Ceph kernel mount options are different from
ceph-fuse options, and there are not many ceph-fuse options. Thus the
parameter takes only the ceph kernel mount options, and the ceph kernel
mount is chosen only when the kernel is >=4.17.
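
A minimal sketch of what the description above amounts to for the kernel mounter, assuming the extra options are simply appended to the comma-separated `-o` string passed to `mount -t ceph`. The helper name `kernelMountArgs` and its parameters are hypothetical and are not taken from the ceph-csi code.

```go
package main

import (
	"fmt"
	"strings"
)

// kernelMountArgs assembles the arguments for `mount -t ceph`.
// Hypothetical helper for illustration only; not the ceph-csi implementation.
func kernelMountArgs(source, mountPoint, user, secretFile, fsName, extraOpts string) []string {
	opts := fmt.Sprintf("name=%s,secretfile=%s", user, secretFile)
	if fsName != "" {
		opts += ",mds_namespace=" + fsName
	}
	// User-supplied kernel options join the same comma-separated list.
	if extra := strings.Trim(strings.TrimSpace(extraOpts), ","); extra != "" {
		opts += "," + extra
	}
	return []string{"-t", "ceph", source, mountPoint, "-o", opts}
}

func main() {
	args := kernelMountArgs("10.0.0.1:6789:/volumes/csi/vol-1", "/mnt/stage",
		"admin", "/tmp/keyfile", "myfs", "readdir_max_bytes=1048576,norbytes")
	fmt.Println("mount", strings.Join(args, " "))
}
```

Because everything ends up in one `-o` value, an option the kernel client does not recognise makes the whole mount fail, which is what much of the discussion below is about.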

@@ -20,6 +20,10 @@ parameters:
# (optional) Ceph pool into which volume data shall be stored
# pool: cephfs_data

# (optional) Comma seperated string of Cephfs kernel or ceph fuse
# mount options. Check man mount.ceph for mount options. For eg:
# mountOptions: readdir_max_bytes=1048576,norbytes
Contributor

In line 41 there is another mountOptions parameter. What's that for?

Collaborator

That is for bind-mounting the staging path to the target path, which is used during node publish.

@@ -128,6 +128,10 @@ func mountFuse(mountPoint string, cr *util.Credentials, volOptions *volumeOption
		args = append(args, "--client_mds_namespace="+volOptions.FsName)
	}

	if volOptions.MountOptions != "" {
		args = append(args, volOptions.MountOptions)
Contributor

Wouldn't you have to append '-o'?

Collaborator

-o is already present at line 124

@@ -20,6 +20,10 @@ parameters:
# (optional) Ceph pool into which volume data shall be stored
# pool: cephfs_data

# (optional) Comma seperated string of Cephfs kernel or ceph fuse
Contributor

s/seperated/separated/

s/Cephfs kernel or ceph fuse mount options/CephFS kernel or FUSE mount options/

@@ -20,6 +20,10 @@ parameters:
# (optional) Ceph pool into which volume data shall be stored
# pool: cephfs_data

# (optional) Comma seperated string of Cephfs kernel or ceph fuse
# mount options. Check man mount.ceph for mount options. For eg:
Contributor

s/for mount options/for kernel mount options/

Contributor

@ajarr ajarr Aug 28, 2019

@humblec
Collaborator

humblec commented Aug 29, 2019

@poornimag is this PR still WIP? If not, you can remove it from the title.

@Madhu-1
Collaborator

Madhu-1 commented Aug 29, 2019

@poornimag is this tested?

@poornimag poornimag force-pushed the mount-options branch 2 times, most recently from 00bbfe4 to a4bcdfa Compare August 29, 2019 18:18
@humblec
Collaborator

humblec commented Aug 31, 2019

@poornimag can you please revisit this PR?

@humblec
Collaborator

humblec commented Sep 3, 2019

@poornimag looks like a spurious failure, restarted the CI. By the way, is this still WIP?

Contributor

@ShyamsundarR ShyamsundarR left a comment

Why are we adding another mountOptions field here? We should leverage req.GetVolumeCapability().GetMount().GetMountFlags() that is already passed down from the storage class to add/append the same to the mount request. This is what is used in rbd for instance.

Unless the options to the fuse CephFS command (or kernel mount) are different than regular mount options, there should be no need to add an extra storage class key here to provide the same.

Reference Kubernetes PR: https://github.com/kubernetes/kubernetes/pull/67898

Am I missing something here?

@Madhu-1
Collaborator

Madhu-1 commented Sep 4, 2019

@poornimag please rebase with latest changes

@Madhu-1
Collaborator

Madhu-1 commented Sep 4, 2019

@ShyamsundarR I think these options are specific to cephfs, not the filesystem mount options we use for mounting target path

@ShyamsundarR
Contributor

@ShyamsundarR I think these options are specific to cephfs, not the filesystem mount options we use for mounting target path

Different file systems will have different options, which are passed along using the mount options. In the case of CephFS it has its own mount options that may be different from those of other file systems in use, but the manner of passing these should be via the standard CSI protocol rather than creating a separate option key in the storage class.

Further the mount options are not specific to staging or target path, and should be used at an appropriate phase in the node operations.

In the case of RBD, the options used while mapping the image need to come from a separate option key, as these are not mount-time options; the options used when mounting the FS of choice on the device come from the mount options.

We should remove the newly added options, and use what is provided by the protocol in this situation.

@poornimag
Author

@ShyamsundarR I think these options are specific to cephfs, not the filesystem mount options we use for mounting target path

Different file systems will have different options, which are passed along using the mount options. In the case of CephFS it has its own mount options that may be different from those of other file systems in use, but the manner of passing these should be via the standard CSI protocol rather than creating a separate option key in the storage class.

Further the mount options are not specific to staging or target path, and should be used at an appropriate phase in the node operations.

In the case of RBD, the options used while mapping the image need to come from a separate option key, as these are not mount-time options; the options used when mounting the FS of choice on the device come from the mount options.

We should remove the newly added options, and use what is provided by the protocol in this situation.

So there are options like ro and noexec which make sense only for bindfs, and there are options specific to cephfs. How do we differentiate whether an option is for the bind mount or the ceph mount? Aren't the mount options in the SC meant for the publish step, as in bindfs?

@poornimag poornimag force-pushed the mount-options branch 4 times, most recently from cdb90da to 6e39f65 Compare September 4, 2019 15:10
@ShyamsundarR
Contributor

@ShyamsundarR I think these options are specific to cephfs, not the filesystem mount options we use for mounting target path

Different file systems will have different options, which are passed along using the mount options. In the case of CephFS it has its own mount options that may be different from those of other file systems in use, but the manner of passing these should be via the standard CSI protocol rather than creating a separate option key in the storage class.
Further the mount options are not specific to staging or target path, and should be used at an appropriate phase in the node operations.
In the case of RBD, the options used while mapping the image need to come from a separate option key, as these are not mount-time options; the options used when mounting the FS of choice on the device come from the mount options.
We should remove the newly added options, and use what is provided by the protocol in this situation.

So there are options like ro and noexec which make sense only for bindfs, and there are options specific to cephfs. How do we differentiate whether an option is for the bind mount or the ceph mount? Aren't the mount options in the SC meant for the publish step, as in bindfs?

The options specified are not bindfs specific but VFS options. The way forward looks to be:

  • Apply all mount options at stage (including ro derived from req.GetReadOnly, which is safer anyway)
  • Publish should only bind mount (hence no remount, as options do not change, for the bind mount)

The MountOptions field in the storage class predates CSI and, afaict, is meant to accept any mount option, whether fs specific, vfs specific, or otherwise.

Further, the current csi-cephfs code takes in all mount options from MountFlags and uses them during the bind mount; this works because options that are not recognized are dropped. This would need to change as well in this case.
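
The two bullets above could be sketched roughly as follows, assuming the node plugin receives the storage class mount flags as a string slice and a read-only flag from the request. The `mountPlan` type and `planMounts` function are hypothetical names used only for illustration; they are not part of ceph-csi or the CSI spec.

```go
package main

import "fmt"

// mountPlan holds the options each node operation would use.
// Hypothetical type for illustration only.
type mountPlan struct {
	StageOptions   []string // applied once, when the volume is staged
	PublishOptions []string // publish only bind-mounts the staged path
}

// planMounts applies every user-supplied flag at stage time, adds "ro"
// when the request is read-only, and leaves publish as a plain bind mount.
func planMounts(mountFlags []string, readOnly bool) mountPlan {
	stage := make([]string, 0, len(mountFlags)+1)
	hasRO := false
	for _, f := range mountFlags {
		if f == "ro" {
			hasRO = true
		}
		stage = append(stage, f)
	}
	if readOnly && !hasRO {
		stage = append(stage, "ro")
	}
	return mountPlan{StageOptions: stage, PublishOptions: []string{"bind"}}
}

func main() {
	p := planMounts([]string{"wsize=16777216"}, true)
	fmt.Println("stage:", p.StageOptions, "publish:", p.PublishOptions)
}
```

With this split, the bind mount never sees filesystem-specific options, so nothing has to be silently dropped at publish time.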

@humblec
Collaborator

humblec commented Sep 5, 2019

My suggestion would be to apply the mount options at staging and discard the possibility of applying them at the bind mount. One thing I am not sure about: does kcephfs have anything special about these mount options?

@ajarr
Contributor

ajarr commented Sep 5, 2019

One thing I am not sure about: does kcephfs have anything special about these mount options?

It should be OK to pass kcephfs mount options with -o. See https://docs.ceph.com/docs/master/man/8/mount.ceph/#options

@poornimag poornimag force-pushed the mount-options branch 5 times, most recently from 320d7c8 to 1eecce3 Compare September 5, 2019 13:05
@poornimag poornimag changed the title [WIP]Add mount option for Cephfs Add mount option for Cephfs Sep 5, 2019
@poornimag
Author

mount output from different pods:

From the application pod:
10.101.233.60:6789:/volumes/csi/csi-vol-3f907a88-d06a-11e9-84b5-0242ac11000f on /var/lib/www type ceph (rw,relatime,name=admin,secret=,dirstat,acl,mds_namespace=myfs,wsize=16777216)

From the Nodeplugin pod:
10.101.233.60:6789:/volumes/csi/csi-vol-3f907a88-d06a-11e9-84b5-0242ac11000f on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-48bd5014-1193-4f9b-990c-b745849b92a0/globalmount type ceph (rw,relatime,name=admin,secret=,dirstat,acl,mds_namespace=myfs,wsize=16777216)
10.101.233.60:6789:/volumes/csi/csi-vol-3f907a88-d06a-11e9-84b5-0242ac11000f on /var/lib/kubelet/pods/59100c04-9d64-4426-a184-c5858fc9504e/volumes/kubernetes.io~csi/pvc-48bd5014-1193-4f9b-990c-b745849b92a0/mount type ceph (rw,relatime,name=admin,secret=,dirstat,acl,mds_namespace=myfs,wsize=16777216)

humblec previously approved these changes Sep 6, 2019
@humblec
Collaborator

humblec commented Sep 6, 2019

LGTM. @Madhu-1 PTAL.

@@ -79,6 +79,7 @@ is used to define in which namespace you want the configmaps to be stored
| `fsName` | yes | CephFS filesystem name into which the volume shall be created |
| `mounter` | no | Mount method to be used for this volume. Available options are `kernel` for Ceph kernel client and `fuse` for Ceph FUSE driver. Defaults to "default mounter", see command line arguments. |
| `pool` | no | Ceph pool into which volume data shall be stored |
| `mountOptions` | no | Comma seperated string of mount options accepted by cephfs kernel and/or fuse mount. by default no options are passed. Check man mount.ceph for options. |
Collaborator

@poornimag do we still need this one?

@@ -197,8 +198,13 @@ func mountKernel(ctx context.Context, mountPoint string, cr *util.Credentials, v
	if volOptions.FsName != "" {
		optionsStr += fmt.Sprintf(",mds_namespace=%s", volOptions.FsName)
	}
	if volOptions.MountOptions != "" {
Collaborator

What about mount options for ceph-fuse?

@humblec
Collaborator

humblec commented Sep 6, 2019

@poornimag maybe I overlooked this, but whatever is in the default SC option (mountOptions) can be used for our purpose. We don't have to worry about whether it's for the bind mount, ceph-fuse, or kcephfs. Instead of adding a new parameter, my suggestion would be to use the default available parameter with our 'mounter'. If the mounter does not take that argument, we can fail; we could also add logic so that mount options are applied only when the selected mounter is kernel, and otherwise warn and discard.

Does it make sense?
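
A rough sketch of the "apply only for the kernel mounter, otherwise warn and discard" variant suggested above. The function `optionsForMounter` is a hypothetical name, and the mounter values `kernel` and `fuse` are the ones documented for the `mounter` storage class parameter.

```go
package main

import "fmt"

// optionsForMounter returns the options to apply for the selected mounter.
// Hypothetical helper: options are kept for the kernel mounter and
// discarded, with a warning message, for any other mounter.
func optionsForMounter(mounter string, mountOptions []string) (applied []string, warning string) {
	if len(mountOptions) == 0 {
		return nil, ""
	}
	if mounter == "kernel" {
		return mountOptions, ""
	}
	return nil, fmt.Sprintf("mounter %q: discarding mount options %v", mounter, mountOptions)
}

func main() {
	for _, m := range []string{"kernel", "fuse"} {
		applied, warning := optionsForMounter(m, []string{"dirstat"})
		fmt.Printf("mounter=%s applied=%v warning=%q\n", m, applied, warning)
	}
}
```

The trade-off is that a storage class author gets no hard failure when the plugin picks the other mounter, only a log line.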

@poornimag
Author

poornimag commented Sep 6, 2019

mount output from different pods:

From the application pod:
10.101.233.60:6789:/volumes/csi/csi-vol-3f907a88-d06a-11e9-84b5-0242ac11000f on /var/lib/www type ceph (rw,relatime,name=admin,secret=,dirstat,acl,mds_namespace=myfs,wsize=16777216)

From the Nodeplugin pod:
10.101.233.60:6789:/volumes/csi/csi-vol-3f907a88-d06a-11e9-84b5-0242ac11000f on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-48bd5014-1193-4f9b-990c-b745849b92a0/globalmount type ceph (rw,relatime,name=admin,secret=,dirstat,acl,mds_namespace=myfs,wsize=16777216)
10.101.233.60:6789:/volumes/csi/csi-vol-3f907a88-d06a-11e9-84b5-0242ac11000f on /var/lib/kubelet/pods/59100c04-9d64-4426-a184-c5858fc9504e/volumes/kubernetes.io~csi/pvc-48bd5014-1193-4f9b-990c-b745849b92a0/mount type ceph (rw,relatime,name=admin,secret=,dirstat,acl,mds_namespace=myfs,wsize=16777216)

@poornimag maybe I overlooked this, but whatever is in the default SC option (mountOptions) can be used for our purpose. We don't have to worry about whether it's for the bind mount, ceph-fuse, or kcephfs. Instead of adding a new parameter, my suggestion would be to use the default available parameter with our 'mounter'. If the mounter does not take that argument, we can fail; we could also add logic so that mount options are applied only when the selected mounter is kernel, and otherwise warn and discard.

Does it make sense?

The problem I see is that the options for ceph kernel, ceph-fuse and bindfs are different. How do we differentiate which options are for which client mount? E.g., consider the "debug" mount option: the kernel mount fails if I specify debug:
[root@Centos1 /]# mount -t ceph 10.101.233.60:6789:/ /mnt/myfs-kernel/ -o debug,name=admin,secretfile=/etc/ceph/admin
mount error 22 = Invalid argument
[root@Centos1 /]#

But ceph-fuse succeeds with the debug option, and so does bindfs:
[root@Centos1 /]# ceph-fuse /mnt/myfs-fuse/ -o debug
ceph-fuse[2271]: starting ceph client2019-09-06 08:07:24.881 7f7646f69e00 -1 init, newargv = 0x55d18355d960 newargc=9
FUSE library version: 2.9.2
ceph-fuse[2271]: starting fuse
unique: 1, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
INIT: 7.22
flags=0x0000f7fb
max_readahead=0x00020000
[root@Centos1 /]#

Now if I use the mount options in the SC and debug is specified, and the kernel is >=4.17, then the mount itself fails and hence binding fails.

Also, ceph-fuse mount options [1] and kernel mount options [2] are different. I do not understand why this is the case; maybe @ajarr can clarify this. E.g., dirstat is not recognized by the fuse client:
[root@Centos1 /]# ceph-fuse /mnt/myfs-fuse/ -o debug,dirstat
ceph-fuse[2370]: starting ceph client2019-09-06 08:10:58.104 7ff267735e00 -1 init, newargv = 0x55aa4540f970 newargc=9
fuse: unknown option `dirstat'
ceph-fuse[2370]: fuse failed to start
2019-09-06 08:10:58.116 7ff267735e00 -1 fuse_lowlevel_new failed
[root@Centos1 /]#

For the reasons cited above, I have introduced another mount option parameter, and that parameter is passed only for the kernel mount (not the fuse mount, as there are not many options for ceph-fuse).

[1] https://docs.ceph.com/docs/mimic/man/8/ceph-fuse/
[2] https://docs.ceph.com/docs/giant/man/8/mount.ceph/

@Madhu-1 Madhu-1 added the Priority-0 highest priority issue label Sep 6, 2019
@Madhu-1
Collaborator

Madhu-1 commented Sep 6, 2019

@poornimag we should not worry about which mount options the user specifies; we can't validate them anyway. If the mount fails, I believe it is the user's responsibility to provide proper mount options.

@Madhu-1
Collaborator

Madhu-1 commented Sep 6, 2019

@humblec @ajarr please review this PR on top priority; this is a blocker for the 1.2.0 release.

@humblec
Collaborator

humblec commented Sep 6, 2019

@poornimag we should not worry about which mount options the user specifies; we can't validate them anyway. If the mount fails, I believe it is the user's responsibility to provide proper mount options.

Sure, but one interesting fact here is that the mounter is selected by the plugin based on the kernel version etc., so the user cannot decide upfront what to supply. Documentation can help to an extent. Another choice would be to specify the values of mountOptions with a prefix or tag.
With that, we can use the same param for all the cases.

For ex:

mountOptions: ceph-fuse:debug, kcephfs:something, etc.

Then we can parse it and use it accordingly. Maybe not optimal, but we can live with it.

Any thoughts?

But @poornimag I completely understand the difficulty of having a general solution here! Just trying to pave the way somehow :). A separate flag for kernel, which this PR adds, is also a good solution, so I am fine with this format too.
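
For reference, parsing the prefixed form proposed above might look like the sketch below. It assumes each comma-separated entry has the shape `tag:option` and rejects entries without a tag. `parseTaggedOptions` is a hypothetical function, and the tags `ceph-fuse` and `kcephfs` are just the ones used in the example in this comment.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTaggedOptions splits "tag:opt, tag:opt" into per-tag option lists.
// Hypothetical helper for illustration; entries without a tag are an error.
func parseTaggedOptions(s string) (map[string][]string, error) {
	out := make(map[string][]string)
	for _, item := range strings.Split(s, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		parts := strings.SplitN(item, ":", 2)
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("mount option %q has no mounter tag", item)
		}
		out[parts[0]] = append(out[parts[0]], parts[1])
	}
	return out, nil
}

func main() {
	opts, err := parseTaggedOptions("ceph-fuse:debug, kcephfs:dirstat, kcephfs:norbytes")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("kernel:", strings.Join(opts["kcephfs"], ","))
	fmt.Println("fuse:", strings.Join(opts["ceph-fuse"], ","))
}
```

One cost of this scheme is that the generic mountOptions field would carry values that no standard mount tool understands until they have been parsed.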

@Madhu-1
Collaborator

Madhu-1 commented Sep 6, 2019

@poornimag we should not worry about which mount options the user specifies; we can't validate them anyway. If the mount fails, I believe it is the user's responsibility to provide proper mount options.

Sure, but one interesting fact here is that the mounter is selected by the plugin based on the kernel version etc., so the user cannot decide upfront what to supply. Documentation can help to an extent. Another choice would be to specify the mountOptions with a prefix or tag.

For ex:

ceph-fuse:debug, kcephfs:something, etc.

Then we can parse it and use it accordingly. Maybe not optimal, but we can live with it.

Any thoughts?

But @poornimag I completely understand the difficulty of having a general solution here! Just trying to pave the way somehow :). A separate flag for kernel, which this PR adds, is also a good solution, so I am fine with this format too.

I am fine with documentation, as the user knows the kernel version of the nodes where cephcsi is deployed.
We don't need to prefix anything in mountOptions; we can use the mount options directly during mounting. This keeps the implementation and maintenance simple.

@humblec
Collaborator

humblec commented Sep 6, 2019

I would like to summarise the outcome of a quick discussion ( @Madhu-1 , @poornimag ..) on this 👍

The proposal is to have 2 different parameters in SC for kernel and fuse mounters.

kernelMountOptions for kcephfs
fuseMountOptions for cephfs-fuse

This is based on the possibilities below.

*) Different mounters can be selected by the plugin for PVCs created from same SC
*) SC admin would/can be different from the Cluster Admin , so difficult to decide on the mounter and options upfront.
*) cephfs-fuse and kcephfs options are not in parity.
*) More options can be added to each mounter in future.
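
A minimal sketch of how the two proposed parameters could be consumed, assuming the storage class parameters arrive as a string map and the plugin has already decided on a mounter. The function `mountOptionsFor` is hypothetical; only the key names `kernelMountOptions` and `fuseMountOptions` and the mounter values `kernel` and `fuse` come from this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// mountOptionsFor picks the option string that matches the mounter the
// plugin selected. Hypothetical helper; not the ceph-csi implementation.
func mountOptionsFor(mounter string, params map[string]string) (string, error) {
	switch mounter {
	case "kernel":
		return strings.TrimSpace(params["kernelMountOptions"]), nil
	case "fuse":
		return strings.TrimSpace(params["fuseMountOptions"]), nil
	default:
		return "", fmt.Errorf("unknown mounter %q", mounter)
	}
}

func main() {
	params := map[string]string{
		"kernelMountOptions": "readdir_max_bytes=1048576,norbytes",
		"fuseMountOptions":   "debug",
	}
	for _, m := range []string{"kernel", "fuse"} {
		opts, err := mountOptionsFor(m, params)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: -o %s\n", m, opts)
	}
}
```

Keeping the keys separate means a storage class can carry valid options for both mounters, whichever one the plugin ends up choosing on a given node.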

@mergify mergify bot dismissed humblec’s stale review September 6, 2019 10:24

Pull request has been modified.

docs/deploy-cephfs.md (outdated)
# (optional) Comma seperated string of Cephfs kernel mount options.
# Check man mount.ceph for mount options. For eg:
# kernelMountOptions: readdir_max_bytes=1048576,norbytes

Contributor

Also document what mountOptions are. It's confusing. It's only for bindfs?

Author

I didn't get that. Do you mean I need to define what mount options are? "Mount options" is quite a generic term; it is also present in the storage class spec.

Contributor

@ajarr ajarr Sep 6, 2019

I meant in line 45 there is another mountOptions. What's that for?

The storage class already takes MountOptions(MountFlags), these are the
bind mount options. Some of these options may not be recognised by the
cephfs mount. Hence added new parameters in the Storage Class for
- cephfs kernel mount options,
- ceph-fuse mount options

Ceph kernel mount options are different from ceph-fuse options, hence
added two different parameters.

Signed-off-by: Poornima G <[email protected]>
@ShyamsundarR
Contributor

I would like to summarise the outcome of a quick discussion ( @Madhu-1 , @poornimag ..) on this +1

The proposal is to have 2 different parameters in SC for kernel and fuse mounters.

kernelMountOptions for kcephfs
fuseMountOptions for cephfs-fuse

This is based on the possibilities below.

*) Different mounters can be selected by the plugin for PVCs created from same SC
*) SC admin would/can be different from the Cluster Admin , so difficult to decide on the mounter and options upfront.
*) cephfs-fuse and kcephfs options are not in parity.
*) More options can be added to each mounter in future.

I think we are complicating this, failing to mount because of invalid mount options is acceptable. Like in the discussion https://github.com/kubernetes/kubernetes/pull/67898#pullrequestreview-149763072.

Because we do not have a stricter mounter default, allowing that to appear in mount options in addition seems undesirable.

Once we add extra keys/options in the storage class, removing them later would not be easy as this is deployed and used in multiple places.

I strongly suggest we stick to the default mount options parameters and report failures if the mounter does not support them for the sysadmin to fix up the storage class or the mounter selection.

@humblec
Collaborator

humblec commented Sep 6, 2019

I think we are complicating this, failing to mount because of invalid mount options is acceptable. Like in the discussion kubernetes/kubernetes#67898 (review).

That discussion is different: it is about wrong input from the user. Here the story is different. As mentioned in earlier comments, the mounter may be picked by the plugin, and the mount options are not in parity between mounters. An option that is correct for one mounter can be wrong for another.

Because we do not have a stricter mounter default, allowing that to appear in mount options in addition seems undesirable.

I think that's where it adds flexibility.

Once we add extra keys/options in the storage class, removing them later would not be easy as this is deployed and used in multiple places.

If really required, we can deprecate them eventually over a couple of releases. That's fine. It can also be done by filling mountOptions internally from the code side, based on any of the newly added flag values etc.

I strongly suggest we stick to the default mount options parameters and report failures if the mounter does not support them for the sysadmin to fix up the storage class or the mounter selection.

Again, the mounter can be picked by the plugin, and it could be different for PVCs from the same SC based on the cluster node configuration etc., so we cannot really honour the admin's intent without failures.

With all of above thoughts, I am fine with current approach.

Approving this PR.

@Madhu-1
Collaborator

Madhu-1 commented Sep 6, 2019

Agreed with humble

Collaborator

@Madhu-1 Madhu-1 left a comment

LGTM

@mergify mergify bot merged commit 060ff8d into ceph:master Sep 6, 2019
Labels
Priority-0 highest priority issue release-1.2.0