Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datasets are not shared reliably #2883

Closed
dswartz opened this issue Nov 10, 2014 · 16 comments
Closed

datasets are not shared reliably #2883

dswartz opened this issue Nov 10, 2014 · 16 comments
Labels
Component: Share "zfs share" feature

Comments

@dswartz
Copy link
Contributor

dswartz commented Nov 10, 2014

I've been chasing an extremely annoying bug, where I reboot a CentOS7 server, and the dataset that is supposed to be shared via NFS is not. I finally figured it out. systemd does 'zfs share -a', which (eventually) ends up calling is_shared(), which scans the sharetab (/etc/dfs/sharetab), looking for the dataset in question. Here is the problem: apparently on shutdown, systemd is not unsharing the datasets, so test/foo is still in the sharetab, so 'zfs share -a' thinks nothing needs to be done. Now, you can say this is a bug in systemd, but consider what happens if you have a crash (so systemd never got the chance to unshare the dataset). It will still be in the sharetab, and therefore won't be shared on reboot. Maybe the sharetab should be getting blown away early in the game?

@behlendorf
Copy link
Contributor

@dswartz Since ZFS is managing the sharetab I think your suggestion is pretty reasonable. Having systemd/init blow it away prior to the share step on startup and after the unshare on shutdown.

@dswartz
Copy link
Contributor Author

dswartz commented Nov 12, 2014

I didn't notice I had opened #2792 awhile back. Closed that as a duplicate.

@FransUrbo
Copy link
Contributor

@behlendorf How about having zpool import clearing the sharetab instead? Makes more sense that "we" clear up our own "mess" instead of trusting "someone else" to do it...

@dswartz
Copy link
Contributor Author

dswartz commented Nov 13, 2014

@behlendorf How about having zpool import clearing the
sharetab instead? Makes more sense that "we" clear up our own "mess"
instead of trusting "someone else" to do it...

That's an interesting idea. Let me look at what's involved...

@behlendorf
Copy link
Contributor

@FransUrbo that's a great idea. At import time we can be absolutely certain that none of the datasets are shared so they all all be safely removed from the sharetab. We'd just need to walk all the datasets after import and remove those.

@dswartz
Copy link
Contributor Author

dswartz commented Nov 13, 2014

@FransUrbo that's a great idea. At import time we can be absolutely
certain that none of the datasets are shared so they all all be safely
removed from the sharetab. We'd just need to walk all the datasets after
import and remove those.

I'm digging through the code now to figure out the cleanest way to do
this.. Amusingly, cleanup of stale sharetab entries happens anytime
sa_fini() is called, which is called by zfs_uninit_libshare(). So, if you
created a new dataset and shared it, and printed out the contents of the
sharetab, the leftover entries have disappeared. This makes complete
sense of the behavior I saw where setting sharenfs=off and then back on
suddenly 'fixed' everything...

@FransUrbo
Copy link
Contributor

We'd just need to walk all the datasets after import and remove those.

Just remove all datasets that starts with the pool we just imported. Should be simple.

It might get a little tricker if/when we get to a iscsi device:

/dev/zvol/share/VirtualMachines/Negotia/Swap - iscsi name=iqn.2012-11.com.bayour:share.virtualmachines.negotia.swap,initiator=iqn.1993-08.org.debian:01:e19b61b8377;iqn.2009-08.com.sun.virtualbox.initiator:01:192.168.69.3,blocksize=512

Can we be absolutly sure that any device '/dev/zvol/share/...' belong to the 'share' pool (it is in my case, but…)?=

@dswartz
Copy link
Contributor Author

dswartz commented Nov 21, 2014

Okay, I've spent a fair amount of time groveling through all of this. My objection to changing 'zpool import' or whatever is that it complicates the code quite a bit. It's already tricky how the sharetab is updated when shares are added/removed. We already know that the contents of /etc/dfs/sharetab are useless (stale) on reboot, and there is no point in deleting the sharetab on shutdown, since we can't trust a crash that would leave the stale sharetab around. A trivial change I just tested: add a file 'sharetab.conf' in /usr/lib/tmpfiles.d that does this:

# Remove /etc/dfs/sharetab on reboot
r /etc/dfs/sharetab

It's certainly possible that I'm missing something, but the above has the virtue of being trivially simple, while also being semantically correct. We are 100% sure that the contents of the sharetab are useless on reboot.

@FransUrbo
Copy link
Contributor

Okay, I've spent a fair amount of time groveling through all of this. My objection to changing 'zpool import' or whatever is that it complicates the code quite a bit.

In Debian GNU/Linux we use a /etc/default/zfs file where the user/admin can specify wether to use /dev/disk/by-id, do mounting, sharing etc, etc.

I just added the following line at the end:

[ -f /etc/dfs/sharetab ] && rm /etc/dfs/sharetab

Since the init script sources this

. /etc/default/zfs

this is 'good enough'.

Does any of the other dists use something similar?

@dswartz
Copy link
Contributor Author

dswartz commented Nov 24, 2014

I'm curious. Who/what generates the /etc/default/zfs file for ubuntu/debian? Something specific to those distros? I'm not finding anything in the zfs code itself (closest I can find is the stuff that creates the /etc/init.d/zfs init script for lsb?) Is this a request we should forward to whoever maintains those packages? It's Darik for ubuntu, right? Who for debian?

@dajhorn
Copy link
Contributor

dajhorn commented Nov 24, 2014

Who/what generates the /etc/default/zfs file for ubuntu/debian?

This file is usually generated from a template in the debian/ overlay by the dh_installinit utility according to system policy.

Something specific to those distros?

Yes, although most DEB distributions handle it similarly according to the LSB.

It's Darik for ubuntu, right? Who for debian?

@FransUrbo

@dswartz
Copy link
Contributor Author

dswartz commented Nov 24, 2014

Oh, right. Do you guys think this is a reasonable approach for debian&ubuntu then? It can't really be addressed by any PR I open...

@FransUrbo
Copy link
Contributor

Have a look at #2106. These are the scripts etc from the Debian GNU/Linux packages. Hopefully, we could get rid of the crap that's in the repo now. Crap because they're way out of date, they are unmaintainable and dist specific.

@dswartz
Copy link
Contributor Author

dswartz commented Dec 5, 2014

The tmpfiles.d idea turned out to not work reliably. Sometimes it did, sometimes not. I suspect some kind of timing issue. I went back and looked at zfs-share.service, and the easiest fix seems to be much simpler and cleaner. To wit: adding this line:
ExecStartPre=-@bindir@/rm /etc/dfs/sharetab

I understand this doesn't help anyone not using systemd, but it does seem to resolve this for those folks, so unless someone objects strongly, I will open a PR for this.

@dswartz
Copy link
Contributor Author

dswartz commented Dec 7, 2014

Opened PR #2950.

@dswartz
Copy link
Contributor Author

dswartz commented Dec 15, 2014

I would like to get this change submitted, unless someone objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Share "zfs share" feature
Projects
None yet
Development

No branches or pull requests

4 participants