Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complete rewrite of init script(s) - attempt 3 #3337

Closed
wants to merge 1 commit into from

Conversation

FransUrbo
Copy link
Contributor

  • Base scripts on the init scripts included with Debian GNU/Linux
    then take code from the already existing ones, trying to merge
    them into one.
    • NFS server service is just called 'nfs' in Redhat based system, and
      nfs-kernel-server on Debian GNU/Linux based.
    • Merge Inform OpenRC that ZFS uses mtab #2148 - Inform OpenRC that ZFS uses mtab.
    • Stop after umountfs
    • When stopping, don't check for LOCKDIR/zfs-mount - it won't be mounted,
      so the file won't exist making the script not run.
    • Add a configurable ZFS_INITRD_POST_MODPROBE_SLEEP used in the initrd to sleep after the modprobe.
    • The import function, do_import(), imports pools by name instead of '-a' [all].
      • Test all '/dev/disk/by-*' dirs for import. Include /dev as a last ditch attempt.
    • Fallback on importing the pool using the cache file (if it exists) if the
      'by-id' didn't work.
    • Add exceptions to pool imports.
  • ZED script from the Debian GNU/Linux packages added.

Signed-off-by: Turbo Fredriksson [email protected]
Closes: #2974, #2107.

@behlendorf
Copy link
Contributor

OK, that's my first round of comments. I'd say the biggest thing would be to split it in to .lsb (or .debian if you like) and .redhat scripts. I completely agree that a single set of scripts would be ideal but I think that would probably be less maintainable in the long term. Plus if we agree up front to only these two versions I think that's reasonable. So the following set of scripts would make sense to me.

debian script redhat script Description
zfs-common.lsb.in zfs-common.redhat.in Common shared ZFS support functions
zfs-import.lsb.in zfs-import.redhat.in Import the ZFS pool
zfs-mount.lsb.in zfs-mount.redhat.in Mount the ZFS managed filesystems (non-legacy)
zfs-share.lsb.in zfs-share.redhat.in Share the ZFS managed filesystem
zed.lsb.in zed.redhat.in Start the ZFS Event Daemon (ZED) daemon

@FransUrbo Does this sound reasonable to you? I'm sure your more familiar with many of nitty-gritty details here than I am. But based on past experience with this sort of thing I think this is the most maintainable approach going forward.

NOTE: For consistency I keep wondering if the zed script shouldn't be called zfs-zed.
NOTE: I'm happy to create and test the .redhat.in versions of these scripts based on your lsb version.

@FransUrbo
Copy link
Contributor Author

@FransUrbo Does this sound reasonable to you?

In short: No. Longer answer: No, f**k no! :).

The five scripts we have now is in such … well, let's use plain english - crappy state - as it is. They're not maintained, they all [now] have different functionality and some bug fixes is only applied in ONE of them (which means the bug is still in the other four scripts).

So if you want separate scripts like that, you need to start by making sure it's maintained! By whoever wrote them and is 'responsible' for them. And that that/those people know about what the other(s) are doing to THEIR script. This isn't any where near happening now...

WIth this PR, we have ONE script that imports and mounts the fs (although I agree that splitting those two functions into two scripts 'zfs-import' and 'zfs-mount', much like the systemd scripts/config files).

And yes, it MIGHT require us to test the script in more than one place, but then we know that if we find an issue in the script, that issue is then fixed for EVERYONE. Not just a few…

PORTABILITY is the keyword here. Not simplicity!

NOTE: For consistency I keep wondering if the zed script shouldn't be called zfs-zed.

Sounds very reasonable.

@behlendorf
Copy link
Contributor

OK. Well how about this. Go ahead and refresh this with markups we've discussed and keep it as just the one version. Then when you're ready I'll give it another spin under CentOS 6 and see what issues I encounter in practice and have to work though. If it turns out in practice I've been making a mountain out of the molehill and the discrepancies are minor we'll keep it as one.

FWIW if that is possible that's my preference as well for all the reasons you've mentioned. I'm just a bit concerned it won't be.

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch 2 times, most recently from fa6524b to 29ad8f6 Compare April 29, 2015 23:16
@FransUrbo
Copy link
Contributor Author

[I replied to a code comment, so put this back here with the PR]

refresh this with markups we've discussed and keep it as just the one version.

Done. But I split up zfs-mount into two: zfs-import and zfs-mount. I still need to verify if that works in practice, but we have something to go forward with with this.

I also renamed the etc/init.d/zfs-{import,mount,share}.in files (removed the .in suffix). It wasn't needed - all path config is in the zfs-common.in script and is the only one that actually needs to be sed'd.

I also took another look at the zfs-zed script. It's VERY Debian GNU/Linux centric at the moment. I've missed to fix this, so we need to run that through a couple of times to.

If it's possible that's may preference as well, I'm just a bit concerned it won't be.

Looking at the scripts and the function, there should be very little need to tweak the current code. It's quite generic. But additional functions for special corner cases for certain (as yet unnamed) distributions might be needed [eventually].

The only place I recon we're going to need to tweak is the 'logging stuff'. I know it's crude, but in the end I think it's ok - it's "out of sight" in /etc/zfs/common.init (or whatever we name it).

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch 11 times, most recently from 2b24a71 to a572f98 Compare April 30, 2015 19:01
@FransUrbo
Copy link
Contributor Author

@behlendorf I've been tweaking this all day (and think I'm quite happy with everything now), so make sure you're using the right/latest commit.

@behlendorf
Copy link
Contributor

@FransUrbo things are definitely moving in the right direction. I've made some additional comments on the version from a few hours ago. The big remaining thing we need to work though is the logging. I've suggested one possible way to tackle this in the issue to get your feedback. But the current logging doesn't work at all under Redhat. There are a few other minor sharp corners I hit I commented on as well.

@FransUrbo
Copy link
Contributor Author

I've suggested one possible way to tackle this in the issue to get your feedback.

Looks good. I'll see if I can work something out that works for me and is as generic as I can make it, and I'll let you know when you can test/tweak it for your systems...
But the current logging doesn't work at all under Redhat.

Weird. I got it to work just fine on Fedora "something" like that… Ah, well. Your suggestion is probably better/cleaner anyway. I'll let you know.

@behlendorf
Copy link
Contributor

Great, just drop me a comment when you're happy with the next version. If you're setup to run virtual machines you could install CentOS 6 to test with.

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch 2 times, most recently from 95c0812 to c9a1586 Compare May 1, 2015 14:29
@FransUrbo
Copy link
Contributor Author

@behlendorf Ok, I've done some modifications and tested on CentOS 6.0 and 6.6 and they now work great there (in addition to Debian GNU/Linux Wheezy and Jessie). I'm currently in the process of testing it on Fedora 17, 18 and 19. (don't know exactly when they went all-in on systemd, but Fedora 21 is infested with systemd so no point in testing it there).

@behlendorf
Copy link
Contributor

I'm all for additional testing but there's probably not much point in testing on Fedora. They've been using systemd since at least FC17. CentOS6 is the other major platform of interest so I'm glad you sorted that out. I should be able to check this out this weekend.

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch from c9a1586 to 0f0a36e Compare May 1, 2015 16:42

# bootmisc will log to /var which may be a different zfs than root.
before bootmisc logger
use mtab
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use mtab is only needed for zfs-mount. It should be there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanx. Fixing.

@ryao
Copy link
Contributor

ryao commented May 27, 2015

I haven't been able to spare enough time to fully digest this, but I think that my latest round of comments tackle the remaining technical concerns that I can spot without doing a more careful read. After they are addressed, merging this should be fine. If I find anything else after it is merged, I will open a pull request with a fix.

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch from 56b1163 to c2bcba0 Compare May 27, 2015 19:34
@behlendorf
Copy link
Contributor

@ryao thanks for looking this over. Yes, I agree. I'm sure there will be need of a little more adjusting after it's merged but we can do that as needed.

@ryao
Copy link
Contributor

ryao commented May 27, 2015

An earlier verson of my comments suggsted using the need kewyord for OpenRC dependencies. I have edited my comments to suggest using the after keyword. Doing need is a double edged sword because if zfs-mount has a need dependency on procfs and procfs is restarted, zfs-mount will be restarted too, which is rather bad for systems with a ZFS rootfs.

@FransUrbo
Copy link
Contributor Author

@behlendorf @ryao Thanx for the comments. I've pushed a new version just now.

@ryao
Copy link
Contributor

ryao commented May 27, 2015

Just a FYI, the documentation on OpenRC dependency semantics can be found on the Gentoo wiki:

https://wiki.gentoo.org/wiki/Handbook:X86/Working/Initscripts#Dependencies

@FransUrbo
Copy link
Contributor Author

@ryao Do you want me to change the need to after globally? It's really your call (it looks/sounds reasonable though), because I don't enough experience in Gentoo to be able to make that.

@ryao
Copy link
Contributor

ryao commented May 27, 2015

@FransUrbo We should use after. If a user tries to restart one of the scripts we specify by need, they will likely trigger unexpected service restarts, which would be very bad for filesystem services. My apologies for providing the wrong keyword earlier.

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch from c2bcba0 to 2040d94 Compare May 27, 2015 19:44
@FransUrbo
Copy link
Contributor Author

@ryao @behlendorf Ok, I just pushed yet another version with the need to after changed.

@ryao
Copy link
Contributor

ryao commented May 27, 2015

@FransUrbo Thanks.

rc-update add zfs-zed sysinit
rc-update add zfs-import sysinit
rc-update add zfs-mount boot
rc-update add zfs-share default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thought that is nagging me. I suspect that sysinit should be boot for consistency with Gentoo's LVM2 OpenRC script and basically everything other non-base system OpenRC script in Gentoo. However, it would be best to discuss these runlevel choices with @williamh in #openrc on freenode. He is the OpenRC maintainer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change it to boot. It makes more sense anyway.

The manual for rc-update mentioned sysinit, but not the binary it uses (much like Debian GNU/Linux update-rc.d, rc-update seems to be a wrapper script for a binary and the man page for that didn't mention sysinit and manually using it didn't work).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenRC does away with numbered runlevels of the sysvrc scripts in favor of named run levels with actual dependency calculations. Consequently, it is much more forgiving than sysvrc in terms of choices and while we probably could get away with just 1 runlevel, we have 4. Specifically, sysinit, boot, default and shutdown.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should add that there is no sysinit binary for you to run. The /etc/inittab file is configured to start OpenRC, virtual terminals, serial lines and X (although there is a hack where an OpenRC script needs to kick it). Consequently, the services are started by scripts executed by sysvinit. The time intensive parts like dependency calculations are handled by binaries that were written in C. That said, it is possible to send signals to init via /sbin/init, but it is usually not needed.

@ryao
Copy link
Contributor

ryao commented May 27, 2015

I think I have dependency thoughts out of my system now. I can't think of anything else in that area.

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch from 2040d94 to ba34dd9 Compare May 27, 2015 20:00
@FransUrbo
Copy link
Contributor Author

@ryao @behlendorf Ok, another push with the README.md update.

@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch from ba34dd9 to 0ff1cd4 Compare May 28, 2015 10:57
@FransUrbo
Copy link
Contributor Author

@behlendorf Ok, the minor issues fixed in new push.

* Based on the init scripts included with Debian GNU/Linux, then take code
  from the already existing ones, trying to merge them into one set of
  scripts that will work for 'everyone' for better maintainability.
  * Add configurable variables to control the workings of the init scripts:
    * ZFS_INITRD_PRE_MOUNTROOT_SLEEP
      Set a sleep time before we load the module (used primarily by initrd
      scripts to allow for slower media (such as  USB devices etc) to be
      availible before we load the zfs module).
    * ZFS_INITRD_POST_MODPROBE_SLEEP
      Set a timed sleep in the initrd to after the load of the zfs module.
    * ZFS_INITRD_ADDITIONAL_DATASETS
      To allow for mounting additional datasets in the initrd. Primarily used
      in initrd scripts to allow for when filesystem needed to boot (such as
      /usr, /opt, /var etc) isn't directly under the root dataset.
    * ZFS_POOL_EXCEPTIONS
      Exclude pools from being imported (in the initrd and/or init scripts).
    * ZFS_DKMS_ENABLE_DEBUG, ZFS_DKMS_ENABLE_DEBUG_DMU_TX, ZFS_DKMS_DISABLE_STRIP
      Set to control how dkms should build the dkms packages.
    * ZPOOL_IMPORT_PATH
      Set path(s) where "zpool import" should import pools from.
      This was previously the job of "USE_DISK_BY_ID" (which is still used
      for backwards compatibility) but was renamed to allow for better
      control of import path(s).
      * If old USE_DISK_BY_ID is set, but not new ZPOOL_IMPORT_PATH, then we
        set ZPOOL_IMPORT_PATH to sane defaults just to be on the safe side.
    * ZED_ARGS
      To allow for local options to zed without having to change the init script.
  * The import function, do_import(), imports pools by name instead of '-a'
    for better control of pools to import and from where.
    * If USE_DISK_BY_ID is set (for backwards compatibility), but isn't 'yes'
      then ignore it.
    * If pool(s) isn't found with a simple "zpool import" (seen it happen),
      try looking for them in /dev/disk/by-id (if it exists). Any duplicates
      (pools found with both commands) is filtered out.
      * IF we have found extra pool(s) this way, we must force USE_DISK_BY_ID
        so that the first, simple "zpool import $pool" is able to find it.
    * Fallback on importing the pool using the cache file (if it exists) only
      if 'simple' import (either with ZPOOL_IMPORT_PATH or the 'built in'
      defaults) didn't work.
  * The export function, do_export(), will export all pools imported, EXCEPT
    the root pool (if there is one).
  * ZED script from the Debian GNU/Linux packages added.
    * Refreshed ZED init script from behlendorf/zfs@5e7a660 to be portable so it
      may be used on both LSB and Redhat style systems.
    * If there is no pool(s) imported and zed successfully shut down, we will
      unload the zfs modules.
  * The function library file for the ZoL init script is installed as
    /etc/init.d/zfs-functions.
  * The four init scripts, the /etc/{defaults,sysconfig,conf.d}/zfs config file
    as well as the common function library is tagged as '%config(noreplace)' in
    the rpm rules file to make sure they are not replaced automatically if locally
    modifed.
  * Pitfals and workarounds:
    * If we're running from init, remove stale /etc/dfs/sharetab before importing
      pools in the zfs-import init script.
    * On Debian GNU/Linux, there's a 'sendsigs' script that will kill basically
      everything quite early in the shutdown phase and zed is/should be stopped
      much later than that. We don't want zed to be among the ones killed, so add
      the zed pid to list of pids for 'sendsigs' to ignore.
    * CentOS uses echo_success() and echo_failure() to print out status of
      command. These in turn uses "echo -n \0xx[etc]" to move cursor and choose
      colour etc. This doesn't work with the modified IFS variable we need to
      use in zfs-import for some reason, so work around that when we define
      zfs_log_{end,failure}_msg() for RedHat and derivative distributions.
  * All scripts passes ShellCheck (with one false positive in do_mount()).

Signed-off-by: Turbo Fredriksson [email protected]
Reviewed by: Richard Yao <[email protected]>
Reviewed by: Chris Dunlap <[email protected]>

Closes: openzfs#2974, openzfs#2107.
@FransUrbo FransUrbo force-pushed the fixes.init_total_rewrite branch from 0ff1cd4 to 768ba60 Compare May 28, 2015 13:50
@behlendorf
Copy link
Contributor

@FransUrbo thanks. OK, I think the existing scripts can be merged. They've pass my basic functional testing on CentOS 6 and if @FransUrbo and @ryao are happy with them I'll merge it.
I'm sure we'll find some minor issues here and there but we can tackle them as they get exposed.

@FransUrbo
Copy link
Contributor Author

I'm very happy with them at this point. And quite sic of them :). Considering how much we've tweaked them, I doubt there's much if anything left to do :D.

@behlendorf
Copy link
Contributor

Merged as:

2a34db1 Base init scripts for SYSV systems

@behlendorf behlendorf closed this May 28, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

zed daemon has no init.d script
6 participants