Workaround to avoid a race when /var/lib is a persistent dataset #9360
If /var/lib is a dataset not under <pool>/ROOT/<root_dataset>, as proposed in the upstream Ubuntu Root on ZFS guide (https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS), we end up with a race: some services, such as systemd-random-seed, write under /var/lib while zfs-mount is being called. zfs mount may then fail because /var/lib isn't empty and so can't be mounted over.
Order those two units for now (more may be needed), as we can't virtually declare a provided mount point to match "RequiresMountsFor=/var/lib/systemd/random-seed" from systemd-random-seed.service.
The optional generator for zfs 0.8 fixes it, but it's not enabled by default nor necessarily required.
Example:
- rpool/ROOT/ubuntu (mountpoint = /)
- rpool/var/ (mountpoint = /var)
- rpool/var/lib (mountpoint = /var/lib)
Both zfs-mount.service and systemd-random-seed.service start After=systemd-remount-fs.service. zfs-mount.service should be done before local-fs.target, while systemd-random-seed.service should finish before sysinit.target (which is a later target).
Ideally, we would have a way for the zfs mount -a unit to declare all paths, or we would move systemd-random-seed after local-fs.target.
Signed-off-by: Didier Roche <[email protected]>
It is a bad idea to move /var/lib to a separate dataset: it contains a lot of important data, such as dpkg info (just as an example, for dpkg- and APT-based distributions), and it should stay in sync with the distribution version.
@ikozhukhov: you are right that moving all of /var/lib into a separate dataset is a bad idea on its own. I hope this helps shed some light :)
Why can't you do
Edit: I see that you said, "The optional generator for zfs 0.8 fixes it, but it's not enabled by default nor necessarily required." While the mount generator is not required in general, if you're doing root-on-ZFS, you really should enable it (once on ZFS 0.8), and e.g. the Buster HOWTO has steps for that. Hopefully, you're enabling that in the installer for 19.10. For 18.04, that's why I have the work-around of mounting certain things in
The change from this PR isn't particularly harmful. We wouldn't want to try to add too many of those orderings, though, or it would get to be a mess.
@rlaager: Some of our users won't use the generator, as it's optional, and so may end up with a system that is broken on boot.
My comment edit may have crossed with your comment (and I'm not sure if edits send emails).
Right, if they are using the mount generator, this would work, as the generated .mount unit will have it.
@didrocks if you look at this point on the recommended page: it has a mistake and doesn't use separate BEs (Boot Environments) to allow booting into different system versions. It is a product design issue and should be fixed, but it is up to the distribution how they want to manage it.
Codecov Report
@@ Coverage Diff @@
## master #9360 +/- ##
==========================================
+ Coverage 79.06% 79.3% +0.24%
==========================================
Files 401 401
Lines 122495 122495
==========================================
+ Hits 96845 97144 +299
+ Misses 25650 25351 -299
In the long run, doing it this way will certainly become a nightmare (and we should try to eventually move to the new generator framework). But we're not there yet, and this looks like the right mitigation.
@didrocks I'm curious about this comment:
Can you please open an issue (or point me to it) with more information on this? I'd like to make sure that zfs-mount-generator actually works well.
It would be great if someone could open an issue detailing what would be required to get there.
@behlendorf: thanks for merging! @aerusso: I'll do a round of testing with the various cases that are currently broken with the current generator and open separate issues for each of them (probably by the end of next week).
If /var/lib is a dataset not under <pool>/ROOT/<root_dataset>, as proposed in the upstream Ubuntu Root on ZFS guide (https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS), we end up with a race: some services, such as systemd-random-seed, write under /var/lib while zfs-mount is being called. zfs mount may then fail because /var/lib isn't empty and so can't be mounted over.
Order those two units for now (more may be needed), as we can't virtually declare a provided mount point to match "RequiresMountsFor=/var/lib/systemd/random-seed" from systemd-random-seed.service.
The optional generator for zfs 0.8 fixes it, but it's not enabled by default nor necessarily required.
Example:
- rpool/ROOT/ubuntu (mountpoint = /)
- rpool/var/ (mountpoint = /var)
- rpool/var/lib (mountpoint = /var/lib)
Both zfs-mount.service and systemd-random-seed.service start After=systemd-remount-fs.service. zfs-mount.service should be done before local-fs.target, while systemd-random-seed.service should finish before sysinit.target (which is a later target).
Ideally, we would have a way for the zfs mount -a unit to declare all paths, or we would move systemd-random-seed after local-fs.target.
Reviewed-by: Antonio Russo <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Didier Roche <[email protected]>
Closes openzfs#9360
Motivation and Context
When /var/lib is a separate dataset not under <pool>/ROOT/<root_dataset>, there is a race between systemd-random-seed, which creates or refreshes a stamp file, and zfs mount -a, which then tries to mount onto a /var/lib that isn't empty.
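To make the failure condition concrete, here is a minimal diagnostic sketch in Python. It is not part of this PR, and the helper name `leftover_entries` is hypothetical. It only lists what already sits directly inside a directory that a dataset is about to be mounted on, which is the situation the race creates.

```python
import os


def leftover_entries(mountpoint):
    """Return the sorted names found directly inside `mountpoint`.

    An empty list means the directory is empty (or does not exist yet),
    so nothing was written there ahead of the mount.
    """
    try:
        with os.scandir(mountpoint) as entries:
            return sorted(entry.name for entry in entries)
    except FileNotFoundError:
        # A missing directory has nothing in it to get in the way.
        return []


if __name__ == "__main__":
    # /var/lib is the mountpoint discussed in this PR.
    print(leftover_entries("/var/lib"))
```

Note that this is only meaningful while the dataset is not mounted: once rpool/var/lib is mounted, the same call lists the dataset's own content instead.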
Description
If /var/lib is a dataset not under <pool>/ROOT/<root_dataset>, as proposed in the upstream Ubuntu Root on ZFS guide (https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS), we end up with a race: some services, such as systemd-random-seed, write under /var/lib while zfs-mount is being called. zfs mount may then fail because /var/lib isn't empty and so can't be mounted over.
Order those two units for now (more may be needed), as we can't virtually declare a provided mount point to match "RequiresMountsFor=/var/lib/systemd/random-seed" from systemd-random-seed.service.
The optional generator for zfs 0.8 fixes it, but it's not enabled by default nor necessarily required.
Example:
- rpool/ROOT/ubuntu (mountpoint = /)
- rpool/var/ (mountpoint = /var)
- rpool/var/lib (mountpoint = /var/lib)
Both zfs-mount.service and systemd-random-seed.service start After=systemd-remount-fs.service. zfs-mount.service should be done before local-fs.target, while systemd-random-seed.service should finish before sysinit.target (which is a later target).
Ideally, we would have a way for the zfs mount -a unit to declare all paths, or we would move systemd-random-seed after local-fs.target.
The fix changes the unit ordering. I'm open to any better option, though, if you can think of one.
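The ordering argument above can be checked with a small model. The following Python sketch is illustrative only: it treats the constraints named in the description as "earlier, later" edges and enumerates every linear start order that satisfies them. The extra edge `ASSUMED_FIX` (zfs-mount.service before systemd-random-seed.service) is an assumption about the direction of the new ordering, since the description only says the two units get ordered. systemd actually starts units in parallel, so a linear order is a simplification.

```python
from itertools import permutations

SEED = "systemd-random-seed.service"
MOUNT = "zfs-mount.service"

# (earlier, later) pairs taken from the description above.
BASE_EDGES = [
    ("systemd-remount-fs.service", MOUNT),
    ("systemd-remount-fs.service", SEED),
    (MOUNT, "local-fs.target"),
    (SEED, "sysinit.target"),
    ("local-fs.target", "sysinit.target"),
]

# Assumption: the fix makes zfs-mount finish before systemd-random-seed starts.
ASSUMED_FIX = (MOUNT, SEED)


def valid_orders(edges):
    """Return every linear order of the units that respects all edges."""
    units = sorted({unit for edge in edges for unit in edge})
    orders = []
    for order in permutations(units):
        position = {unit: i for i, unit in enumerate(order)}
        if all(position[a] < position[b] for a, b in edges):
            orders.append(order)
    return orders


def seed_can_run_first(edges):
    """True if some valid order starts systemd-random-seed before zfs-mount."""
    return any(
        order.index(SEED) < order.index(MOUNT) for order in valid_orders(edges)
    )


if __name__ == "__main__":
    print("race possible without fix:", seed_can_run_first(BASE_EDGES))
    print("race possible with fix:", seed_can_run_first(BASE_EDGES + [ASSUMED_FIX]))
```

With only the original constraints, nothing relates the two units to each other, so both relative orders are allowed. Adding the single edge removes every order in which the seed file is written before the mount.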
How Has This Been Tested?
This has been tested on Ubuntu eoan (the upcoming 19.10) with the layout described above. The change is minimal, and the unit ordering is now reproducible.
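One way to confirm the resulting ordering on a running system is to read the unit's `Before=` property with `systemctl show`. The Python sketch below is a hedged illustration, not the test procedure used for this PR. The helper names are hypothetical, and it assumes the ordering is declared as a `Before=` relation on zfs-mount.service.

```python
import subprocess


def parse_property(show_output, prop):
    """Extract the space-separated values of `prop` from `systemctl show` output."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == prop:
            return value.split()
    return []


def is_ordered_before(unit, other):
    """Ask systemd whether `unit` is ordered before `other`."""
    result = subprocess.run(
        ["systemctl", "show", "-p", "Before", unit],
        check=True,
        capture_output=True,
        text=True,
    )
    return other in parse_property(result.stdout, "Before")


if __name__ == "__main__":
    # Unit names come from the PR; the direction checked is an assumption.
    print(is_ordered_before("zfs-mount.service", "systemd-random-seed.service"))
```

Keeping the parsing separate from the `systemctl` call means the parsing can be exercised on any machine, while the actual query only makes sense on a host that has both units installed.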
Types of changes
Checklist:
(No documentation to update AFAIK)
(No tests on that part)