zfs-load-key.sh will loop forever if loading module failed #13308

Closed
nabijaczleweli opened this issue Apr 8, 2022 · 6 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@nabijaczleweli
Contributor

nabijaczleweli commented Apr 8, 2022

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Debian |
| Distribution Version | today's sid |
| Kernel Version | 5.15.0-3-whatever |
| Architecture | amd64 |
| OpenZFS Version | 2.1.4 |

Describe the problem you're observing

I upgraded DKMS (and fiddled with the configuration) and moved to 2.1.4 – this had the unfortunate effect of generating an initrd with unsigned modules, which I neglected to verify. With module.sig_enforce=1, this meant that zfs-load-module.service failed with

modprobe: ERROR: could not insert 'zfs': Key was rejected by service

This, then, meant that

while [ "$(zpool list -H)" = "" ]; do
    systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service && exit 1
    sleep 0.1s
done

spewed

The ZFS modules are not loaded.
Try running '/sbin/modprobe zfs' as root to load them.

every 0.1s onto the console, because both import services were inactive.

This was very fun to debug.

Describe how to reproduce the problem

As above.

Ideally, we'd just do systemctl is-active -q zfs-load-module.service || exit 1 in that loop, but, uh, that's a debian patch (which, funnily enough, we actively sanction since 5eae5a8). So [ -e /sys/module/zfs ] || [ -e /sys/fs/zfs ] || exit 1, then. This will ensure we fail gracefully.
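A minimal sketch of that guard, assuming it is dropped into the existing wait loop. The helper name `zfs_module_present` and its optional root-prefix argument are hypothetical; the prefix exists only so the check can be exercised against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Hypothetical helper (not part of zfs-load-key.sh): succeeds if either
# sysfs path that indicates a loaded zfs module exists.
# $1 is an optional root prefix for testing; empty means the real /sys.
zfs_module_present() {
    [ -e "${1:-}/sys/module/zfs" ] || [ -e "${1:-}/sys/fs/zfs" ]
}

# Sketch of the loop from the report with the guard added. It is wrapped
# in a function so that defining it has no side effects; calling it
# requires zpool and systemctl, as in the original script.
wait_for_pools() {
    while [ "$(zpool list -H)" = "" ]; do
        # Bail out instead of spinning if the module never got loaded.
        zfs_module_present || exit 1
        systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service && exit 1
        sleep 0.1s
    done
}
```

With this in place, a rejected module makes the script exit with status 1 on the first iteration rather than printing the "modules are not loaded" hint every 0.1s.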

But rather than do that, why aren't we.. just loading the module from userspace? We do it in systemd/zram-generator and it's fine (hell, expected, even! it was quite shit before we started to). We do it on FreeBSD and it's fine. Honestly, not loading it is kinda weird.

@nabijaczleweli nabijaczleweli added the Type: Defect Incorrect behavior (e.g. crash, hang) label Apr 8, 2022
@nabijaczleweli
Contributor Author

actually, that's stupid, we're much better off just doing

while l="$(zpool list -H)" && [ -z "$l" ]; do

which will catch all problems of the sort
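This works because a bare assignment whose value is a command substitution takes the exit status of that substitution, so a failing zpool ends the loop instead of looking like "no pools yet". A small sketch of that behaviour follows; `fake_zpool_broken`, `fake_zpool_empty` and `spins` are made-up stand-ins so that it runs without ZFS installed, and the cap of 3 iterations is there only to keep the demonstration finite.

```shell
#!/bin/sh
# Stand-in for zpool when the module is missing: no stdout, non-zero exit.
fake_zpool_broken() { return 1; }
# Stand-in for a working zpool with no pools imported yet: no stdout, exit 0.
fake_zpool_empty() { return 0; }

# Print how many times the proposed loop condition lets the body run,
# capped at 3 iterations. $1 is the command standing in for "zpool list -H".
spins() {
    n=0
    while l="$("$1")" && [ -z "$l" ]; do
        n=$((n + 1))
        [ "$n" -ge 3 ] && break
    done
    echo "$n"
}

spins fake_zpool_broken   # prints 0: the failed command stops the loop at once
spins fake_zpool_empty    # prints 3: empty output with success keeps waiting
```

The original condition, `[ "$(zpool list -H)" = "" ]`, discards the exit status, which is why it could not tell these two cases apart.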

@nabijaczleweli
Contributor Author

Hm, actually, the

while ! systemctl is-active --quiet zfs-import.target; do

condition from #13291 is sufficient for this, since it depends on both import services, which are skipped and therefore fulfilled, zfs-env-bootfs which always succeeds, and zfs-load-module which fails, so it's in active state. Since I plan on backporting it wholesale I'll just mark it as closes-this and we're g2g.

@zfsbot

zfsbot commented Apr 8, 2022

why aren't we.. just loading the module from userspace

we used to - actually, it needed modprobe zfs sprinkled all over dracut scripts when the tools stopped loading the zfs modules.

looking back, the change was introduced here and claimed "things will just work", which is quite interesting now in retrospect.

@nabijaczleweli
Contributor Author

I mean, in looking through the journal to debug this I did actually see that udev was loading the module first (and failing for reasons mentioned, which led to a udev: nvme0n1p2: ERROR: ...), so, IME, they do Just Work.. except when you don't have a fstype=zfs device plugged (and they did work in this case).

Thanks for the issue ref – I was actually under the impression that we don't have auto-loading (I only skimmed that file before), but it appears we do, except it's off by default and needs ZFS_MODULE_LOADING=on to be set, which is, hm, suboptimal? Because it's undocumented and you want to load the module if you run the userspace tools? And it doesn't look like it handles having a built-in module well? Def. some room for improvement there.

@zfsbot

zfsbot commented Apr 8, 2022

yeah kinda hard to get that env var if you don't have an initrd or something, too.

by "used to" i mean the binaries used to load the kernel module by exec'ing modprobe from inside the zpool/zfs binaries

@nabijaczleweli
Contributor Author

Fixed by #13291
