centos 7: dkms strikes again #3801
Comments
For what it's worth, this one has bitten me, too. I'll be happy to test any fixes when they're ready. |
I think I'm going to try and see if mountpoint=legacy works because the real problem here is systems coming up without zpool. |
.. and that doesn't work so we're pretty much screwed. The only known workaround at this point is to run dkms install -k kernel.version before reboot and hope for the best. |
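The workaround above could be sketched as a small helper that prints the dkms commands for a kernel that is installed but not yet booted. This is only a sketch: the function name dkms_install_cmds and the version numbers in the usage note are illustrative assumptions, and the `dkms install <module>/<version> -k <kernel>` form is the one quoted elsewhere in this thread. It prints the commands instead of running them, so they can be reviewed first.

```bash
#!/bin/bash
# Hypothetical helper: print the dkms commands that would build spl and zfs
# for a kernel that is installed but not yet running.
# Arguments: <spl version> <zfs version> <kernel version>
dkms_install_cmds() {
    local spl_ver=$1 zfs_ver=$2 kernel_ver=$3
    # spl goes first because the 0.6.x zfs module depends on it.
    printf 'dkms install spl/%s -k %s\n' "$spl_ver" "$kernel_ver"
    printf 'dkms install zfs/%s -k %s\n' "$zfs_ver" "$kernel_ver"
}
```

For example, `dkms_install_cmds 0.6.5.2 0.6.5.2 3.10.0-229.14.1.el7.x86_64` prints two lines; once they look right they can be run as root before the reboot.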
I'm not sure why this isn't being triggered for CentOS 7. I'd be happy to apply a fix if someone has the time to look into it and determine why. That said, I may have a better solution available for CentOS 6/7 users. My suggestion would be to switch to the KMOD repository. These are binary packages for CentOS with weak-modules support which rely on the stable kABI. You can install them once and they will work with any of the stock CentOS kernels, with no need to rebuild. You can install them from the official repository, but I'd suggest cleanly removing the DKMS version first. As always, let me know if you hit any rough edges. ZFS still uses a few more symbols than exist in the stable kABI, so it's possible things might break if one of those changes. This is the main reason I didn't mention the KMOD repository before. However, thus far I haven't observed any issues due to this.
|
At a guess you might have a race condition during yum install: zfs is updated before the kernel so modules don't get built. It doesn't explain why they don't get built on boot -- that used to work at least sometimes though I don't remember seeing that on centos 7 (but definitely happened on centos 6). I'll give the kmod repo a try once I figure out which of the systems I can break safely. ;) |
Would it be an idea to build some verification scripts and run those on system shutdown and/or reboot? I've had the same issue on Fedora 21/22 |
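A verification step like the one suggested here might look roughly like the sketch below. It is an assumption-laden example, not an existing tool: the function name kernels_missing_zfs is made up, and it assumes that each installed kernel has a directory under /lib/modules and that a usable build leaves a zfs.ko (possibly compressed, or a weak-updates symlink) somewhere in that directory. It uses the GNU find `-quit` action.

```bash
#!/bin/bash
# Hypothetical check: list kernels under a modules root that have no zfs module.
# Argument: modules root (defaults to /lib/modules).
kernels_missing_zfs() {
    local root=${1:-/lib/modules} dir
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # Match zfs.ko as well as compressed variants such as zfs.ko.xz.
        if [ -z "$(find "$dir" -name 'zfs.ko*' -print -quit)" ]; then
            basename "$dir"
        fi
    done
}
```

Run before shutdown, any kernel version it prints is one that would come up without the module. Directories left behind by removed kernels would also be reported, so treat the output as a hint rather than a verdict.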
As a workaround until someone writes a fix for this, dkms install could likely be run manually:
|
Just wanted to note that I run the kernel-ml (mainline) package from elrepo due to lack of hfsplus module in the centos 7 kernel, so the kmod packages won't work for me. When the problem hit me, I was too lazy to investigate at the time, and basically did what ryao suggests: yum reinstall spl && yum reinstall zfs. |
|
Could it be a workaround to upgrade the kernel packages and reboot prior to upgrading zfs/spl? |
just for posterity: on one centos 7 machine so far I
Everything seems to be in order so I removed dkms, gcc, and the rest of the -devel stuff. |
A fix for the CentOS dkms packages was applied to both 0.6.5.2 and the master branch. It addresses the case where old DKMS builds were not being correctly cleaned up. This could have resulted in some of the reported issues. |
I need help... I think this issue has shut my system down. I must have updated and not rebooted. My machine hung, and when I rebooted it refused to load the current kernel, 3.10.0-229.14.1.el7.x86_64. The only kernel I can boot into is 3.10.0-123.el7.x86_64. The other issue is that I have nvidia drivers installed, which won't allow me to boot into a GUI... I did: I installed: but it still refuses to boot, and zfs refuses to work... |
@jonathandgough You may want to clarify "refuses to boot". You should be able to boot without ZFS fine if you don't have any system-critical filesystems on ZFS. With yum, you only need to reinstall spl-dkms and zfs-dkms. Note that this will install them for the running kernel. If you're trying to reinstall them on -123 and then boot into -229.14.1, it won't work. If you want to do that, you will have to use the dkms command mentioned above: dkms install spl/ -k I don't know much about your nvidia problem (and it's a bit off-topic here), as there are many different ways to install nvidia drivers. However, if the nvidia kernel module isn't loaded, the nvidia X11 driver won't work. If you installed an nvidia driver package with dkms support, you should be able to do the same as above. If not, you may need a newer package for your newer kernel or, if it is version-independent, to reinstall the driver after booting into the new kernel. |
if you can boot into single-user,
should fix ya. FWIW I'm getting nvidia drivers from elrepo and now zfs: from |
So.... I pulled my nvidia graphics card, and that (for whatever random reason) allowed me to actually see the boot screen. And, indeed, the startup is getting hung up at zfs. It is saying "a start job is running for Mount ZFS filesystems" and it basically just hangs there. The only way in is to boot through rescue mode. In rescue mode I deleted zfs, then booted into the latest kernel and reinstalled as suggested: dkms install spl/0.6.5.2 -k 3.10.0-229.14.1.el7.x86_64 but I got errors and my partition will not mount. For installing spl I got: depmod... For installing zfs I got: zavl.ko:
znvpair.ko: zunicode.ko: zcommon.ko: zfs.ko: zpios.ko: depmod... DKMS: install completed. |
@dmaziuk where are you getting zfs-kmod from? |
To be clear, you must use either the KMOD or the DKMS repository, not both. Before switching, make sure the old packages and any stale modules are removed. |
I haven't done anything with a KMOD, I'm just at my wits' end. I have managed to get my machine to boot. I have tried removing everything and then re-installing, but when I check zpool status, I still keep getting: I even removed everything zfs related and re-installed everything from the archive as explained on the webpage. |
Check the output of |
@jonathandgough OK, here's what worked for me so far (5 or 6 boxen, one centos 6, the rest's centos 7, no nvidia drivers):
|
The problem is it's not finding zavl.ko. From what I can surmise, when I was trying to clean things up I deleted these from the old location, and when it is re-installing (and uninstalling) it's looking for them in the old location. The problem is they are gone... I tried using dkms install spl/0.6.5.2 -k 3.10.0-229.14.1.el7.x86_64 --force but that didn't work... Now trying to force install zfs on 3.10.0-229.1.2.el7.x86_64 but I'm getting the same errors. |
CentOS 7 on kernel 3.10.0-229.14.1.el7.x86_64. Installed 0.6.5.3 today and zfs module was not able to be loaded. Tried everything in this thread and nothing worked except uninstalling and using zfs-kmod from above to install. |
@behlendorf @russ0519 On CentOS 7, when I install kmod-zfs with kernel-3.10.0-229.11.1 and kernel-3.10.0-229.14.1 installed, the modules only show up under the earlier kernel, even if it's not the actively booted one. Switching back to DKMS... |
I threw this into /etc/cron.daily/zfs-update-dkms on CentOS 7 (7.2.1511). Obviously if you reboot immediately after updating kernels this won't help unless you remember to run it manually.
#!/bin/bash
set -eu
# Set this to something high to get modules built for all of your kernels
# or 1 to just build for the most recent.
N_KERNELS=10
# Newest installed spl-dkms and zfs-dkms package versions.
SPL_VER=$(rpm -qa --qf "%{VERSION}\n" spl-dkms | sort -V | tail -n1)
ZFS_VER=$(rpm -qa --qf "%{VERSION}\n" zfs-dkms | sort -V | tail -n1)
# Build and install both modules for each of the N_KERNELS newest kernels.
for KERNEL_VER in $(rpm -qa --qf "%{VERSION}-%{RELEASE}.%{ARCH}\n" kernel | sort -V | tail -n"${N_KERNELS}")
do
    dkms install -q "spl/${SPL_VER}" -k "${KERNEL_VER}"
    dkms install -q "zfs/${ZFS_VER}" -k "${KERNEL_VER}"
done |
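The kernel selection in that script leans on `sort -V`. A minimal sketch of just that step, with a made-up function name (newest_kernels) and sample version strings chosen for illustration, shows why version sort matters: a plain lexical sort would put 229.14.1 before 229.4.2, so `tail` could pick the wrong kernel.

```bash
#!/bin/bash
# Hypothetical helper: read kernel version strings on stdin, print the N newest.
# Argument: N (defaults to 1).
newest_kernels() {
    local n=${1:-1}
    # sort -V (GNU coreutils) compares numeric fields as numbers,
    # so 229.4.2 sorts before 229.14.1.
    sort -V | tail -n "$n"
}
```

On a real box the input would come from the same `rpm -qa --qf ... kernel` query the script uses.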
Using some tips above, this is how I approach a yum upgrade that includes both ZFS and kernel packages:
Something like it might work well in concert with your crontab. I suppose whether or not there are ZFS upgrades, you would need to manually run your script after the commands I list above. |
On a vanilla new install of centos 7.0-1406 (3.10.0-123.el7.x86_64),
I am surprised! I've been using zfs with centos 6 for almost 2 years without any problems. I would like to promote ZFS strongly in my company, but this does not look "enterprise solid" if a clean install does not even work, let alone an upgrade. |
FWIW I've been using Since redhat's already invested a lot into |
On a vanilla new install of centos 7.0-1406 (3.10.0-123.el7.x86_64), However, a nice surprise,
Posted here as reference that this specific combination of packages/versions seem to work together. |
In the previous comment I installed kmod-zfs on the old kernel, then upgraded the kernel. This time I upgraded the kernel first, then installed kmod-zfs. Happy to report that a clean install of kmod-zfs with the latest kernel 3.10.0-123.el7.x86_64 also works ( So I am ditching DKMS for KMOD now. |
Just a "me too", but after a
Had everything up and running. |
aaaand again.... Upgrading to kernel 4.4.5-301 in FC23 got me a failing ZFS system (it wasn't automatically upgraded). |
Just wanted to share my process here in case it helps anyone else. I read through this and a couple other threads and came away with the following to get DKMS working again after a kernel update.
You should see a list of spl and zfs kernel modules compiled for each kernel. The important info here is the version number. At the time of this writing it was: 0.6.5.8
Remove the DKMS kernel modules
Clean out the other ZFS related modules. Note that this is dangerous if you have other kernel modules, so double check these directories. When I ran it I had the following (all ZFS related):
Reinstall ZFS and SPL
Re-add and reinstall the DKMS modules (note the version number from above)
After that, you will need to load the modules
Now, you should be able to run:
|
Since all of the above commands work in an RPM, why is it that this still matters to this day? Why don't we either do these things in the RPMs after detecting dkms is present, or just stop having two different ways of managing updates because of "kernel modules"? What's the missing technology here to make this a seamless detail for users? It seems really odd that all of this manual process exists in this day and age. These utility applications can be made smarter and, if needed, more metadata made available to them so that they can work seamlessly with the user's environment. Just to clarify this more, the kernel install RPM should provide some sort of trigger that DKMS and other kernel-sensitive components can use. It should be possible to pre-build new modules for new kernels before having to reboot. In the ZFS world, new kernel installs could even be done with new snapshot-preserved configurations so that the user can always know what goes together and have everything preserved. There is just so much that ZFS or any other transactional filesystem can provide to the management of the operating system update mechanism. If we could just get ZFS fixed to deal with full file systems completely without locking up, that would be even better. |
We might be able to trigger these rebuilds on startup, with some kind of systemd service which verifies whether the module needs to be rebuilt for the current kernel. |
I upgraded from CentOS 7.2 to 7.3, which broke all of ZFS. For what it is worth, the procedure dominic-p posted works perfectly. |
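The check such a service would need could be sketched as below. Everything here is hypothetical: the function name dkms_module_installed is invented, and the two `dkms status` line formats listed in the comments are assumptions about what different dkms releases print, so compare them against real output before relying on this.

```bash
#!/bin/bash
# Hypothetical check: read `dkms status` output on stdin and succeed only if
# <module> is reported as installed for <kernel>.
# Assumed line formats (they differ between dkms releases):
#   zfs, 0.6.5.8, 3.10.0-514.el7.x86_64, x86_64: installed
#   zfs/0.6.5.8, 3.10.0-514.el7.x86_64, x86_64: installed
dkms_module_installed() {
    local module=$1 kernel=$2
    awk -v mod="$module" -v kver="$kernel" '
        {
            # Split on commas, slashes and colons, with optional spaces.
            n = split($0, f, "[ \t]*[,/:][ \t]*")
            if (f[1] != mod) next
            has_kernel = 0; installed = 0
            for (i = 2; i <= n; i++) {
                if (f[i] == kver) has_kernel = 1
                if (f[i] ~ /^installed/) installed = 1
            }
            if (has_kernel && installed) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}
```

A boot-time unit could then run something like `dkms status | dkms_module_installed zfs "$(uname -r)"` and only start a rebuild when that fails.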
I'm glad to hear the notes worked for you. In case it will help anyone, I put together a simple script to automate the rebuild process after a kernel update. Use it with caution! It runs some dangerous commands (e.g. |
Thanks dominic-p, I will definitely try that on the next update. Would I use this before or after rebooting? Or would it matter? It would be nice to have a solution where any update of zfs, spl and/or the kernel would just rebuild the module. Wouldn't the simplest way be to kill/disable weak updates? |
This procedure is meant to be run after reboot (once the new kernel is running). I honestly don't know very much about DKMS, so I'm not sure if disabling weak updates would solve the problem. If you make any progress on that, please post back. |
Going to try SIGBUS's solution from this post: |
I had the same issue with 0.6.5.9. None of the solutions listed above actually work. The kmod-zfs repository is not working with my kernel. The repomd.xml file is not valid. When running dkms install, the build step for zfs failed. dkms install -m zfs -v 0.6.5.8 -k `uname -r`
[...]
make -j24 KERNELRELEASE=3.10.0-693.5.2.el7.x86_64...(bad exit status: 2)
Error! Bad return status for module build on kernel: 3.10.0-693.5.2.el7.x86_64 (x86_64)
Consult /var/lib/dkms/zfs/0.6.5.9/build/make.log for more information. My kernel is My log output is here Any help would be useful... Best regards, edit: finally it works by removing any kernel module with dkms and then removing dkms, spl and all zfs packages. Then reboot, and go here to reinstall it as if you were doing it from scratch using kmod. |
Again, what set of dependency checks and technology keep this from being just implemented to work correctly, for DKMS or not? This really feels like something a freshman student at university threw together for some friends. It should be a professionally behaving installation that just works. Every variation and detail discussed on this page seems like something readily handled by an RPM or at least a shell script. Is there really just no experience here with automation of things like this? |
Closing. This issue is being resolved in 0.8 by moving the spl source into the zfs repository to eliminate the dependency. |
This is consistent on all our c7 boxen: kernel update kernel-3.10.0-229.14.1.el7.x86_64 does not trigger a dkms rebuild during install or subsequent reboot, whether zfs/spl rpms are updated at the same time or not.
A related problem is that it doesn't throw you down to single-user login when a zfs pool fails to init, the way it does when an fstab filesystem fails to mount. Then init will happily try to start daemons that use your zfs filesystems.