Crash during zfs activity (scrub, rsync, etc.) #404
Comments
Thanks for the bug report and going through the effort to get the console stacks. They'll make it possible to run this issue to ground.
Quick update: I can confirm that using a gentoo-sources (as opposed to hardened-sources) kernel does not fix the problem. While I have not fully switched over to the non-hardened toolchain (this takes ages), I set gcc's profile to 'vanilla' with gcc-config and rebuilt both the kernel and the spl/zfs packages with it, and I am presuming this is sufficient to remove the effect of the hardened system. If anyone knows otherwise, please chime in. Is it worth trying with vanilla-sources/git-sources? I'm under the impression that gentoo's patches are minimal, but I don't know any detail. Unfortunately, despite my having crashed it three times since changing kernels, netconsole did not send any output (the system does remain "up" for a while, and continues other duties like being a wireless AP and responding to pings, but all SSH connections freeze), so I have nothing to paste here yet. I'll keep trying. Any hints with regard to capturing crash traces with netconsole are welcome :)
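On the netconsole question: netconsole sends kernel messages as plain UDP datagrams, so one low-tech way to rule out the receiving side is a small listener on the target machine that writes every datagram to disk as soon as it arrives. The following is only a sketch under assumptions. The function names, the log path, and the port are illustrative (6666 is the usual netconsole target port, but use whatever the sending machine was configured with), and it does nothing about packets the crashing kernel never manages to send.

```python
import os
import socket
from datetime import datetime, timezone
from typing import Optional


def format_record(payload: bytes, sender: str, when: datetime) -> str:
    """Turn one datagram into a single timestamped log line."""
    # Kernel messages are normally ASCII; replace anything undecodable
    # instead of raising, so a garbled packet is still recorded.
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return f"{when.isoformat()} {sender} {text}"


def capture(port: int, log_path: str, max_packets: Optional[int] = None) -> int:
    """Append every datagram received on `port` to `log_path`.

    Runs until `max_packets` datagrams have arrived (forever if None)
    and returns the number of datagrams written.
    """
    received = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(log_path, "a", encoding="utf-8") as log:
        sock.bind(("0.0.0.0", port))
        while max_packets is None or received < max_packets:
            payload, (sender, _sender_port) = sock.recvfrom(65535)
            now = datetime.now(timezone.utc)
            log.write(format_record(payload, sender, now) + "\n")
            # Flush and fsync per message so the last lines before a
            # hang are on disk rather than in a buffer.
            log.flush()
            os.fsync(log.fileno())
            received += 1
    return received
```

Running `capture(6666, "netconsole.log")` on the receiving machine before reproducing the crash would then leave any messages that do arrive in `netconsole.log`.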
There is a decent chance this was fixed by issue #279. Can you try the 0.6.0-rc6 sources (or master), which contain the fix?
Thanks for the heads-up. Rebuilding from master appears to have fixed the issue! (I've been running daily rsyncs and snapshot sets since the weekend and haven't observed a crash.) However, I do still get crashes with the new code running on the hardened kernel, so I believe I was actually suffering from two issues at once. I'm perfectly happy to use the vanilla kernel though. Thanks a lot for your help.
Heh, I seem to have a machine that really brings out the worst in ZFS! Now that the basic stuff seems to be working, running zfs send (either redirecting to a file or piping to ssh) quite repeatably locks the machine up solid in about two seconds. No debug output yet. I'll try to get some somehow and open a new issue.
OK, well I'm glad this fixed your first issue. Please go ahead and open new bugs for the other issues you've seen and we'll get them fixed as well. A stack from the console during the crash is always helpful, as are simple reproducers.
Hi all,
I'm getting crashes when ZFS is under load. It takes a random amount of time to happen, ranging from almost immediately after loading the zfs module to over a day.
I'm running zfs-9999 from Pendor's overlay (built since rc5 was tagged, but I'm not sure of the revision) on Gentoo Hardened 2.6.39 (amd64) on a system with 1.5G of RAM, and have a single-disk zpool on a 1T Samsung disk. I cannot swear it's related, but I think this crash only started happening when I moved to the Hardened kernel/userland.
I load the zfs module with a restricted ARC size of 512MB and while the system is running I see memory usage by ZFS varying a lot but seemingly stable.
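For anyone trying to reproduce the setup: the report does not say how the ARC limit was applied. One common way, assumed here, is the `zfs_arc_max` module parameter, which takes a value in bytes. The sketch below only does the unit conversion and prints a line that could go in a modprobe configuration file; it assumes "512MB" means 512 MiB, and the helper name and the file location are illustrative.

```python
def arc_max_option(mebibytes: int) -> str:
    """Build a modprobe options line capping the ZFS ARC.

    zfs_arc_max is expressed in bytes, so convert from MiB.
    """
    return f"options zfs zfs_arc_max={mebibytes * 1024 * 1024}"


# Assumed destination: a file such as /etc/modprobe.d/zfs.conf
print(arc_max_option(512))  # options zfs zfs_arc_max=536870912
```

The same value can be passed directly when loading the module by hand, as a `zfs_arc_max=536870912` argument to `modprobe zfs`.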
I managed to get the following via netconsole:
After this appears, the system grinds to a halt over a minute or two and needs a hard reset.
I apologise if this is a duplicate of an existing issue - I was unable to tell. Please let me know if you need any more output.