This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Kernel oops when trying to write an image to a zvol #339

Closed
gcbirzan opened this issue Mar 26, 2014 · 8 comments

Comments

@gcbirzan

On Ubuntu saucy with a trusty kernel (and zfs from the trusty repo), when trying to write a disk image (an Ubuntu install) to a zvol, the module crashes and requires a reboot to get access to the zpool: https://gist.github.com/gcbirzan/9778381

I cannot share the image since it contains proprietary stuff, but the crash is reproducible so we can try fixes. Also, after the initial crash, the VM we started on that block device is working happily.

The kernel is 3.13.0-19-generic from Ubuntu's repos with a custom patch that is completely unrelated: it is a KVM fix, no VM was running at the point of the crash, and we've been using the patch on older kernels for more than a year without any issues. The module is 0.6.2-2trusty3.gbp9888b6, built from 9888b652c35b794597c8695aa8ccb5dccf78fe76.

@dweeezil
Contributor

@gcbirzan The problem is that kthread_create is returning -12 which is ENOMEM (out of memory). It looks like we should do better error checking in taskq_create().
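For readers unfamiliar with the convention: kernel functions such as kthread_create report failure by returning a negative errno encoded as a pointer, so the result has to be tested before it is used. The following is only a userland sketch of that check, not the SPL code. The helpers `err_ptr`, `ptr_err`, and `is_err` are simplified stand-ins for the kernel's `ERR_PTR`, `PTR_ERR`, and `IS_ERR`, and `mock_kthread_create` and `create_worker_checked` are hypothetical names invented for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userland stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mock: returns an encoded -ENOMEM when should_fail is set. */
static int mock_thread_object;

static void *mock_kthread_create(int should_fail)
{
    return should_fail ? err_ptr(-ENOMEM) : (void *)&mock_thread_object;
}

/*
 * Returns 0 and stores the thread handle in *out on success, or returns
 * a negative errno and stores NULL. The error pointer is never handed
 * back to the caller as if it were a valid handle.
 */
static int create_worker_checked(int should_fail, void **out)
{
    void *t = mock_kthread_create(should_fail);

    if (is_err(t)) {
        *out = NULL;
        return (int)ptr_err(t);
    }
    *out = t;
    return 0;
}
```

With this shape, a caller sees `-ENOMEM` (which is -12 on Linux) as an ordinary return code instead of dereferencing an invalid pointer later on.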

@dweeezil
Contributor

@gcbirzan Feel free to try dweeezil/spl@357e01a which is a patch to deal with this error condition.

@satmandu

Is this possibly related to the out of memory issue I'm also seeing with 3.13 here? openzfs/zfs#2143

@gcbirzan
Author

The patch didn't fix it: https://gist.github.com/gcbirzan/9828738

@dweeezil
Contributor

@gcbirzan That one is a bit different: taskq_destroy() is being passed a NULL pointer. Unfortunately, throughout ZFS, not all the users of taskq_create() are prepared to handle a NULL return. I'm going to re-work the patch to re-try forever as suggested by @behlendorf.
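The "re-try forever" idea can be illustrated with a small userland sketch. This is not the actual patch: `flaky_creator`, `flaky_create`, and `create_with_retry` are hypothetical names, and the mock only ever fails with `-ENOMEM`. The point is that the wrapper never returns NULL, so callers that are not prepared for a failed create do not need to change. It also counts retries, which is one way to see how often creation fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mock state: fail this many times before succeeding. */
struct flaky_creator {
    int failures_left;
    int attempts;
};

static int thread_token;

/* Returns NULL and sets *err to -ENOMEM while failures remain. */
static void *flaky_create(struct flaky_creator *c, int *err)
{
    c->attempts++;
    if (c->failures_left > 0) {
        c->failures_left--;
        *err = -ENOMEM;
        return NULL;
    }
    *err = 0;
    return &thread_token;
}

/*
 * Retry until creation succeeds, so the caller never sees NULL.
 * *retries records how many attempts failed. Real kernel code would
 * likely sleep or yield between attempts; that is omitted here.
 */
static void *create_with_retry(struct flaky_creator *c, int *retries)
{
    void *t;
    int err;

    *retries = 0;
    for (;;) {
        t = flaky_create(c, &err);
        if (t != NULL)
            return t;
        /* This mock only produces -ENOMEM, the case worth retrying. */
        assert(err == -ENOMEM);
        (*retries)++;
    }
}
```

The trade-off is that a persistent failure turns into a hang instead of a crash, which is why the sketch tracks the retry count.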

@dweeezil
Contributor

@gcbirzan I just worked up dweeezil/spl@f148f72 to handle this better and also to deal with the other major user of kthread_create(). I don't have my 3.13 build environment lying around, so I was only able to give it some light testing under 3.8 (Ubuntu's stock 3.8.0-22-generic kernel).

@gcbirzan
Author

This fixed the problem.

@dweeezil
Contributor

@gcbirzan Thanks for testing. I was finally able to do a bit of testing under 3.13 and added a little more instrumentation to see how many times kthread_create failed. In my limited testing it never failed; however, you've clearly got an environment in which it was failing and needed to be retried.

Hopefully this or something like it can get committed soon because it looks like a bunch of the major distros are starting to use 3.13. I'll also note that so far, it looks like 3.14 will need the same treatment.
