Hung task when trying to write an image to a zvol #2230

Closed
gcbirzan opened this issue Apr 1, 2014 · 4 comments

@gcbirzan

gcbirzan commented Apr 1, 2014

Similar to openzfs/spl#339, after applying dweeezil/spl@f148f72, we now get:

https://gist.github.com/gcbirzan/b14bc85234ab5d7c1e8b

@tuxoko
Contributor

tuxoko commented Apr 2, 2014

As I commented in openzfs/spl#331, the problem is that kthread_create failed because the calling thread received a SIGKILL, not because of insufficient memory.

The patch by @dweeezil will cause the thread to get stuck, repeatedly calling kthread_create and failing each time because the SIGKILL is never serviced.

You should try to find out why the thread is receiving SIGKILL.
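
To illustrate the distinction, here is a minimal user-space sketch, not the actual SPL code. It assumes the creation call reports an interrupted caller as `-EINTR` and memory exhaustion as `-ENOMEM`; `create_with_retry`, `create_fn`, and the two stubs are hypothetical names made up for this example. The point is only that a retry loop should keep going on `-ENOMEM` and bail out on anything else, since a pending SIGKILL will not go away by retrying.

```c
#include <errno.h>

/* Hypothetical stand-in for a thread-creation call:
 * returns 0 on success or a negative errno value on failure. */
typedef int (*create_fn)(void *arg);

/* Retry only while the failure is -ENOMEM. Any other result,
 * including -EINTR (assumed here to mean the caller was signalled),
 * is returned to the caller immediately. */
int create_with_retry(create_fn create, void *arg, int max_tries)
{
    int rc = -ENOMEM;

    for (int i = 0; i < max_tries; i++) {
        rc = create(arg);
        if (rc != -ENOMEM)
            break; /* success, or a non-retryable error */
    }
    return rc;
}

/* Stub: behaves as if the caller always has a pending SIGKILL.
 * Counts how many times it was called through arg. */
int always_interrupted(void *arg)
{
    int *calls = arg;

    (*calls)++;
    return -EINTR;
}

/* Stub: reports -ENOMEM on the first two calls, then succeeds. */
int enomem_twice(void *arg)
{
    int *calls = arg;

    (*calls)++;
    return (*calls <= 2) ? -ENOMEM : 0;
}
```

With `always_interrupted` the loop gives up after a single attempt instead of spinning, while `enomem_twice` is retried until it succeeds.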

@dweeezil
Contributor

dweeezil commented Apr 6, 2014

@gcbirzan I just pushed dweeezil/spl@0d36f37 to deal with the interrupted kthread_create() case. What was the condition under which your failure referenced above occurred? Was it during boot-time import? My patch will cause the process to be properly interrupted, but the question is why your process was being interrupted in the first place. Apparently systemd can send signals to various processes during startup if they take too long and I'm wondering if that's your situation.

@gcbirzan
Author

gcbirzan commented Apr 7, 2014

We've experienced this problem both when importing a drive (though not as often as the first one, which was 100% sure) and when trying to destroy a volume after having dedup on the pool. The latter ended up with the pool being a write-off, since the machine runs out of memory and at some point just freezes. All of this only happens on 3.13; we have a 3.5 machine we tried the same on and it works flawlessly so far (no dedup though, because fuck dedup).

However, we're planning to go live with this in the near future, so I'm not sure how long we have for testing. I'll try to apply your patch and import the broken pool. Since we've seen this on 2 separate machines, I'm fairly confident that it's relatively easy to reproduce.

@behlendorf
Contributor

This issue was resolved by openzfs/spl#339.
