0.6.4 Hardlock #3349
Comments
I'm experiencing a similar problem on Ubuntu 12.04 (custom kernel: 3.14.27-031427-generic); however, Ubuntu 14.04 (3.13.0-49-generic) works just fine.
@jalavoy It says CPUs 1, 6, 11, and 14 stall.
@tuxoko it never recovers. If I stop IO (put all VMs to sleep and stop all services) the load will stop growing, but it never catches back up. If it happens while I'm sleeping and I'm unable to do that, it'll eventually balloon until the machine OOMs.
@jalavoy
Compiled and running now. It typically takes a few days to break, so I'll update when/if it happens again.
It looks like your patch might have fixed it, @tuxoko. No issues yet. Unfortunately I'm going out of town tomorrow and won't be able to monitor the situation for a couple of weeks. I'm comfortable with marking this as closed once we get that patch applied. I can always re-open if it pops up again.
Oops, it'd help if I let you guys weigh in on it before closing. Apologies.
@tuxoko could you open a pull request with the proposed fix? It does look like we could spin here under certain circumstances so adding a
In the interest of time I made a pull request at #3361. I'm not trying to steal any credit for this. If this is rude or improper in any way, please accept my apology and decline it, and we can wait for tuxoko.
@jalavoy it isn't rude or improper. In fact we all appreciate people who help move things forward. You should, however, set the authorship of the git commit to attribute it properly.
@nedbass Thanks, didn't know I could do that. I think I've done it properly now, although the commit log is a little messy. If I need to wipe this out and do it again let me know.
I think it's fine for purposes of review and buildbot testing. Normally there shouldn't be a merge commit.
Works for me. Thank you all very much.
…xoko" This reverts commit 71a8e9b.
It's been reported that threads would loop infinitely inside zfs_zget. The speculated cause is that if an inode is marked for eviction, zfs_zget sees that and loops. However, if the looping thread doesn't yield, the inode may never get a chance to finish eviction, thus causing an infinite loop. This patch solves the issue by adding cond_resched to zfs_zget, making the looping thread yield when needed.
Tested-by: jlavoy <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#3349
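To make the speculated failure mode concrete, here is a minimal user-space sketch of it. It is a toy model, not the actual zfs_zget code: `struct fake_inode`, `evict_step`, and `lookup_retry` are hypothetical names invented for illustration, and the "yield" is modelled by letting one unit of eviction work run between retries, which is roughly what cond_resched would allow on a busy CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode that is being torn down. */
struct fake_inode {
    bool evicting;          /* true while eviction is still in progress */
    int  evict_steps_left;  /* work the evicting side still has to do */
};

/* One unit of eviction work; clears the flag once the work is done. */
void evict_step(struct fake_inode *ip)
{
    assert(ip != NULL);
    if (ip->evict_steps_left > 0)
        ip->evict_steps_left--;
    if (ip->evict_steps_left == 0)
        ip->evicting = false;
}

/*
 * Retry a lookup while the inode is marked as evicting.
 *
 * If yield_each_retry is true, the evicting side gets to run once per
 * retry (the toy equivalent of yielding the CPU). If it is false, the
 * loop never lets eviction progress and can only give up.
 *
 * Returns the number of retries needed, or -1 if max_spins was reached.
 */
int lookup_retry(struct fake_inode *ip, bool yield_each_retry, int max_spins)
{
    assert(ip != NULL);
    for (int tries = 0; tries < max_spins; tries++) {
        if (!ip->evicting)
            return tries;       /* eviction finished, lookup can proceed */
        if (yield_each_retry)
            evict_step(ip);     /* models the evictor running after a yield */
    }
    return -1;                  /* spun the whole budget without progress */
}
```

With an inode that needs three eviction steps, the yielding variant finishes after three retries, while the non-yielding variant exhausts any spin budget without the inode changing state. In the real kernel there is no spin budget, which would match the hard lockup described in this issue.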
Updates ZFS and SPL to the latest maintenance version. Includes the following:
Bug Fixes:
* Fix panic due to corrupt nvlist when running utilities (openzfs/zfs#3335)
* Fix hard lockup due to infinite loop in zfs_zget() (openzfs/zfs#3349)
* Fix panic on unmount due to iput taskq (openzfs/zfs#3281)
* Improve metadata shrinker performance on pre-3.1 kernels (openzfs/zfs#3501)
* Linux 4.1 compat: use read_iter() / write_iter()
* Linux 3.12 compat: NUMA-aware per-superblock shrinker
* Fix spurious hung task watchdog stack traces (openzfs/zfs#3402)
* Fix module loading in zfs import systemd service (openzfs/zfs#3440)
* Fix intermittent libzfs_init() failure to open /dev/zfs (openzfs/zfs#2556)
Signed-off-by: Nathaniel Clark <[email protected]>
Change-Id: I053087317ff9e5bedc1671bb46062e96bfe6f074
Reviewed-on: http://review.whamcloud.com/15481
Reviewed-by: Alex Zhuravlev <[email protected]>
Tested-by: Jenkins
Reviewed-by: Isaac Huang <[email protected]>
Tested-by: Maloo <[email protected]>
Reviewed-by: Oleg Drokin <[email protected]>
I've been having this issue since going to 0.6.4 and haven't been able to collect any useful information up until this point.
Basically every couple of days, typically in the middle of the night (although not always; it seems fairly random), ZFS will cause the CPU usage to skyrocket and all I/O on the pool stops. I finally got some dumps that might be useful (I sure hope so anyway).
If there is any other information you need please feel free to ask. Since I can't seem to recreate this reliably other than waiting for it, I hope this is enough. If it's useful at all, this is using 3.14.39, but I've replicated it with 3.18.x kernels and several other 3.14.x.