Deadlock/lockup under load #414
Comments
Similar problem here, running Kubuntu 11.04 with the 2.6.38-11 kernel, ZFS from the Ubuntu repo (0.6.0.33), and 4 GB of RAM. The system hangs under disk load (downloading a torrent) with this in syslog: Oct 9 04:50:13 drcomp kernel: [ 9715.172576] TCP: Possible SYN flooding on port 6881. Sending cookies.
After upgrading to Kubuntu 11.10 with kernel 3.0.x and ZFS 0.6.0.34, I don't have this problem any more...
This is believed to be resolved in the latest code. Please reopen if you observe the failure again.
The core motivation behind these changes is to minimize the memory management differences between ZFS on Linux and other platforms. This simplifies the process of porting changes to Linux from other platforms. This is good for code quality and is expected to reduce the number of defects accidentally introduced due to porting.

The key reason this is now possible is due to the addition of Linux features such as the thread-specific PF_FSTRANS bit which was introduced for XFS.

This patch stack also performs some refactoring and cleanup designed to make the code more maintainable and understandable. Finally, in the context of making and testing these changes several bugs were identified and resolved resulting in a more robust implementation.

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Closes openzfs#414
Signed-off-by: Paul Dagnelie <[email protected]>
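
To make the idea of a thread-specific flag concrete, here is a minimal user-space sketch in Python. It is not the kernel or SPL code. It assumes, for illustration only, that the flag is used to stop an allocation made by a marked thread from calling back into the filesystem to free memory. The names `FsTransaction`, `in_fs_transaction`, and `allocate` are hypothetical and do not come from the patch.

```python
import threading

# Hypothetical stand-in for a per-thread flag such as PF_FSTRANS.
_thread_state = threading.local()


def in_fs_transaction():
    """Return True if the calling thread is currently marked."""
    return getattr(_thread_state, "fstrans", False)


class FsTransaction:
    """Mark the calling thread for the duration of a with-block."""

    def __enter__(self):
        # Remember the outer state so that nested marks restore correctly.
        self._saved = in_fs_transaction()
        _thread_state.fstrans = True
        return self

    def __exit__(self, exc_type, exc, tb):
        _thread_state.fstrans = self._saved
        return False  # never swallow exceptions


def allocate(events):
    """Toy allocator: records whether filesystem reclaim would be permitted."""
    if in_fs_transaction():
        events.append("alloc: filesystem reclaim skipped")
    else:
        events.append("alloc: filesystem reclaim allowed")


def demo():
    events = []
    seen = {}
    allocate(events)                      # unmarked thread
    with FsTransaction():
        allocate(events)                  # marked thread
        # The mark is per thread: a different thread does not see it.
        t = threading.Thread(
            target=lambda: seen.update(other=in_fs_transaction()))
        t.start()
        t.join()
    allocate(events)                      # mark restored on exit
    return events, seen["other"]
```

Because the flag lives on the thread rather than on each allocation call, code deep in a call chain does not need to be told which allocation mode to use. That property is what the sketch is meant to show.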
The machine in question is running 2.6.39-1 and has failed with this a few times when under load:
Sep 27 14:37:54 cheq-blz-01 kernel: INFO: task z_wr_iss/1:6318 blocked for more than 120 seconds.
Sep 27 14:37:54 cheq-blz-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 27 14:37:54 cheq-blz-01 kernel: z_wr_iss/1 D ffff88083fc519c0 0 6318 2 0x00000000
Sep 27 14:37:55 cheq-blz-01 kernel: ffff88081d657830 0000000000000046 ffff88080ab740c0 ffff880800000001
Sep 27 14:37:55 cheq-blz-01 kernel: ffff88083fd119c0 0000000000000003 ffff88083fc519c0 0000000000000004
Sep 27 14:37:55 cheq-blz-01 kernel: 0000000000000001 ffffffff8102a450 ffff88083fd119c0 ffff88080bbbc7b0
Sep 27 14:37:55 cheq-blz-01 kernel: Call Trace:
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? enqueue_task+0x51/0x5d
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? check_preempt_curr+0x27/0x66
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? try_to_wake_up+0x2f4/0x307
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? prepare_to_wait_exclusive+0x38/0x70
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? cv_wait_common+0x72/0xb8 [spl]
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? wake_up_bit+0x22/0x22
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? __wake_up+0x30/0x44
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? txg_wait_open+0x51/0x63 [zfs]
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? dmu_tx_assign+0xe5/0x331 [zfs]
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? zfs_inactive+0xa3/0x185 [zfs]
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? evict+0x67/0x106
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? dispose_list+0x25/0x33
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? shrink_icache_memory+0x22e/0x25d
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? shrink_slab+0xe1/0x152
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? try_to_free_pages+0x1ce/0x342
Sep 27 14:37:55 cheq-blz-01 kernel: [] ? __alloc_pages_nodemask+0x3bf/0x682
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? kmem_getpages+0x52/0x11a
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? fallback_alloc+0x10f/0x1a9
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? ____cache_alloc_node+0xad/0xef
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? __kmalloc+0xe4/0x154
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? kmem_alloc_debug+0x7c/0xbd [spl]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? kmem_alloc_debug+0x7c/0xbd [spl]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? vdev_raidz_io_start+0x10f/0x533 [zfs]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? zio_create+0x296/0x2a8 [zfs]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? zio_nowait+0xd6/0xf8 [zfs]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? vdev_mirror_io_start+0x2c9/0x309 [zfs]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? vdev_uberblock_sync_done+0x2d/0x2d [zfs]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? zio_vdev_io_start+0x3f/0x235 [zfs]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? zio_execute+0xb4/0xd5 [zfs]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? taskq_thread+0x1b8/0x2a1 [spl]
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? try_to_wake_up+0x307/0x307
Sep 27 14:37:56 cheq-blz-01 kernel: [] ? spl_taskq_init+0x4c/0x4c [spl]
Sep 27 14:37:57 cheq-blz-01 kernel: [] ? kthread+0x7e/0x86
Sep 27 14:37:57 cheq-blz-01 kernel: [] ? kernel_thread_helper+0x4/0x10
Sep 27 14:37:57 cheq-blz-01 kernel: [] ? kthread_stop+0xa6/0xa6
Sep 27 14:37:57 cheq-blz-01 kernel: [] ? gs_change+0xb/0xb
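
Read from the bottom up, the trace suggests that a ZFS write-issue thread allocated memory in `vdev_raidz_io_start`, entered direct reclaim through `try_to_free_pages`, and ended up back in ZFS at `zfs_inactive`, waiting in `txg_wait_open`. Every frame is marked `?`, so the kernel itself does not consider them reliable and this reading is an interpretation. The following Python sketch checks a trace for that pattern. The helper names and the shortened sample are illustrative assumptions, not part of any ZFS tooling.

```python
import re

# Matches frames such as "? dmu_tx_assign+0xe5/0x331 [zfs]".
FRAME = re.compile(
    r"\?\s+(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(lines):
    """Return (function, module) pairs, innermost frame first as printed."""
    frames = []
    for line in lines:
        m = FRAME.search(line)
        if m:
            # Frames without a [module] suffix belong to the core kernel.
            frames.append((m.group("func"), m.group("module") or "kernel"))
    return frames


def reenters_through_reclaim(frames, module="zfs",
                             reclaim_func="try_to_free_pages"):
    """True if `module` appears both inside and outside the reclaim frame."""
    names = [func for func, _ in frames]
    if reclaim_func not in names:
        return False
    i = names.index(reclaim_func)
    inner = any(mod == module for _, mod in frames[:i])      # called from reclaim
    outer = any(mod == module for _, mod in frames[i + 1:])  # triggered reclaim
    return inner and outer


# Shortened sample taken from the trace above (timestamps dropped).
SAMPLE = """\
kernel: [] ? txg_wait_open+0x51/0x63 [zfs]
kernel: [] ? dmu_tx_assign+0xe5/0x331 [zfs]
kernel: [] ? zfs_inactive+0xa3/0x185 [zfs]
kernel: [] ? shrink_icache_memory+0x22e/0x25d
kernel: [] ? try_to_free_pages+0x1ce/0x342
kernel: [] ? __kmalloc+0xe4/0x154
kernel: [] ? vdev_raidz_io_start+0x10f/0x533 [zfs]
""".splitlines()

if __name__ == "__main__":
    print(reenters_through_reclaim(parse_frames(SAMPLE)))
```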