zfs hangs: "zvol blocked for more than 120 secs". #21

Closed
seriv opened this issue Jun 9, 2010 · 14 comments
Comments

@seriv

seriv commented Jun 9, 2010

I have compiled and installed the spl and zfs rpms for the standard FC13 kernel:
2.6.33.5-112.fc13.x86_64 ([email protected]) (gcc version 4.4.4 20100503...
I decided to test how it handles multiple small files with random reads and writes. I formatted a zvol as ext2, mounted it, and used it for Zimbra mail storage. After about an hour, once Zimbra Desktop had pulled about 1 GiB of mail messages from a couple of my accounts, I tried some Zimbra searches and reads of those messages, and all zpool I/O hung.
In /var/log/messages there was:

Jun  9 11:38:25 localhost kernel: INFO: task zvol/1:2305 blocked for more than 120 seconds.
Jun  9 11:38:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:38:25 localhost kernel: zvol/1        D 0000000000000002     0  2305      2 0x00000080
Jun  9 11:38:25 localhost kernel: ffff88007b9693c0 0000000000000046 ffff88007b9694b8 0000000000000008
Jun  9 11:38:25 localhost kernel: 00000000000157c0 00000000914d69f3 0000000000000001 ffff88007b969fd8
Jun  9 11:38:25 localhost kernel: ffff88007b969fd8 000000000000f9b0 00000000000157c0 ffff8800c81649b8
Jun  9 11:38:25 localhost kernel: Call Trace:
Jun  9 11:38:25 localhost kernel: [] ? ktime_get_ts+0xad/0xba
Jun  9 11:38:25 localhost kernel: [] ? sync_page+0x0/0x45
Jun  9 11:38:25 localhost kernel: [] io_schedule+0x6e/0xb0
Jun  9 11:38:25 localhost kernel: [] sync_page+0x41/0x45
Jun  9 11:38:25 localhost kernel: [] __wait_on_bit+0x43/0x76
Jun  9 11:38:25 localhost kernel: [] wait_on_page_bit+0x6d/0x74
Jun  9 11:38:25 localhost kernel: [] ? wake_bit_function+0x0/0x2e
Jun  9 11:38:25 localhost kernel: [] ? block_write_full_page+0x10/0x12
Jun  9 11:38:25 localhost kernel: [] pageout+0x1ce/0x1fe
Jun  9 11:38:25 localhost kernel: [] shrink_page_list+0x260/0x472
Jun  9 11:38:25 localhost kernel: [] ? __delayacct_blkio_end+0x39/0x3b
Jun  9 11:38:25 localhost kernel: [] ? io_schedule_timeout+0x9d/0xc8
Jun  9 11:38:25 localhost kernel: [] ? congestion_wait+0x7b/0x8a
Jun  9 11:38:25 localhost kernel: [] ? autoremove_wake_function+0x0/0x34
Jun  9 11:38:25 localhost kernel: [] shrink_inactive_list+0x455/0x675
Jun  9 11:38:25 localhost kernel: [] ? shrink_active_list+0x27a/0x29b
Jun  9 11:38:25 localhost kernel: [] ? determine_dirtyable_memory+0x15/0x27
Jun  9 11:38:25 localhost kernel: [] ? get_dirty_limits+0x22/0x24d
Jun  9 11:38:25 localhost kernel: [] shrink_zone+0x358/0x40c
Jun  9 11:38:25 localhost kernel: [] ? shrink_slab+0x145/0x157
Jun  9 11:38:25 localhost kernel: [] do_try_to_free_pages+0x20e/0x381
Jun  9 11:38:25 localhost kernel: [] try_to_free_pages+0x6a/0x6c
Jun  9 11:38:25 localhost kernel: [] ? isolate_pages_global+0x0/0x1f2
Jun  9 11:38:25 localhost kernel: [] __alloc_pages_nodemask+0x3e6/0x630
Jun  9 11:38:25 localhost kernel: [] alloc_pages_current+0x90/0x99
Jun  9 11:38:25 localhost kernel: [] __get_free_pages+0x9/0x46
Jun  9 11:38:25 localhost kernel: [] kv_alloc+0x28/0x40 [spl]
Jun  9 11:38:25 localhost kernel: [] spl_kmem_cache_alloc+0x135/0x49b [spl]
Jun  9 11:38:25 localhost kernel: [] dbuf_create+0x3e/0x2dc [zfs]
Jun  9 11:38:25 localhost kernel: [] ? dbuf_findbp+0xaa/0x12c [zfs]
Jun  9 11:38:25 localhost kernel: [] dbuf_hold_impl+0xce/0x1fc [zfs]
Jun  9 11:38:25 localhost kernel: [] dbuf_hold+0x1b/0x29 [zfs]
Jun  9 11:38:25 localhost kernel: [] dmu_buf_hold_array_by_dnode+0x182/0x2e9 [zfs]
Jun  9 11:38:25 localhost kernel: [] dmu_buf_hold_array+0x5a/0x7b [zfs]
Jun  9 11:38:25 localhost kernel: [] dmu_write_req+0x55/0x14a [zfs]
Jun  9 11:38:25 localhost kernel: [] zvol_write+0xd9/0x371 [zfs]
Jun  9 11:38:25 localhost kernel: [] taskq_thread+0x15c/0x22b [spl]
Jun  9 11:38:25 localhost kernel: [] ? default_wake_function+0x0/0xf
Jun  9 11:38:25 localhost kernel: [] ? taskq_thread+0x0/0x22b [spl]
Jun  9 11:38:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:38:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:38:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:38:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
Jun  9 11:38:25 localhost kernel: INFO: task txg_quiesce:3528 blocked for more than 120 seconds.
Jun  9 11:38:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:38:25 localhost kernel: txg_quiesce   D 0000000000000000     0  3528      2 0x00000080
Jun  9 11:38:25 localhost kernel: ffff8800c02c1d50 0000000000000046 0000000000000000 ffff880123959f80
Jun  9 11:38:25 localhost kernel: 00000000000157c0 0000000000000000 ffff880123959f80 ffff8800c02c1fd8
Jun  9 11:38:25 localhost kernel: ffff8800c02c1fd8 000000000000f9b0 00000000000157c0 ffff88004f636108
Jun  9 11:38:25 localhost kernel: Call Trace:

Jun  9 11:38:25 localhost kernel: [] cv_wait_common+0x85/0xa9 [spl]
Jun  9 11:38:25 localhost kernel: [] ? autoremove_wake_function+0x0/0x34
Jun  9 11:38:25 localhost kernel: [] __cv_wait+0xe/0x10 [spl]
Jun  9 11:38:25 localhost kernel: [] txg_wait_exit+0x4a/0x78 [zfs]
Jun  9 11:38:25 localhost kernel: [] txg_quiesce_thread+0x115/0x13f [zfs]
Jun  9 11:38:25 localhost kernel: [] ? txg_quiesce_thread+0x0/0x13f [zfs]
Jun  9 11:38:25 localhost kernel: [] thread_generic_wrapper+0x6c/0x79 [spl]
Jun  9 11:38:25 localhost kernel: [] ? thread_generic_wrapper+0x0/0x79 [spl]
Jun  9 11:38:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:38:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:38:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:38:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
Jun  9 11:38:25 localhost kernel: INFO: task txg_sync:3529 blocked for more than 120 seconds.
Jun  9 11:38:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:38:25 localhost kernel: txg_sync      D 0000000000000000     0  3529      2 0x00000080
Jun  9 11:38:25 localhost kernel: ffff880075937d60 0000000000000046 0000000000000000 ffff880005a557c0
Jun  9 11:38:25 localhost kernel: ffff880075937cd0 ffffffff81035c68 ffff880075937cd0 ffff880075937fd8
Jun  9 11:38:25 localhost kernel: ffff880075937fd8 000000000000f9b0 00000000000157c0 ffff88004f633268
Jun  9 11:38:25 localhost kernel: Call Trace:
Jun  9 11:38:25 localhost kernel: [] ? activate_task+0x28/0x30
Jun  9 11:38:25 localhost kernel: [] cv_wait_common+0x85/0xa9 [spl]
Jun  9 11:38:25 localhost kernel: [] ? autoremove_wake_function+0x0/0x34
Jun  9 11:38:25 localhost kernel: [] ? __wake_up+0x3f/0x48
Jun  9 11:38:25 localhost kernel: [] __cv_wait+0xe/0x10 [spl]
Jun  9 11:38:25 localhost kernel: [] txg_thread_wait+0x2a/0x2c [zfs]
Jun  9 11:38:25 localhost kernel: [] txg_sync_thread+0x153/0x365 [zfs]
Jun  9 11:38:25 localhost kernel: [] ? txg_sync_thread+0x0/0x365 [zfs]
Jun  9 11:38:25 localhost kernel: [] thread_generic_wrapper+0x6c/0x79 [spl]
Jun  9 11:38:25 localhost kernel: [] ? thread_generic_wrapper+0x0/0x79 [spl]
Jun  9 11:38:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:38:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:38:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:38:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
Jun  9 11:38:25 localhost kernel: INFO: task flush-230:0:16377 blocked for more than 120 seconds.
Jun  9 11:38:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:38:25 localhost kernel: flush-230:0   D ffffffff81121e08     0 16377      2 0x00000080
Jun  9 11:38:25 localhost kernel: ffff8800750c1900 0000000000000046 ffff8800750c1860 ffff8801243a58c0
Jun  9 11:38:25 localhost kernel: ffff8801148d2930 ffff8801243a58c0 ffff8800750c1890 ffff8800750c1fd8
Jun  9 11:38:25 localhost kernel: ffff8800750c1fd8 000000000000f9b0 00000000000157c0 ffff8800be50c9b8
Jun  9 11:38:25 localhost kernel: Call Trace:

Jun  9 11:38:25 localhost kernel: [] ? ktime_get_ts+0xad/0xba
Jun  9 11:38:25 localhost kernel: [] ? sync_buffer+0x0/0x3f
Jun  9 11:38:25 localhost kernel: [] io_schedule+0x6e/0xb0
Jun  9 11:38:25 localhost kernel: [] sync_buffer+0x3b/0x3f
Jun  9 11:38:25 localhost kernel: [] __wait_on_bit_lock+0x41/0x8a
Jun  9 11:38:25 localhost kernel: [] out_of_line_wait_on_bit_lock+0x69/0x74
Jun  9 11:38:25 localhost kernel: [] ? sync_buffer+0x0/0x3f
Jun  9 11:38:25 localhost kernel: [] ? wake_bit_function+0x0/0x2e
Jun  9 11:38:25 localhost kernel: [] __lock_buffer+0x25/0x27
Jun  9 11:38:25 localhost kernel: [] lock_buffer+0x37/0x3c
Jun  9 11:38:25 localhost kernel: [] __block_write_full_page+0x14f/0x2a8
Jun  9 11:38:25 localhost kernel: [] ? blkdev_get_block+0x0/0x5e
Jun  9 11:38:25 localhost kernel: [] ? end_buffer_async_write+0x0/0x130
Jun  9 11:38:25 localhost kernel: [] ? end_buffer_async_write+0x0/0x130
Jun  9 11:38:25 localhost kernel: [] ? blkdev_get_block+0x0/0x5e
Jun  9 11:38:25 localhost kernel: [] block_write_full_page_endio+0x84/0x90
Jun  9 11:38:25 localhost kernel: [] block_write_full_page+0x10/0x12
Jun  9 11:38:25 localhost kernel: [] blkdev_writepage+0x13/0x15
Jun  9 11:38:25 localhost kernel: [] __writepage+0x12/0x2b
Jun  9 11:38:25 localhost kernel: [] write_cache_pages+0x20a/0x340
Jun  9 11:38:25 localhost kernel: [] ? __writepage+0x0/0x2b
Jun  9 11:38:25 localhost kernel: [] generic_writepages+0x1f/0x21
Jun  9 11:38:25 localhost kernel: [] do_writepages+0x1c/0x25
Jun  9 11:38:25 localhost kernel: [] writeback_single_inode+0xbe/0x203
Jun  9 11:38:25 localhost kernel: [] writeback_inodes_wb+0x2ef/0x380
Jun  9 11:38:25 localhost kernel: [] wb_writeback+0x13d/0x1bc
Jun  9 11:38:25 localhost kernel: [] wb_do_writeback+0x1a8/0x1be
Jun  9 11:38:25 localhost kernel: [] bdi_writeback_task+0x3a/0xab
Jun  9 11:38:25 localhost kernel: [] ? bdi_start_fn+0x0/0xd4
Jun  9 11:38:25 localhost kernel: [] bdi_start_fn+0x6c/0xd4
Jun  9 11:38:25 localhost kernel: [] ? bdi_start_fn+0x0/0xd4
Jun  9 11:38:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:38:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:38:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:38:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
Jun  9 11:38:25 localhost kernel: INFO: task java:16695 blocked for more than 120 seconds.
Jun  9 11:38:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:38:25 localhost kernel: java          D 0000000000000002     0 16695      1 0x00020080
Jun  9 11:38:25 localhost kernel: ffff880033ea5cc8 0000000000000082 0000000000000000 ffff880037be83c0
Jun  9 11:38:25 localhost kernel: ffff8801243a58c0 0000000000000001 ffff880033ea5c38 ffff880033ea5fd8
Jun  9 11:38:25 localhost kernel: ffff880033ea5fd8 000000000000f9b0 00000000000157c0 ffff880117006108
Jun  9 11:38:25 localhost kernel: Call Trace:

Jun  9 11:38:25 localhost kernel: [] ? ktime_get_ts+0xad/0xba
Jun  9 11:38:25 localhost kernel: [] ? sync_page+0x0/0x45
Jun  9 11:38:25 localhost kernel: [] io_schedule+0x6e/0xb0
Jun  9 11:38:25 localhost kernel: [] sync_page+0x41/0x45
Jun  9 11:38:25 localhost kernel: [] __wait_on_bit+0x43/0x76
Jun  9 11:38:25 localhost kernel: [] wait_on_page_bit+0x6d/0x74
Jun  9 11:38:25 localhost kernel: [] ? wake_bit_function+0x0/0x2e
Jun  9 11:38:25 localhost kernel: [] ? pagevec_lookup_tag+0x20/0x29
Jun  9 11:38:25 localhost kernel: [] filemap_fdatawait_range+0x9f/0x173
Jun  9 11:38:25 localhost kernel: [] filemap_write_and_wait_range+0x3e/0x51
Jun  9 11:38:25 localhost kernel: [] vfs_fsync_range+0x5a/0xad
Jun  9 11:38:25 localhost kernel: [] vfs_fsync+0x18/0x1a
Jun  9 11:38:25 localhost kernel: [] do_fsync+0x2f/0x44
Jun  9 11:38:25 localhost kernel: [] sys_fsync+0xb/0xf
Jun  9 11:38:25 localhost kernel: [] sysenter_dispatch+0x7/0x2e
Jun  9 11:38:25 localhost kernel: INFO: task java:16793 blocked for more than 120 seconds.
Jun  9 11:38:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:38:25 localhost kernel: java          D 0000000000000002     0 16793      1 0x00020080
Jun  9 11:38:25 localhost kernel: ffff88004fad1bf8 0000000000000082 ffff8801243a58c0 00000000000004c1
Jun  9 11:38:25 localhost kernel: ffff8801148d3e30 0000000000000092 ffff88004fad1b88 ffff88004fad1fd8
Jun  9 11:38:25 localhost kernel: ffff88004fad1fd8 000000000000f9b0 00000000000157c0 ffff88012623c9b8
Jun  9 11:38:25 localhost kernel: Call Trace:
Jun  9 11:38:25 localhost kernel: [] ? ktime_get_ts+0xad/0xba
Jun  9 11:38:25 localhost kernel: [] ? sync_buffer+0x0/0x3f
Jun  9 11:38:25 localhost kernel: [] io_schedule+0x6e/0xb0
Jun  9 11:38:25 localhost kernel: [] sync_buffer+0x3b/0x3f
Jun  9 11:38:25 localhost kernel: [] __wait_on_bit+0x43/0x76
Jun  9 11:38:25 localhost kernel: [] out_of_line_wait_on_bit+0x69/0x74
Jun  9 11:38:25 localhost kernel: [] ? sync_buffer+0x0/0x3f
Jun  9 11:38:25 localhost kernel: [] ? wake_bit_function+0x0/0x2e
Jun  9 11:38:25 localhost kernel: [] __wait_on_buffer+0x1f/0x21
Jun  9 11:38:25 localhost kernel: [] wait_on_buffer+0x38/0x3c
Jun  9 11:38:25 localhost kernel: [] sync_dirty_buffer+0x5b/0x94
Jun  9 11:38:25 localhost kernel: [] ext2_write_inode+0x291/0x319 [ext2]
Jun  9 11:38:25 localhost kernel: [] writeback_single_inode+0x11f/0x203
Jun  9 11:38:25 localhost kernel: [] sync_inode+0x28/0x36
Jun  9 11:38:25 localhost kernel: [] simple_fsync+0x58/0x6a
Jun  9 11:38:25 localhost kernel: [] ext2_fsync+0x2a/0x5d [ext2]
Jun  9 11:38:25 localhost kernel: [] vfs_fsync_range+0x7d/0xad
Jun  9 11:38:25 localhost kernel: [] vfs_fsync+0x18/0x1a
Jun  9 11:38:25 localhost kernel: [] do_fsync+0x2f/0x44
Jun  9 11:38:25 localhost kernel: [] sys_fsync+0xb/0xf
Jun  9 11:38:25 localhost kernel: [] sysenter_dispatch+0x7/0x2e
Jun  9 11:40:25 localhost kernel: INFO: task zvol/1:2305 blocked for more than 120 seconds.
Jun  9 11:40:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:40:25 localhost kernel: zvol/1        D 0000000000000002     0  2305      2 0x00000080
Jun  9 11:40:25 localhost kernel: ffff88007b9693c0 0000000000000046 ffff88007b9694b8 0000000000000008
Jun  9 11:40:25 localhost kernel: 00000000000157c0 00000000914d69f3 0000000000000001 ffff88007b969fd8
Jun  9 11:40:25 localhost kernel: ffff88007b969fd8 000000000000f9b0 00000000000157c0 ffff8800c81649b8
Jun  9 11:40:25 localhost kernel: Call Trace:
Jun  9 11:40:25 localhost kernel: [] ? ktime_get_ts+0xad/0xba
Jun  9 11:40:25 localhost kernel: [] ? sync_page+0x0/0x45
Jun  9 11:40:25 localhost kernel: [] io_schedule+0x6e/0xb0
Jun  9 11:40:25 localhost kernel: [] sync_page+0x41/0x45
Jun  9 11:40:25 localhost kernel: [] __wait_on_bit+0x43/0x76
Jun  9 11:40:25 localhost kernel: [] wait_on_page_bit+0x6d/0x74
Jun  9 11:40:25 localhost kernel: [] ? wake_bit_function+0x0/0x2e
Jun  9 11:40:25 localhost kernel: [] ? block_write_full_page+0x10/0x12
Jun  9 11:40:25 localhost kernel: [] pageout+0x1ce/0x1fe
Jun  9 11:40:25 localhost kernel: [] shrink_page_list+0x260/0x472
Jun  9 11:40:25 localhost kernel: [] ? __delayacct_blkio_end+0x39/0x3b
Jun  9 11:40:25 localhost kernel: [] ? io_schedule_timeout+0x9d/0xc8
Jun  9 11:40:25 localhost kernel: [] ? congestion_wait+0x7b/0x8a
Jun  9 11:40:25 localhost kernel: [] ? autoremove_wake_function+0x0/0x34
Jun  9 11:40:25 localhost kernel: [] shrink_inactive_list+0x455/0x675
Jun  9 11:40:25 localhost kernel: [] ? shrink_active_list+0x27a/0x29b
Jun  9 11:40:25 localhost kernel: [] ? determine_dirtyable_memory+0x15/0x27
Jun  9 11:40:25 localhost kernel: [] ? get_dirty_limits+0x22/0x24d
Jun  9 11:40:25 localhost kernel: [] shrink_zone+0x358/0x40c
Jun  9 11:40:25 localhost kernel: [] ? shrink_slab+0x145/0x157
Jun  9 11:40:25 localhost kernel: [] do_try_to_free_pages+0x20e/0x381
Jun  9 11:40:25 localhost kernel: [] try_to_free_pages+0x6a/0x6c
Jun  9 11:40:25 localhost kernel: [] ? isolate_pages_global+0x0/0x1f2
Jun  9 11:40:25 localhost kernel: [] __alloc_pages_nodemask+0x3e6/0x630
Jun  9 11:40:25 localhost kernel: [] alloc_pages_current+0x90/0x99
Jun  9 11:40:25 localhost kernel: [] __get_free_pages+0x9/0x46
Jun  9 11:40:25 localhost kernel: [] kv_alloc+0x28/0x40 [spl]
Jun  9 11:40:25 localhost kernel: [] spl_kmem_cache_alloc+0x135/0x49b [spl]
Jun  9 11:40:25 localhost kernel: [] dbuf_create+0x3e/0x2dc [zfs]
Jun  9 11:40:25 localhost kernel: [] ? dbuf_findbp+0xaa/0x12c [zfs]
Jun  9 11:40:25 localhost kernel: [] dbuf_hold_impl+0xce/0x1fc [zfs]
Jun  9 11:40:25 localhost kernel: [] dbuf_hold+0x1b/0x29 [zfs]
Jun  9 11:40:25 localhost kernel: [] dmu_buf_hold_array_by_dnode+0x182/0x2e9 [zfs]
Jun  9 11:40:25 localhost kernel: [] dmu_buf_hold_array+0x5a/0x7b [zfs]
Jun  9 11:40:25 localhost kernel: [] dmu_write_req+0x55/0x14a [zfs]
Jun  9 11:40:25 localhost kernel: [] zvol_write+0xd9/0x371 [zfs]
Jun  9 11:40:25 localhost kernel: [] taskq_thread+0x15c/0x22b [spl]
Jun  9 11:40:25 localhost kernel: [] ? default_wake_function+0x0/0xf
Jun  9 11:40:25 localhost kernel: [] ? taskq_thread+0x0/0x22b [spl]
Jun  9 11:40:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:40:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:40:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:40:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
Jun  9 11:40:25 localhost kernel: INFO: task txg_quiesce:3528 blocked for more than 120 seconds.
Jun  9 11:40:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:40:25 localhost kernel: txg_quiesce   D 0000000000000000     0  3528      2 0x00000080
Jun  9 11:40:25 localhost kernel: ffff8800c02c1d50 0000000000000046 0000000000000000 ffff880123959f80
Jun  9 11:40:25 localhost kernel: 00000000000157c0 0000000000000000 ffff880123959f80 ffff8800c02c1fd8
Jun  9 11:40:25 localhost kernel: ffff8800c02c1fd8 000000000000f9b0 00000000000157c0 ffff88004f636108
Jun  9 11:40:25 localhost kernel: Call Trace:

Jun  9 11:40:25 localhost kernel: [] cv_wait_common+0x85/0xa9 [spl]
Jun  9 11:40:25 localhost kernel: [] ? autoremove_wake_function+0x0/0x34
Jun  9 11:40:25 localhost kernel: [] __cv_wait+0xe/0x10 [spl]
Jun  9 11:40:25 localhost kernel: [] txg_wait_exit+0x4a/0x78 [zfs]
Jun  9 11:40:25 localhost kernel: [] txg_quiesce_thread+0x115/0x13f [zfs]
Jun  9 11:40:25 localhost kernel: [] ? txg_quiesce_thread+0x0/0x13f [zfs]
Jun  9 11:40:25 localhost kernel: [] thread_generic_wrapper+0x6c/0x79 [spl]
Jun  9 11:40:25 localhost kernel: [] ? thread_generic_wrapper+0x0/0x79 [spl]
Jun  9 11:40:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:40:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:40:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:40:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
Jun  9 11:40:25 localhost kernel: INFO: task txg_sync:3529 blocked for more than 120 seconds.
Jun  9 11:40:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:40:25 localhost kernel: txg_sync      D 0000000000000000     0  3529      2 0x00000080
Jun  9 11:40:25 localhost kernel: ffff880075937d60 0000000000000046 0000000000000000 ffff880005a557c0
Jun  9 11:40:25 localhost kernel: ffff880075937cd0 ffffffff81035c68 ffff880075937cd0 ffff880075937fd8
Jun  9 11:40:25 localhost kernel: ffff880075937fd8 000000000000f9b0 00000000000157c0 ffff88004f633268
Jun  9 11:40:25 localhost kernel: Call Trace:
Jun  9 11:40:25 localhost kernel: [] ? activate_task+0x28/0x30
Jun  9 11:40:25 localhost kernel: [] cv_wait_common+0x85/0xa9 [spl]
Jun  9 11:40:25 localhost kernel: [] ? autoremove_wake_function+0x0/0x34
Jun  9 11:40:25 localhost kernel: [] ? __wake_up+0x3f/0x48
Jun  9 11:40:25 localhost kernel: [] __cv_wait+0xe/0x10 [spl]
Jun  9 11:40:25 localhost kernel: [] txg_thread_wait+0x2a/0x2c [zfs]
Jun  9 11:40:25 localhost kernel: [] txg_sync_thread+0x153/0x365 [zfs]
Jun  9 11:40:25 localhost kernel: [] ? txg_sync_thread+0x0/0x365 [zfs]
Jun  9 11:40:25 localhost kernel: [] thread_generic_wrapper+0x6c/0x79 [spl]
Jun  9 11:40:25 localhost kernel: [] ? thread_generic_wrapper+0x0/0x79 [spl]
Jun  9 11:40:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:40:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:40:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:40:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
Jun  9 11:40:25 localhost kernel: INFO: task flush-230:0:16377 blocked for more than 120 seconds.
Jun  9 11:40:25 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  9 11:40:25 localhost kernel: flush-230:0   D ffffffff81121e08     0 16377      2 0x00000080
Jun  9 11:40:25 localhost kernel: ffff8800750c1900 0000000000000046 ffff8800750c1860 ffff8801243a58c0
Jun  9 11:40:25 localhost kernel: ffff8801148d2930 ffff8801243a58c0 ffff8800750c1890 ffff8800750c1fd8
Jun  9 11:40:25 localhost kernel: ffff8800750c1fd8 000000000000f9b0 00000000000157c0 ffff8800be50c9b8

Jun  9 11:40:25 localhost kernel: Call Trace:
Jun  9 11:40:25 localhost kernel: [] ? ktime_get_ts+0xad/0xba
Jun  9 11:40:25 localhost kernel: [] ? sync_buffer+0x0/0x3f
Jun  9 11:40:25 localhost kernel: [] io_schedule+0x6e/0xb0
Jun  9 11:40:25 localhost kernel: [] sync_buffer+0x3b/0x3f
Jun  9 11:40:25 localhost kernel: [] __wait_on_bit_lock+0x41/0x8a
Jun  9 11:40:25 localhost kernel: [] out_of_line_wait_on_bit_lock+0x69/0x74
Jun  9 11:40:25 localhost kernel: [] ? sync_buffer+0x0/0x3f
Jun  9 11:40:25 localhost kernel: [] ? wake_bit_function+0x0/0x2e
Jun  9 11:40:25 localhost kernel: [] __lock_buffer+0x25/0x27
Jun  9 11:40:25 localhost kernel: [] lock_buffer+0x37/0x3c
Jun  9 11:40:25 localhost kernel: [] __block_write_full_page+0x14f/0x2a8
Jun  9 11:40:25 localhost kernel: [] ? blkdev_get_block+0x0/0x5e
Jun  9 11:40:25 localhost kernel: [] ? end_buffer_async_write+0x0/0x130
Jun  9 11:40:25 localhost kernel: [] ? end_buffer_async_write+0x0/0x130
Jun  9 11:40:25 localhost kernel: [] ? blkdev_get_block+0x0/0x5e
Jun  9 11:40:25 localhost kernel: [] block_write_full_page_endio+0x84/0x90
Jun  9 11:40:25 localhost kernel: [] block_write_full_page+0x10/0x12
Jun  9 11:40:25 localhost kernel: [] blkdev_writepage+0x13/0x15
Jun  9 11:40:25 localhost kernel: [] __writepage+0x12/0x2b
Jun  9 11:40:25 localhost kernel: [] write_cache_pages+0x20a/0x340
Jun  9 11:40:25 localhost kernel: [] ? __writepage+0x0/0x2b
Jun  9 11:40:25 localhost kernel: [] generic_writepages+0x1f/0x21
Jun  9 11:40:25 localhost kernel: [] do_writepages+0x1c/0x25
Jun  9 11:40:25 localhost kernel: [] writeback_single_inode+0xbe/0x203
Jun  9 11:40:25 localhost kernel: [] writeback_inodes_wb+0x2ef/0x380
Jun  9 11:40:25 localhost kernel: [] wb_writeback+0x13d/0x1bc
Jun  9 11:40:25 localhost kernel: [] wb_do_writeback+0x1a8/0x1be
Jun  9 11:40:25 localhost kernel: [] bdi_writeback_task+0x3a/0xab
Jun  9 11:40:25 localhost kernel: [] ? bdi_start_fn+0x0/0xd4
Jun  9 11:40:25 localhost kernel: [] bdi_start_fn+0x6c/0xd4
Jun  9 11:40:25 localhost kernel: [] ? bdi_start_fn+0x0/0xd4
Jun  9 11:40:25 localhost kernel: [] kthread+0x7a/0x82
Jun  9 11:40:25 localhost kernel: [] kernel_thread_helper+0x4/0x10
Jun  9 11:40:25 localhost kernel: [] ? kthread+0x0/0x82
Jun  9 11:40:25 localhost kernel: [] ? kernel_thread_helper+0x0/0x10
@behlendorf
Contributor

Thanks for the bug report. The first process is performing memory reclaim, which should be OK, but it is hung. At the moment the ZVOL uses twice as much memory as it should, because a copy of the data ends up in the ARC as well as in the Linux buffer cache. With stock distro kernels there's no easy way to disable the Linux buffer cache, but you can disable ARC data caching, which helps. Can you try this and let me know:

zfs set primarycache=metadata
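A note on usage: zfs set takes a dataset argument, so applying the suggestion to a hypothetical pool/volume (the names below are placeholders, not from this thread) would look like:

zfs set primarycache=metadata tank/vol0
zfs get primarycache tank/vol0

The first command stops ZFS from caching the zvol's data blocks in the ARC (metadata is still cached); the second verifies the property took effect.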

@seriv
Author

seriv commented Jun 10, 2010

Hi Brian, I am sorry, I can't.
It seems I got spl built and working with 2.6.33.5-112.fc13.x86_64 'accidentally':
I think I did not run 'make clean' when switching from one kernel to another, and I did not save the good rpms. Now the sources from http://github.com/seriv/spl.git build successfully, but I can't insmod spl:

Jun 10 09:45:23 localhost kernel: Error: Unknown symbol get_zone_counts
Jun 10 09:45:23 localhost kernel: SPL: Failed to Load Solaris Porting Layer v0.4.9, rc = -14

@behlendorf
Contributor

I would suggest running 'make distclean' in both the spl and zfs trees, re-running configure, and rebuilding both projects to see if you get the same result.

@seriv
Author

seriv commented Jun 10, 2010

No, it's the same:

make distclean && sh autogen.sh &&  ./configure && make && sudo make check
...

make[1]: Entering directory `/home/seriv/work/seriv/spl/scripts'
./check.sh
Loading ../module/spl/spl.ko
insmod: error inserting '../module/spl/spl.ko': -1 Bad address
check.sh: Failed to load ../module/spl/spl.ko
make[1]: *** [check] Error 1
make[1]: Leaving directory `/home/seriv/work/seriv/spl/scripts'
make: *** [check-recursive] Error 1

$ sudo tail -2 /var/log/messages

Jun 10 17:13:49 gauntlet kernel: Error: Unknown symbol get_zone_counts
Jun 10 17:13:49 gauntlet kernel: SPL: Failed to Load Solaris Porting Layer v0.4.9, rc = -14

@seriv
Author

seriv commented Jun 11, 2010

Hi Brian, I've done it: on FC12, with zfs set primarycache=metadata, on the same hardware and with the imported zpool, the same tasks are running fine. Thank you for the advice.

@behlendorf
Contributor

Seriv, thanks for the update and for sticking with it. This is something I certainly hope to resolve for the next tag.

@seriv
Author

seriv commented Jun 15, 2010

Hi Brian, was it a mistake in spl's config/spl-build.m4, or did I accidentally get it compiled just for one particular kernel version? See http://github.com/seriv/spl/commit/3841d8eb6dd1863c2f7bbf51c945b7b06d26fe2e

@wizeman

wizeman commented Jul 6, 2010

I believe the problem here is that one (or more) of the ZFS threads is getting deadlocked.

This is likely happening because one of the ZFS threads that does I/O is trying to allocate memory, and under memory pressure the allocation triggers dirty data being synced to the ZVol. However, that dirty data can't be synced until the original thread proceeds with its I/O, and the thread can't proceed because it is blocked waiting for the data to sync.

I have observed a very similar issue while testing Lustre on top of the DMU.

The proper solution in our case was to change this line in spl/include/sys/kmem.h:

#define KM_SLEEP GFP_KERNEL

Into this:

#define KM_SLEEP GFP_NOFS

This causes the ZFS threads not to call filesystem code, which in the Lustre case was causing a deadlock.

However, given that I don't see filesystem code in the stack trace, I think that for the ZVol case to work correctly the proper solution may be even more extreme, and the line above may have to be changed into this:

#define KM_SLEEP GFP_NOIO

I think that will fix these hangs.
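To make the distinction concrete, here is a minimal illustrative sketch (standard Linux GFP semantics, not code from the proposed patch; the function name is made up):

#include <linux/slab.h>   /* kmalloc(), GFP_* flags */

/* Illustrative only.
 * GFP_KERNEL: direct reclaim may write back dirty filesystem pages and issue
 *             block I/O, so it can re-enter ZFS/zvol code and deadlock.
 * GFP_NOFS:   __GFP_FS is cleared; reclaim will not call into filesystems.
 * GFP_NOIO:   __GFP_IO is also cleared; reclaim will not issue block I/O,
 *             which also covers writeback aimed at a zvol block device. */
static void *zvol_path_alloc(size_t sz)
{
	return kmalloc(sz, GFP_NOIO);  /* was effectively GFP_KERNEL via KM_SLEEP */
}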

@wizeman

wizeman commented Jul 6, 2010

A colleague just pointed out to me that shrink_page_list() only calls pageout() if this condition is true:

             may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
                     (PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));

Which means that it should be OK to use GFP_NOFS unless the ZVol is a swap device.

However, you shouldn't even be using ZVols as swap devices anyway, at least not without a proper implementation, because the ZFS threads are very allocation-intensive, which doesn't play well when you're swapping out. I think even Solaris doesn't rely on the ZFS threads for swapping out to a ZVol (which is probably why it requires compression and checksumming to be disabled, if I'm not mistaken).
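For reference, the flag definitions in kernels of that era were roughly as follows (paraphrased from include/linux/gfp.h circa 2.6.33; check the exact tree before relying on this):

#define GFP_NOIO   (__GFP_WAIT)
#define GFP_NOFS   (__GFP_WAIT | __GFP_IO)
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)

With GFP_NOFS, sc->gfp_mask lacks __GFP_FS, so may_enter_fs in the condition above is false for regular pages and pageout() is skipped, unless the page is in the swap cache and __GFP_IO is set.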

@behlendorf
Contributor

Thanks wizeman, I think you're right; that would nicely explain the problem. That's a pretty heavy-handed fix, though; it would be ideal if we could just identify the offending allocations, add a KM_NOFS flag, and use it as appropriate. It looks like the offending allocation was the kmem_cache_alloc() in dbuf_create() this time, but I'm sure there are quite a few other places where the deadlock is possible.

@wizeman

wizeman commented Jul 7, 2010

The problem is that almost any allocation might trigger the deadlock.

For example, any allocation in the ZIO path (including the compression and checksum phases, etc.) has to be GFP_NOFS, because otherwise the ZIOs will deadlock waiting for data to be synced (which needs the ZIOs to complete).

Similarly, any allocation during any phase of the txg train (in the open txg, the quiescing txg, or the syncing txg) cannot block waiting for new data to sync: the data that needs to sync might not fit in the currently-open txg, but at the same time you may not be able to open a new txg because you'd need the txg train to advance, which cannot happen while you're blocked trying to allocate memory. This one alone probably rules out more than 90% of the code, and 99% of the allocations. :-) It includes practically all allocations in the DMU code, dbufs, nvpairs, dsl code, spa, metaslabs, dnodes, spacemaps, ZAPs, raidz, zil, zvol, objset, etc.; all of that is needed to proceed with syncing a txg.

Also, probably all the code in the ARC, including the thread that reclaims memory from the ARC, has to use GFP_NOFS; otherwise the code could block while holding ARC locks, which could prevent I/Os from completing.

TBH I have trouble thinking of any allocations which should not use GFP_NOFS; even the ZPL code will probably need it in most cases, if blocking on the allocation could prevent other data from being synced.

@behlendorf
Contributor

Yes, I agree, but I feel compelled to point out that we'll basically be removing almost the entire synchronous memory reclaim path for ZFS. The only way we'll end up performing reclaim will be through the async reclaim threads, or via another non-ZFS process in the kernel. The consequence of this will likely be that memory allocations take longer when the system is low on memory, since our allocation may need to block until the async thread performs the reclaim.

Anyway, this is still much, much better than deadlocking, so let's give it a try and see what the actual consequences are.

@behlendorf
Contributor

OK, wizeman's proposed fix has been committed to the SPL as behlendorf/spl@82b8c8f. Hopefully, that will resolve this issue. I'm closing the bug until we see it again.
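For reference, the fix as proposed above amounts to a one-line change in spl/include/sys/kmem.h (shown here as a diff; the committed patch may differ in detail):

-#define KM_SLEEP GFP_KERNEL
+#define KM_SLEEP GFP_NOFS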

@zeigerpuppy

I am getting this error reproducibly when sending many files by rsync over ssh to a Xen virtual machine backed by a large sparse ZVOL. Debian Wheezy/unstable amd64, zfsonlinux 2~wheezy.
The zvol was created with: zfs create -V 1TB -b 4096 -s /zpool1/big
It is formatted with ext4; the Dom0 running zfsonlinux has 32GB of RAM and 30GB of swap on a different zpool (SSD drive).
The data is mixed, mostly small files, and the error occurs after sending between 11 and 60GB of data at a rate of about 40MB/sec. It seemed to occur later when I increased the DomU RAM from 24 to 32GB.

behlendorf added a commit to behlendorf/zfs that referenced this issue Jun 24, 2016
Flag #20 was used in OpenZFS as DMU_BACKUP_FEATURE_RESUMING, the
DMU_BACKUP_FEATURE_LARGE_DNODE flag must be shifted to #21 and
reserved in the OpenZFS implementation.

Signed-off-by: Brian Behlendorf <[email protected]>
akatrevorjay added a commit to akatrevorjay/zfs that referenced this issue Dec 16, 2017
markroper added a commit to markroper/zfs that referenced this issue Feb 12, 2020
Using zfs with Lustre, an arc_read can trigger kernel memory allocation
that in turn leads to a memory reclaim callback and a deadlock within a
single zfs process. This change uses spl_fstrans_mark and
spl_fstrans_unmark to prevent the reclaim attempt and the deadlock
(https://zfsonlinux.topicbox.com/groups/zfs-devel/T4db2c705ec1804ba).
The stack trace observed is:

     #0 [ffffc9002b98adc8] __schedule at ffffffff81610f2e
     #1 [ffffc9002b98ae68] schedule at ffffffff81611558
     #2 [ffffc9002b98ae70] schedule_preempt_disabled at ffffffff8161184a
     #3 [ffffc9002b98ae78] __mutex_lock at ffffffff816131e8
     #4 [ffffc9002b98af18] arc_buf_destroy at ffffffffa0bf37d7 [zfs]
     #5 [ffffc9002b98af48] dbuf_destroy at ffffffffa0bfa6fe [zfs]
     #6 [ffffc9002b98af88] dbuf_evict_one at ffffffffa0bfaa96 [zfs]
     #7 [ffffc9002b98afa0] dbuf_rele_and_unlock at ffffffffa0bfa561 [zfs]
     #8 [ffffc9002b98b050] dbuf_rele_and_unlock at ffffffffa0bfa32b [zfs]
     #9 [ffffc9002b98b100] osd_object_delete at ffffffffa0b64ecc [osd_zfs]
    #10 [ffffc9002b98b118] lu_object_free at ffffffffa06d6a74 [obdclass]
    #11 [ffffc9002b98b178] lu_site_purge_objects at ffffffffa06d7fc1 [obdclass]
    #12 [ffffc9002b98b220] lu_cache_shrink_scan at ffffffffa06d81b8 [obdclass]
    #13 [ffffc9002b98b278] shrink_slab at ffffffff811ca9d8
    #14 [ffffc9002b98b338] shrink_node at ffffffff811cfd94
    #15 [ffffc9002b98b3b8] do_try_to_free_pages at ffffffff811cfe63
    #16 [ffffc9002b98b408] try_to_free_pages at ffffffff811d01c4
    #17 [ffffc9002b98b488] __alloc_pages_slowpath at ffffffff811be7f2
    #18 [ffffc9002b98b580] __alloc_pages_nodemask at ffffffff811bf3ed
    #19 [ffffc9002b98b5e0] new_slab at ffffffff81226304
    #20 [ffffc9002b98b638] ___slab_alloc at ffffffff812272ab
    #21 [ffffc9002b98b6f8] __slab_alloc at ffffffff8122740c
    #22 [ffffc9002b98b708] kmem_cache_alloc at ffffffff81227578
    #23 [ffffc9002b98b740] spl_kmem_cache_alloc at ffffffffa048a1fd [spl]
    #24 [ffffc9002b98b780] arc_buf_alloc_impl at ffffffffa0befba2 [zfs]
    #25 [ffffc9002b98b7b0] arc_read at ffffffffa0bf0924 [zfs]
    #26 [ffffc9002b98b858] dbuf_read at ffffffffa0bf9083 [zfs]
    #27 [ffffc9002b98b900] dmu_buf_hold_by_dnode at ffffffffa0c04869 [zfs]

Signed-off-by: Mark Roper <[email protected]>
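As a rough illustration (a sketch based on the commit description above, not the actual patch), the SPL guard pattern looks like this:

#include <sys/kmem.h>   /* SPL: fstrans_cookie_t, spl_fstrans_mark(); header location may vary by version */

/* Sketch only: protect an allocation-heavy section from re-entering
 * filesystem reclaim. spl_fstrans_mark() flags the current task so that
 * allocations made inside the section implicitly behave as GFP_NOFS;
 * spl_fstrans_unmark() restores the previous state. */
fstrans_cookie_t cookie = spl_fstrans_mark();
/* ... memory-hungry work, e.g. the arc_read() path shown above ... */
spl_fstrans_unmark(cookie);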
allanjude pushed a commit to KlaraSystems/zfs that referenced this issue Apr 28, 2020
sdimitro pushed a commit to sdimitro/zfs that referenced this issue Dec 7, 2021
Make 2 changes that are not strictly required but make sense given the
changes in the 2021 edition:

1. TryFrom/TryInto no longer need to be imported, as they are part of
the standard prelude.  Remove the redundant `use` statements.

2. When possible, closures capture members of local variables, rather
than the entire variable.  This can cause the non-captured members to be
dropped later than in the 2018 edition.  In our case (in
ZettaCache::open()), this doesn't really matter, but it brought to my
attention that we're cloning and then discarding a bunch of the members
of ZettaCache (without using them).  So I changed this to just clone the
few members that are actually needed by the closures.
This issue was closed.