
0.6.5.2 hangs on large rsync jobs #3867

Closed
rgmiller opened this issue Oct 1, 2015 · 16 comments

@rgmiller

rgmiller commented Oct 1, 2015

I'm afraid 0.6.5.2 still has problems with large rsync jobs (from BackupPC). My server isn't completely locked up; I can at least log in and poke around. All four cores (the CPU is a quad-core Xeon) are sitting at 100%, though. Access to the zpool that was being written to appears to be completely blocked (a simple 'ls' hung), but I can still read and write to a separate zpool on the same server.

This may be related to issue #3808. (It's the same computer running the same BackupPC jobs anyway.)

I'll paste the stack traces below.

@rgmiller
Author

rgmiller commented Oct 1, 2015

Stack traces for the hung processes (from dmesg):

[23617.581529] INFO: task BackupPC_dump:20068 blocked for more than 120 seconds.
[23617.581549] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23617.581566] BackupPC_dump D ffff88041e213680 0 20068 3636 0x00000080
[23617.581569] ffff8801fffd7e38 0000000000000082 ffff8801c383ad80 ffff8801fffd7fd8
[23617.581570] ffff8801fffd7fd8 ffff8801fffd7fd8 ffff8801c383ad80 ffff8801c383ad80
[23617.581571] ffff8804045238b8 ffff8804045238c0 ffffffff00000000 ffff8804045238c8
[23617.581573] Call Trace:
[23617.581578] [] schedule+0x29/0x70
[23617.581580] [] rwsem_down_write_failed+0x115/0x220
[23617.581583] [] call_rwsem_down_write_failed+0x13/0x20
[23617.581585] [] ? down_write+0x2d/0x30
[23617.581587] [] vm_munmap+0x33/0x60
[23617.581588] [] SyS_munmap+0x22/0x30
[23617.581590] [] system_call_fastpath+0x16/0x1b
[23737.465775] INFO: task BackupPC_dump:20068 blocked for more than 120 seconds.
[23737.465793] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23737.465811] BackupPC_dump D ffff88041e213680 0 20068 3636 0x00000080
[23737.465814] ffff8801fffd7e38 0000000000000082 ffff8801c383ad80 ffff8801fffd7fd8
[23737.465815] ffff8801fffd7fd8 ffff8801fffd7fd8 ffff8801c383ad80 ffff8801c383ad80
[23737.465817] ffff8804045238b8 ffff8804045238c0 ffffffff00000000 ffff8804045238c8
[23737.465818] Call Trace:
[23737.465823] [] schedule+0x29/0x70
[23737.465825] [] rwsem_down_write_failed+0x115/0x220
[23737.465828] [] call_rwsem_down_write_failed+0x13/0x20
[23737.465830] [] ? down_write+0x2d/0x30
[23737.465832] [] vm_munmap+0x33/0x60
[23737.465834] [] SyS_munmap+0x22/0x30
[23737.465836] [] system_call_fastpath+0x16/0x1b
[25295.964572] INFO: task BackupPC_dump:20135 blocked for more than 120 seconds.
[25295.964591] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25295.964609] BackupPC_dump D ffff88041e313680 0 20135 20072 0x00000080
[25295.964612] ffff88015025fb70 0000000000000082 ffff880024778000 ffff88015025ffd8
[25295.964613] ffff88015025ffd8 ffff88015025ffd8 ffff880024778000 ffff8801f308fa88
[25295.964615] ffff8801f308fa8c ffff880024778000 00000000ffffffff ffff8801f308fa90
[25295.964617] Call Trace:
[25295.964622] [] schedule_preempt_disabled+0x29/0x70
[25295.964624] [] __mutex_lock_slowpath+0xc5/0x1c0
[25295.964625] [] mutex_lock+0x1f/0x2f
[25295.964627] [] lookup_slow+0x33/0xa7
[25295.964630] [] link_path_walk+0x80f/0x8b0
[25295.964632] [] ? __d_lookup+0x120/0x160
[25295.964634] [] path_lookupat+0x6b/0x7a0
[25295.964636] [] ? kmem_cache_alloc+0x35/0x1d0
[25295.964637] [] ? getname_flags+0x4f/0x1a0
[25295.964638] [] filename_lookup+0x2b/0xc0
[25295.964640] [] user_path_at_empty+0x67/0xc0
[25295.964643] [] ? from_kgid_munged+0x12/0x20
[25295.964645] [] ? cp_new_stat+0x14f/0x180
[25295.964646] [] user_path_at+0x11/0x20
[25295.964647] [] vfs_fstatat+0x63/0xc0
[25295.964649] [] SYSC_newstat+0x2e/0x60
[25295.964651] [] ? __audit_syscall_exit+0x1e6/0x280
[25295.964652] [] SyS_newstat+0xe/0x10
[25295.964654] [] system_call_fastpath+0x16/0x1b
[25645.496136] perf interrupt took too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[26734.576014] INFO: task spl_dynamic_tas:123 blocked for more than 120 seconds.
[26734.576040] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26734.576058] spl_dynamic_tas D ffff88041e313680 0 123 2 0x00000000
[26734.576061] ffff88040404bb70 0000000000000046 ffff880404692d80 ffff88040404bfd8
[26734.576063] ffff88040404bfd8 ffff88040404bfd8 ffff880404692d80 ffff88040404bcd8
[26734.576064] ffff88040404bce0 7fffffffffffffff ffff880404692d80 ffff880406b92500
[26734.576066] Call Trace:
[26734.576071] [] schedule+0x29/0x70
[26734.576073] [] schedule_timeout+0x209/0x2d0
[26734.576077] [] ? check_preempt_curr+0x85/0xa0
[26734.576078] [] ? ttwu_do_wakeup+0x19/0xd0
[26734.576080] [] wait_for_completion+0x116/0x170
[26734.576082] [] ? wake_up_state+0x20/0x20
[26734.576101] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[26734.576104] [] kthread_create_on_node+0xa8/0x140
[26734.576107] [] ? string.isra.6+0x3b/0xf0
[26734.576113] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[26734.576120] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[26734.576125] [] spl_kthread_create+0x9c/0xf0 [spl]
[26734.576130] [] taskq_thread_create+0x6b/0x110 [spl]
[26734.576135] [] taskq_thread_spawn_task+0x12/0x30 [spl]
[26734.576139] [] taskq_thread+0x21e/0x420 [spl]
[26734.576141] [] ? wake_up_state+0x20/0x20
[26734.576145] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[26734.576146] [] kthread+0xcf/0xe0
[26734.576148] [] ? kthread_create_on_node+0x140/0x140
[26734.576150] [] ret_from_fork+0x58/0x90
[26734.576151] [] ? kthread_create_on_node+0x140/0x140
[26734.576174] INFO: task kworker/1:0:12006 blocked for more than 120 seconds.
[26734.576190] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26734.576207] kworker/1:0 D ffff88041e293680 0 12006 2 0x00000080
[26734.576212] ffff88000fcbfbf0 0000000000000046 ffff880185d5ad80 ffff88000fcbffd8
[26734.576213] ffff88000fcbffd8 ffff88000fcbffd8 ffff880185d5ad80 ffff88000fcbfd58
[26734.576214] ffff88000fcbfd60 7fffffffffffffff ffff880185d5ad80 ffff88000fcbfdd8
[26734.576216] Call Trace:
[26734.576218] [] schedule+0x29/0x70
[26734.576220] [] schedule_timeout+0x209/0x2d0
[26734.576221] [] ? sched_clock_cpu+0xb5/0x100
[26734.576223] [] ? check_preempt_curr+0x75/0xa0
[26734.576225] [] ? ttwu_do_wakeup+0x19/0xd0
[26734.576226] [] wait_for_completion+0x116/0x170
[26734.576227] [] ? wake_up_state+0x20/0x20
[26734.576229] [] kthread_create_on_node+0xa8/0x140
[26734.576231] [] ? rescuer_thread+0x400/0x400
[26734.576233] [] create_worker+0xea/0x250
[26734.576234] [] manage_workers.isra.25+0xf6/0x2d0
[26734.576236] [] worker_thread+0x339/0x400
[26734.576237] [] ? rescuer_thread+0x400/0x400
[26734.576238] [] kthread+0xcf/0xe0
[26734.576240] [] ? kthread_create_on_node+0x140/0x140
[26734.576241] [] ret_from_fork+0x58/0x90
[26734.576243] [] ? kthread_create_on_node+0x140/0x140
[30091.338924] INFO: task spl_dynamic_tas:123 blocked for more than 120 seconds.
[30091.338946] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[30091.338965] spl_dynamic_tas D ffff88041e213680 0 123 2 0x00000000
[30091.338967] ffff88040404bb70 0000000000000046 ffff880404692d80 ffff88040404bfd8
[30091.338969] ffff88040404bfd8 ffff88040404bfd8 ffff880404692d80 ffff88040404bcd8
[30091.338971] ffff88040404bce0 7fffffffffffffff ffff880404692d80 ffff880406b92500
[30091.338972] Call Trace:
[30091.338977] [] schedule+0x29/0x70
[30091.338980] [] schedule_timeout+0x209/0x2d0
[30091.338982] [] ? sched_clock_cpu+0xb5/0x100
[30091.338985] [] ? check_preempt_curr+0x75/0xa0
[30091.338987] [] ? ttwu_do_wakeup+0x19/0xd0
[30091.338989] [] wait_for_completion+0x116/0x170
[30091.338990] [] ? wake_up_state+0x20/0x20
[30091.339010] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30091.339013] [] kthread_create_on_node+0xa8/0x140
[30091.339016] [] ? string.isra.6+0x3b/0xf0
[30091.339022] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30091.339029] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30091.339035] [] spl_kthread_create+0x9c/0xf0 [spl]
[30091.339041] [] taskq_thread_create+0x6b/0x110 [spl]
[30091.339046] [] taskq_thread_spawn_task+0x12/0x30 [spl]
[30091.339052] [] taskq_thread+0x21e/0x420 [spl]
[30091.339053] [] ? wake_up_state+0x20/0x20
[30091.339059] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30091.339060] [] kthread+0xcf/0xe0
[30091.339062] [] ? kthread_create_on_node+0x140/0x140
[30091.339064] [] ret_from_fork+0x58/0x90
[30091.339066] [] ? kthread_create_on_node+0x140/0x140
[30211.223201] INFO: task spl_dynamic_tas:123 blocked for more than 120 seconds.
[30211.223220] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[30211.223239] spl_dynamic_tas D ffff88041e213680 0 123 2 0x00000000
[30211.223242] ffff88040404bb70 0000000000000046 ffff880404692d80 ffff88040404bfd8
[30211.223244] ffff88040404bfd8 ffff88040404bfd8 ffff880404692d80 ffff88040404bcd8
[30211.223245] ffff88040404bce0 7fffffffffffffff ffff880404692d80 ffff880406b92500
[30211.223247] Call Trace:
[30211.223252] [] schedule+0x29/0x70
[30211.223254] [] schedule_timeout+0x209/0x2d0
[30211.223257] [] ? sched_clock_cpu+0xb5/0x100
[30211.223259] [] ? check_preempt_curr+0x75/0xa0
[30211.223261] [] ? ttwu_do_wakeup+0x19/0xd0
[30211.223263] [] wait_for_completion+0x116/0x170
[30211.223264] [] ? wake_up_state+0x20/0x20
[30211.223283] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30211.223285] [] kthread_create_on_node+0xa8/0x140
[30211.223288] [] ? string.isra.6+0x3b/0xf0
[30211.223293] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30211.223298] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30211.223303] [] spl_kthread_create+0x9c/0xf0 [spl]
[30211.223307] [] taskq_thread_create+0x6b/0x110 [spl]
[30211.223312] [] taskq_thread_spawn_task+0x12/0x30 [spl]
[30211.223317] [] taskq_thread+0x21e/0x420 [spl]
[30211.223318] [] ? wake_up_state+0x20/0x20
[30211.223322] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30211.223324] [] kthread+0xcf/0xe0
[30211.223325] [] ? kthread_create_on_node+0x140/0x140
[30211.223328] [] ret_from_fork+0x58/0x90
[30211.223329] [] ? kthread_create_on_node+0x140/0x140
[30331.108219] INFO: task spl_dynamic_tas:123 blocked for more than 120 seconds.
[30331.108240] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[30331.108257] spl_dynamic_tas D ffff88041e213680 0 123 2 0x00000000
[30331.108260] ffff88040404bb70 0000000000000046 ffff880404692d80 ffff88040404bfd8
[30331.108262] ffff88040404bfd8 ffff88040404bfd8 ffff880404692d80 ffff88040404bcd8
[30331.108263] ffff88040404bce0 7fffffffffffffff ffff880404692d80 ffff880406b92500
[30331.108265] Call Trace:
[30331.108270] [] schedule+0x29/0x70
[30331.108273] [] schedule_timeout+0x209/0x2d0
[30331.108275] [] ? sched_clock_cpu+0xb5/0x100
[30331.108278] [] ? check_preempt_curr+0x75/0xa0
[30331.108279] [] ? ttwu_do_wakeup+0x19/0xd0
[30331.108281] [] wait_for_completion+0x116/0x170
[30331.108283] [] ? wake_up_state+0x20/0x20
[30331.108304] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30331.108306] [] kthread_create_on_node+0xa8/0x140
[30331.108310] [] ? string.isra.6+0x3b/0xf0
[30331.108316] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30331.108323] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30331.108329] [] spl_kthread_create+0x9c/0xf0 [spl]
[30331.108333] [] taskq_thread_create+0x6b/0x110 [spl]
[30331.108338] [] taskq_thread_spawn_task+0x12/0x30 [spl]
[30331.108343] [] taskq_thread+0x21e/0x420 [spl]
[30331.108344] [] ? wake_up_state+0x20/0x20
[30331.108349] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[30331.108350] [] kthread+0xcf/0xe0
[30331.108352] [] ? kthread_create_on_node+0x140/0x140
[30331.108354] [] ret_from_fork+0x58/0x90
[30331.108355] [] ? kthread_create_on_node+0x140/0x140
[36085.587524] INFO: task kswapd0:46 blocked for more than 120 seconds.
[36085.587544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[36085.587562] kswapd0 D ffff88041e213680 0 46 2 0x00000000
[36085.587565] ffff88040445b790 0000000000000046 ffff8804083fdb00 ffff88040445bfd8
[36085.587567] ffff88040445bfd8 ffff88040445bfd8 ffff8804083fdb00 ffff88040194a9f0
[36085.587568] ffff88040194a9f4 ffff8804083fdb00 00000000ffffffff ffff88040194a9f8
[36085.587570] Call Trace:
[36085.587576] [] schedule_preempt_disabled+0x29/0x70
[36085.587578] [] __mutex_lock_slowpath+0xc5/0x1c0
[36085.587580] [] mutex_lock+0x1f/0x2f
[36085.587616] [] arc_release+0x301/0x490 [zfs]
[36085.587633] [] dbuf_dirty+0x7c0/0x980 [zfs]
[36085.587646] [] dmu_buf_will_dirty+0x56/0xa0 [zfs]
[36085.587669] [] zap_get_leaf_byblk+0x132/0x2c0 [zfs]
[36085.587681] [] ? dmu_buf_rele+0xe/0x10 [zfs]
[36085.587700] [] ? zap_idx_to_blk+0x103/0x180 [zfs]
[36085.587718] [] zap_deref_leaf+0x7a/0xa0 [zfs]
[36085.587734] [] fzap_remove+0x3f/0xb0 [zfs]
[36085.587751] [] ? zap_name_alloc+0x73/0xd0 [zfs]
[36085.587768] [] zap_remove_norm+0x17b/0x1e0 [zfs]
[36085.587783] [] zap_remove+0x13/0x20 [zfs]
[36085.587800] [] zap_remove_int+0x54/0x80 [zfs]
[36085.587816] [] zfs_rmnode+0x224/0x350 [zfs]
[36085.587819] [] ? mutex_lock+0x12/0x2f
[36085.587836] [] zfs_zinactive+0x168/0x180 [zfs]
[36085.587852] [] zfs_inactive+0x67/0x240 [zfs]
[36085.587855] [] ? truncate_pagecache+0x59/0x60
[36085.587871] [] zpl_evict_inode+0x43/0x60 [zfs]
[36085.587875] [] evict+0xa7/0x170
[36085.587877] [] dispose_list+0x3e/0x50
[36085.587878] [] prune_icache_sb+0x163/0x320
[36085.587880] [] prune_super+0x16b/0x1a0
[36085.587882] [] shrink_slab+0x165/0x300
[36085.587884] [] ? vmpressure+0x21/0x90
[36085.587886] [] balance_pgdat+0x4b1/0x5e0
[36085.587887] [] kswapd+0x173/0x450
[36085.587890] [] ? wake_up_bit+0x30/0x30
[36085.587892] [] ? balance_pgdat+0x5e0/0x5e0
[36085.587893] [] kthread+0xcf/0xe0
[36085.587894] [] ? kthread_create_on_node+0x140/0x140
[36085.587897] [] ret_from_fork+0x58/0x90
[36085.587898] [] ? kthread_create_on_node+0x140/0x140
[36085.587912] INFO: task txg_quiesce:1089 blocked for more than 120 seconds.
[36085.587929] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[36085.587946] txg_quiesce D ffff88041e213680 0 1089 2 0x00000000
[36085.587948] ffff8803fcbe7d38 0000000000000046 ffff880406cde660 ffff8803fcbe7fd8
[36085.587949] ffff8803fcbe7fd8 ffff8803fcbe7fd8 ffff880406cde660 ffff8803fcc62148
[36085.587951] ffff8803fcc62038 ffff8803fcc62150 ffff8803fcc62060 0000000000000000
[36085.587952] Call Trace:
[36085.587954] [] schedule+0x29/0x70
[36085.587961] [] cv_wait_common+0x125/0x150 [spl]
[36085.587963] [] ? wake_up_bit+0x30/0x30
[36085.587968] [] __cv_wait+0x15/0x20 [spl]
[36085.587984] [] txg_quiesce_thread+0x2eb/0x400 [zfs]
[36085.588000] [] ? txg_init+0x2b0/0x2b0 [zfs]
[36085.588005] [] thread_generic_wrapper+0x71/0x80 [spl]
[36085.588009] [] ? __thread_exit+0x20/0x20 [spl]
[36085.588010] [] kthread+0xcf/0xe0
[36085.588012] [] ? kthread_create_on_node+0x140/0x140
[36085.588014] [] ret_from_fork+0x58/0x90
[36085.588015] [] ? kthread_create_on_node+0x140/0x140
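For reference, these reports come from the kernel's hung-task detector and can be pulled from the kernel log at any time; the sysrq step additionally dumps every task currently stuck in uninterruptible sleep. This is only a sketch: it assumes a root shell and that magic SysRq is enabled (kernel.sysrq).

# pull the hung-task reports the detector has already logged
dmesg | grep -A 40 'blocked for more than'

# ask the kernel to dump the stacks of all blocked (D-state) tasks,
# then read them back out of the log
echo w > /proc/sysrq-trigger
dmesg | tail -n 300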

@behlendorf
Contributor

@rgmiller sorry to hear that. It appears that you're still able to tickle a case where dynamically creating threads can wedge the system when there's significant memory pressure. Please set spl_taskq_thread_dynamic=0 and let us know if that resolves the issue.
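For anyone else following along, a sketch of how that option can be set as root. The sysfs path assumes the standard ZoL packaging, whether the running module accepts the change at runtime may vary (a module reload or reboot is the safe route), and the /etc/modprobe.d/spl.conf filename is just a convention.

# check the current value
cat /sys/module/spl/parameters/spl_taskq_thread_dynamic

# try to change it on the running module (may not take effect until reload)
echo 0 > /sys/module/spl/parameters/spl_taskq_thread_dynamic

# make it persistent for the next module load / reboot
echo 'options spl spl_taskq_thread_dynamic=0' >> /etc/modprobe.d/spl.conf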

@behlendorf behlendorf added this to the 0.7.0 milestone Oct 2, 2015
@rgmiller
Author

rgmiller commented Oct 3, 2015

OK. I've set spl_taskq_thread_dynamic to 0. I'll post a follow-up after the server has run for a couple of days (or if it hangs).

@rgmiller
Author

rgmiller commented Oct 5, 2015

OK, the server has been running without major issues for 2 days now, so it appears that setting spl_taskq_thread_dynamic=0 is a reasonable workaround for now.

FWIW: things aren't perfect. I've got some anecdotal evidence that other processes can become starved for I/O when the BackupPC jobs are running. Specifically, I have a streaming music server on that machine that reads from a different zpool. I don't have any stack traces because nothing crashed, but it was definitely stuck in an I/O Wait state long enough for the individual players to play through their internal buffers and then stop. Both times this happened, it was right after a BackupPC job started.
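One rough way to watch for this kind of starvation while a backup runs (a sketch; iostat comes from the sysstat package):

# per-pool and per-vdev throughput/IOPS for all pools, every 5 seconds
zpool iostat -v 5

# per-device utilization and average wait times
iostat -x 5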

@behlendorf
Contributor

@rgmiller thanks for following up, this is exactly the kind of feedback we need. My suggestion would be to stick with spl_taskq_thread_dynamic=0 for now; it may become the default value in a future point release if we can't resolve the core issue with on-demand thread creation that you're seeing.

@mihaivint

Not sure if this helps, but the initial problem was also linked to my issue #3834. I'm still on 0.6.5.1 and managed to work around it by setting vm.min_free_kbytes to 1048576. I didn't try a lower value.
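For reference, that tunable can be changed on the fly with sysctl and made persistent in /etc/sysctl.conf (a sketch, run as root; the value is the one from the comment above):

# raise the amount of memory the kernel tries to keep free
sysctl -w vm.min_free_kbytes=1048576

# persist it across reboots
echo 'vm.min_free_kbytes = 1048576' >> /etc/sysctl.conf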

@rgmiller
Author

Just a quick follow-up: ZFS locked up again tonight. I'm actually running 0.6.5.3 at this point (with spl_taskq_thread_dynamic set to 0). It had been running fine since I set spl_taskq_thread_dynamic manually after @behlendorf suggested it. So it appears we've still got issues; they're just much less likely. dmesg text pasted below.

[430985.003872] INFO: task z_wr_iss:2300 blocked for more than 120 seconds.
[430985.003893] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[430985.003912] z_wr_iss D ffff88041e213680 0 2300 2 0x00000000
[430985.003915] ffff88040035baf0 0000000000000046 ffff88040705cfa0 ffff88040035bfd8
[430985.003917] ffff88040035bfd8 ffff88040035bfd8 ffff88040705cfa0 ffff8803f99d09d0
[430985.003918] ffff8803f99d09d4 ffff88040705cfa0 00000000ffffffff ffff8803f99d09d8
[430985.003920] Call Trace:
[430985.003925] [] schedule_preempt_disabled+0x29/0x70
[430985.003927] [] __mutex_lock_slowpath+0xc5/0x1c0
[430985.003929] [] mutex_lock+0x1f/0x2f
[430985.003969] [] vdev_queue_io+0x88/0x200 [zfs]
[430985.003992] [] ? zio_create+0x3fa/0x500 [zfs]
[430985.004012] [] zio_vdev_io_start+0x187/0x2e0 [zfs]
[430985.004031] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.004051] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.004069] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.004087] [] ? zio_push_transform+0x39/0x90 [zfs]
[430985.004107] [] zio_vdev_io_start+0x9f/0x2e0 [zfs]
[430985.004123] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.004141] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.004157] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.004174] [] zio_vdev_io_start+0x1dd/0x2e0 [zfs]
[430985.004189] [] zio_execute+0xc8/0x180 [zfs]
[430985.004196] [] taskq_thread+0x21e/0x420 [spl]
[430985.004199] [] ? wake_up_state+0x20/0x20
[430985.004203] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[430985.004205] [] kthread+0xcf/0xe0
[430985.004206] [] ? kthread_create_on_node+0x140/0x140
[430985.004209] [] ret_from_fork+0x58/0x90
[430985.004210] [] ? kthread_create_on_node+0x140/0x140
[430985.004212] INFO: task z_wr_iss:2302 blocked for more than 120 seconds.
[430985.004228] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[430985.004246] z_wr_iss D ffff88041e293680 0 2302 2 0x00000000
[430985.004248] ffff88040034baf0 0000000000000046 ffff88040705f1c0 ffff88040034bfd8
[430985.004250] ffff88040034bfd8 ffff88040034bfd8 ffff88040705f1c0 ffff8803f99d09d0
[430985.004251] ffff8803f99d09d4 ffff88040705f1c0 00000000ffffffff ffff8803f99d09d8
[430985.004252] Call Trace:
[430985.004255] [] schedule_preempt_disabled+0x29/0x70
[430985.004256] [] __mutex_lock_slowpath+0xc5/0x1c0
[430985.004258] [] mutex_lock+0x1f/0x2f
[430985.004274] [] vdev_queue_io+0x88/0x200 [zfs]
[430985.004290] [] ? zio_create+0x3fa/0x500 [zfs]
[430985.004305] [] zio_vdev_io_start+0x187/0x2e0 [zfs]
[430985.004319] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.004335] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.004350] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.004365] [] ? zio_push_transform+0x39/0x90 [zfs]
[430985.004379] [] zio_vdev_io_start+0x9f/0x2e0 [zfs]
[430985.004393] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.004408] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.004423] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.004438] [] zio_vdev_io_start+0x1dd/0x2e0 [zfs]
[430985.004452] [] zio_execute+0xc8/0x180 [zfs]
[430985.004456] [] taskq_thread+0x21e/0x420 [spl]
[430985.004458] [] ? wake_up_state+0x20/0x20
[430985.004462] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[430985.004464] [] kthread+0xcf/0xe0
[430985.004465] [] ? kthread_create_on_node+0x140/0x140
[430985.004467] [] ret_from_fork+0x58/0x90
[430985.004468] [] ? kthread_create_on_node+0x140/0x140
[430985.004485] INFO: task txg_sync:2586 blocked for more than 120 seconds.
[430985.004501] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[430985.004519] txg_sync D ffff88041e293680 0 2586 2 0x00000000
[430985.004520] ffff8803f8e97b90 0000000000000046 ffff8804006fa220 ffff8803f8e97fd8
[430985.004521] ffff8803f8e97fd8 ffff8803f8e97fd8 ffff8804006fa220 ffff88041e293f48
[430985.004523] ffff880366f97088 ffff880366f970c8 ffff880366f970b0 0000000000000001
[430985.004524] Call Trace:
[430985.004526] [] io_schedule+0x9d/0x130
[430985.004531] [] cv_wait_common+0xae/0x150 [spl]
[430985.004533] [] ? wake_up_bit+0x30/0x30
[430985.004538] [] __cv_wait_io+0x18/0x20 [spl]
[430985.004552] [] zio_wait+0x123/0x210 [zfs]
[430985.004566] [] dsl_pool_sync+0xbf/0x420 [zfs]
[430985.004582] [] spa_sync+0x388/0xb70 [zfs]
[430985.004583] [] ? autoremove_wake_function+0x2b/0x40
[430985.004600] [] txg_sync_thread+0x3cc/0x640 [zfs]
[430985.004615] [] ? txg_fini+0x2a0/0x2a0 [zfs]
[430985.004620] [] thread_generic_wrapper+0x71/0x80 [spl]
[430985.004624] [] ? __thread_exit+0x20/0x20 [spl]
[430985.004625] [] kthread+0xcf/0xe0
[430985.004627] [] ? kthread_create_on_node+0x140/0x140
[430985.004629] [] ret_from_fork+0x58/0x90
[430985.004630] [] ? kthread_create_on_node+0x140/0x140
[430985.004658] INFO: task BackupPC_trashC:5245 blocked for more than 120 seconds.
[430985.004675] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[430985.004693] BackupPC_trashC D ffff88041e393680 0 5245 5241 0x00000080
[430985.004695] ffff8803db4f3530 0000000000000082 ffff8803f98bc440 ffff8803db4f3fd8
[430985.004696] ffff8803db4f3fd8 ffff8803db4f3fd8 ffff8803f98bc440 ffff8803f99d59d0
[430985.004697] ffff8803f99d59d4 ffff8803f98bc440 00000000ffffffff ffff8803f99d59d8
[430985.004699] Call Trace:
[430985.004701] [] schedule_preempt_disabled+0x29/0x70
[430985.004702] [] __mutex_lock_slowpath+0xc5/0x1c0
[430985.004704] [] mutex_lock+0x1f/0x2f
[430985.004720] [] vdev_queue_io+0x88/0x200 [zfs]
[430985.004736] [] zio_vdev_io_start+0x187/0x2e0 [zfs]
[430985.004752] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.004767] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.004782] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.004798] [] ? zio_push_transform+0x39/0x90 [zfs]
[430985.004813] [] zio_vdev_io_start+0x9f/0x2e0 [zfs]
[430985.004828] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.004843] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.004858] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.004872] [] zio_vdev_io_start+0x1dd/0x2e0 [zfs]
[430985.004887] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.004896] [] arc_read+0x603/0xad0 [zfs]
[430985.004907] [] dbuf_read+0x2ad/0x930 [zfs]
[430985.004919] [] dmu_buf_hold+0x50/0x80 [zfs]
[430985.004934] [] zap_get_leaf_byblk+0x5c/0x2c0 [zfs]
[430985.004944] [] ? dmu_buf_rele+0xe/0x10 [zfs]
[430985.004958] [] ? zap_idx_to_blk+0x103/0x180 [zfs]
[430985.004973] [] zap_deref_leaf+0x7a/0xa0 [zfs]
[430985.004988] [] fzap_add_cd+0x55/0x140 [zfs]
[430985.004993] [] ? spl_kmem_alloc+0xc7/0x170 [spl]
[430985.005007] [] fzap_add+0x6a/0x80 [zfs]
[430985.005022] [] zap_add+0x126/0x180 [zfs]
[430985.005037] [] zap_add_int+0x68/0x90 [zfs]
[430985.005053] [] zfs_unlinked_add+0x3c/0x90 [zfs]
[430985.005067] [] zfs_link_destroy+0x54d/0x5a0 [zfs]
[430985.005083] [] zfs_rmdir+0x499/0x650 [zfs]
[430985.005098] [] zpl_rmdir+0x56/0x80 [zfs]
[430985.005103] [] vfs_rmdir+0xa8/0x100
[430985.005105] [] do_rmdir+0x1a5/0x200
[430985.005107] [] ? mntput_no_expire+0x3e/0x120
[430985.005109] [] SyS_rmdir+0x16/0x20
[430985.005111] [] system_call_fastpath+0x16/0x1b
[430985.005117] INFO: task BackupPC_dump:5147 blocked for more than 120 seconds.
[430985.005133] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[430985.005151] BackupPC_dump D ffff88041e313680 0 5147 5082 0x00000080
[430985.005153] ffff880070aff670 0000000000000082 ffff8800c3814fa0 ffff880070afffd8
[430985.005154] ffff880070afffd8 ffff880070afffd8 ffff8800c3814fa0 ffff8803f99d09d0
[430985.005156] ffff8803f99d09d4 ffff8800c3814fa0 00000000ffffffff ffff8803f99d09d8
[430985.005157] Call Trace:
[430985.005159] [] schedule_preempt_disabled+0x29/0x70
[430985.005161] [] __mutex_lock_slowpath+0xc5/0x1c0
[430985.005162] [] mutex_lock+0x1f/0x2f
[430985.005178] [] vdev_queue_io+0x88/0x200 [zfs]
[430985.005193] [] ? zio_create+0x3fa/0x500 [zfs]
[430985.005208] [] zio_vdev_io_start+0x187/0x2e0 [zfs]
[430985.005223] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.005238] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.005253] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.005268] [] ? zio_push_transform+0x39/0x90 [zfs]
[430985.005283] [] zio_vdev_io_start+0x9f/0x2e0 [zfs]
[430985.005297] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.005312] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[430985.005326] [] ? vdev_config_sync+0x140/0x140 [zfs]
[430985.005341] [] zio_vdev_io_start+0x1dd/0x2e0 [zfs]
[430985.005363] [] zio_nowait+0xc6/0x1b0 [zfs]
[430985.005372] [] arc_read+0x603/0xad0 [zfs]
[430985.005383] [] dbuf_prefetch+0x1de/0x2e0 [zfs]
[430985.005394] [] dmu_zfetch_dofetch.isra.6+0x114/0x180 [zfs]
[430985.005406] [] dmu_zfetch+0x4ef/0xe00 [zfs]
[430985.005415] [] dbuf_read+0x3f7/0x930 [zfs]
[430985.005425] [] dmu_buf_hold_array_by_dnode+0x123/0x4d0 [zfs]
[430985.005437] [] dmu_read_uio_dnode+0x52/0x100 [zfs]
[430985.005447] [] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[430985.005462] [] zfs_read+0x13d/0x400 [zfs]
[430985.005477] [] zpl_read_common_iovec.constprop.10+0x80/0xd0 [zfs]
[430985.005491] [] zpl_read+0x7e/0xb0 [zfs]
[430985.005493] [] vfs_read+0x9c/0x170
[430985.005495] [] SyS_read+0x58/0xb0
[430985.005497] [] system_call_fastpath+0x16/0x1b
[431104.889511] INFO: task txg_sync:2586 blocked for more than 120 seconds.
[431104.889530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[431104.889549] txg_sync D ffff88041e293680 0 2586 2 0x00000000
[431104.889552] ffff8803f8e97b90 0000000000000046 ffff8804006fa220 ffff8803f8e97fd8
[431104.889554] ffff8803f8e97fd8 ffff8803f8e97fd8 ffff8804006fa220 ffff88041e293f48
[431104.889555] ffff880366f97088 ffff880366f970c8 ffff880366f970b0 0000000000000001
[431104.889557] Call Trace:
[431104.889562] [] io_schedule+0x9d/0x130
[431104.889581] [] cv_wait_common+0xae/0x150 [spl]
[431104.889584] [] ? wake_up_bit+0x30/0x30
[431104.889590] [] __cv_wait_io+0x18/0x20 [spl]
[431104.889617] [] zio_wait+0x123/0x210 [zfs]
[431104.889635] [] dsl_pool_sync+0xbf/0x420 [zfs]
[431104.889654] [] spa_sync+0x388/0xb70 [zfs]
[431104.889656] [] ? autoremove_wake_function+0x2b/0x40
[431104.889675] [] txg_sync_thread+0x3cc/0x640 [zfs]
[431104.889692] [] ? txg_fini+0x2a0/0x2a0 [zfs]
[431104.889697] [] thread_generic_wrapper+0x71/0x80 [spl]
[431104.889702] [] ? __thread_exit+0x20/0x20 [spl]
[431104.889703] [] kthread+0xcf/0xe0
[431104.889705] [] ? kthread_create_on_node+0x140/0x140
[431104.889707] [] ret_from_fork+0x58/0x90
[431104.889709] [] ? kthread_create_on_node+0x140/0x140
[431224.773846] INFO: task spl_kmem_cache:115 blocked for more than 120 seconds.
[431224.773866] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[431224.773885] spl_kmem_cache D ffff88041e213680 0 115 2 0x00000000
[431224.773887] ffff8800369636b0 0000000000000046 ffff880036958000 ffff880036963fd8
[431224.773889] ffff880036963fd8 ffff880036963fd8 ffff880036958000 ffff8800c7d8ab68
[431224.773891] ffff8800c7d8aa20 ffff8800c7d8ab70 ffff8800c7d8aa48 0000000000000000
[431224.773892] Call Trace:
[431224.773898] [] schedule+0x29/0x70
[431224.773917] [] cv_wait_common+0x125/0x150 [spl]
[431224.773921] [] ? wake_up_bit+0x30/0x30
[431224.773926] [] __cv_wait+0x15/0x20 [spl]
[431224.773953] [] txg_wait_open+0xc3/0x110 [zfs]
[431224.773969] [] dmu_tx_wait+0x3a8/0x3c0 [zfs]
[431224.773982] [] dmu_tx_assign+0x9a/0x510 [zfs]
[431224.774002] [] zfs_inactive+0xda/0x240 [zfs]
[431224.774004] [] ? truncate_pagecache+0x59/0x60
[431224.774022] [] zpl_evict_inode+0x43/0x60 [zfs]
[431224.774027] [] evict+0xa7/0x170
[431224.774029] [] dispose_list+0x3e/0x50
[431224.774030] [] prune_icache_sb+0x163/0x320
[431224.774032] [] prune_super+0x16b/0x1a0
[431224.774034] [] shrink_slab+0x165/0x300
[431224.774036] [] ? vmpressure+0x21/0x90
[431224.774038] [] do_try_to_free_pages+0x3c2/0x4e0
[431224.774039] [] try_to_free_pages+0xfc/0x180
[431224.774042] [] __alloc_pages_nodemask+0x7fd/0xb90
[431224.774045] [] alloc_pages_current+0xa9/0x170
[431224.774047] [] __vmalloc_node_range+0x15b/0x270
[431224.774052] [] ? spl_vmalloc+0x34/0x60 [spl]
[431224.774054] [] __vmalloc+0x41/0x50
[431224.774058] [] ? spl_vmalloc+0x34/0x60 [spl]
[431224.774062] [] spl_vmalloc+0x34/0x60 [spl]
[431224.774067] [] kv_alloc.isra.5+0x87/0x90 [spl]
[431224.774071] [] spl_cache_grow_work+0x4f/0x2d0 [spl]
[431224.774075] [] taskq_thread+0x21e/0x420 [spl]
[431224.774077] [] ? wake_up_state+0x20/0x20
[431224.774082] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[431224.774084] [] kthread+0xcf/0xe0
[431224.774085] [] ? kthread_create_on_node+0x140/0x140
[431224.774088] [] ret_from_fork+0x58/0x90
[431224.774089] [] ? kthread_create_on_node+0x140/0x140
[431224.774090] INFO: task spl_kmem_cache:117 blocked for more than 120 seconds.
[431224.774108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[431224.774126] spl_kmem_cache D ffff88041e313680 0 117 2 0x00000000
[431224.774128] ffff88003696b6b0 0000000000000046 ffff8800369596c0 ffff88003696bfd8
[431224.774129] ffff88003696bfd8 ffff88003696bfd8 ffff8800369596c0 ffff8800c7d8ab68
[431224.774130] ffff8800c7d8aa20 ffff8800c7d8ab70 ffff8800c7d8aa48 0000000000000000
[431224.774132] Call Trace:
[431224.774134] [] schedule+0x29/0x70
[431224.774139] [] cv_wait_common+0x125/0x150 [spl]
[431224.774141] [] ? wake_up_bit+0x30/0x30
[431224.774145] [] __cv_wait+0x15/0x20 [spl]
[431224.774163] [] txg_wait_open+0xc3/0x110 [zfs]
[431224.774176] [] dmu_tx_wait+0x3a8/0x3c0 [zfs]
[431224.774189] [] dmu_tx_assign+0x9a/0x510 [zfs]
[431224.774206] [] zfs_inactive+0xda/0x240 [zfs]
[431224.774208] [] ? truncate_pagecache+0x59/0x60
[431224.774224] [] zpl_evict_inode+0x43/0x60 [zfs]
[431224.774227] [] evict+0xa7/0x170
[431224.774229] [] dispose_list+0x3e/0x50
[431224.774231] [] prune_icache_sb+0x163/0x320
[431224.774232] [] prune_super+0x16b/0x1a0
[431224.774234] [] shrink_slab+0x165/0x300
[431224.774235] [] ? vmpressure+0x21/0x90
[431224.774237] [] do_try_to_free_pages+0x3c2/0x4e0
[431224.774239] [] try_to_free_pages+0xfc/0x180
[431224.774241] [] __alloc_pages_nodemask+0x7fd/0xb90
[431224.774243] [] alloc_pages_current+0xa9/0x170
[431224.774245] [] __vmalloc_node_range+0x15b/0x270
[431224.774249] [] ? spl_vmalloc+0x34/0x60 [spl]
[431224.774251] [] __vmalloc+0x41/0x50
[431224.774255] [] ? spl_vmalloc+0x34/0x60 [spl]
[431224.774259] [] spl_vmalloc+0x34/0x60 [spl]
[431224.774264] [] kv_alloc.isra.5+0x87/0x90 [spl]
[431224.774268] [] spl_cache_grow_work+0x4f/0x2d0 [spl]
[431224.774272] [] taskq_thread+0x21e/0x420 [spl]
[431224.774274] [] ? wake_up_state+0x20/0x20
[431224.774278] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[431224.774280] [] kthread+0xcf/0xe0
[431224.774281] [] ? kthread_create_on_node+0x140/0x140
[431224.774283] [] ret_from_fork+0x58/0x90
[431224.774284] [] ? kthread_create_on_node+0x140/0x140
[431224.774305] INFO: task z_wr_iss:2301 blocked for more than 120 seconds.
[431224.774321] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[431224.774340] z_wr_iss D ffff88041e393680 0 2301 2 0x00000000
[431224.774341] ffff88040035faf0 0000000000000046 ffff88040705ad80 ffff88040035ffd8
[431224.774342] ffff88040035ffd8 ffff88040035ffd8 ffff88040705ad80 ffff8803f99d09d0
[431224.774344] ffff8803f99d09d4 ffff88040705ad80 00000000ffffffff ffff8803f99d09d8
[431224.774345] Call Trace:
[431224.774347] [] schedule_preempt_disabled+0x29/0x70
[431224.774349] [] __mutex_lock_slowpath+0xc5/0x1c0
[431224.774351] [] mutex_lock+0x1f/0x2f
[431224.774368] [] vdev_queue_io+0x88/0x200 [zfs]
[431224.774385] [] ? zio_create+0x3fa/0x500 [zfs]
[431224.774401] [] zio_vdev_io_start+0x187/0x2e0 [zfs]
[431224.774417] [] zio_nowait+0xc6/0x1b0 [zfs]
[431224.774434] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[431224.774450] [] ? vdev_config_sync+0x140/0x140 [zfs]
[431224.774465] [] ? zio_push_transform+0x39/0x90 [zfs]
[431224.774484] [] zio_vdev_io_start+0x9f/0x2e0 [zfs]
[431224.774503] [] zio_nowait+0xc6/0x1b0 [zfs]
[431224.774522] [] vdev_mirror_io_start+0xa7/0x1a0 [zfs]
[431224.774541] [] ? vdev_config_sync+0x140/0x140 [zfs]
[431224.774560] [] zio_vdev_io_start+0x1dd/0x2e0 [zfs]
[431224.774579] [] zio_execute+0xc8/0x180 [zfs]
[431224.774584] [] taskq_thread+0x21e/0x420 [spl]
[431224.774586] [] ? wake_up_state+0x20/0x20
[431224.774591] [] ? taskq_thread_spawn+0x60/0x60 [spl]
[431224.774592] [] kthread+0xcf/0xe0
[431224.774594] [] ? kthread_create_on_node+0x140/0x140
[431224.774596] [] ret_from_fork+0x58/0x90
[431224.774598] [] ? kthread_create_on_node+0x140/0x140
[431224.774613] INFO: task txg_sync:2586 blocked for more than 120 seconds.
[431224.774633] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[431224.774656] txg_sync D ffff88041e293680 0 2586 2 0x00000000
[431224.774658] ffff8803f8e97b90 0000000000000046 ffff8804006fa220 ffff8803f8e97fd8
[431224.774659] ffff8803f8e97fd8 ffff8803f8e97fd8 ffff8804006fa220 ffff88041e293f48
[431224.774661] ffff880366f97088 ffff880366f970c8 ffff880366f970b0 0000000000000001
[431224.774663] Call Trace:
[431224.774665] [] io_schedule+0x9d/0x130
[431224.774670] [] cv_wait_common+0xae/0x150 [spl]
[431224.774672] [] ? wake_up_bit+0x30/0x30
[431224.774677] [] __cv_wait_io+0x18/0x20 [spl]
[431224.774696] [] zio_wait+0x123/0x210 [zfs]
[431224.774713] [] dsl_pool_sync+0xbf/0x420 [zfs]
[431224.774732] [] spa_sync+0x388/0xb70 [zfs]
[431224.774734] [] ? autoremove_wake_function+0x2b/0x40
[431224.774754] [] txg_sync_thread+0x3cc/0x640 [zfs]
[431224.774773] [] ? txg_fini+0x2a0/0x2a0 [zfs]
[431224.774779] [] thread_generic_wrapper+0x71/0x80 [spl]
[431224.774784] [] ? __thread_exit+0x20/0x20 [spl]
[431224.774785] [] kthread+0xcf/0xe0
[431224.774787] [] ? kthread_create_on_node+0x140/0x140
[431224.774789] [] ret_from_fork+0x58/0x90
[431224.774790] [] ? kthread_create_on_node+0x140/0x140

@rgmiller rgmiller reopened this Oct 26, 2015
@koptein koptein mentioned this issue Oct 30, 2015
@rgmiller
Author

rgmiller commented Nov 9, 2015

Just a quick note to say that I experienced another lockup over the weekend. (This is on 0.6.5.3 with spl_taskq_thread_dynamic set to 0.) I have the stack traces from dmesg, but they don't look too different from the ones I've already posted, so I'm not going to bother unless someone wants them.

@behlendorf
Contributor

@rgmiller sorry about the slow reply. The good news is that, thanks to the stacks you posted, I'm pretty sure I've identified the deadlock you're hitting. If I'm correct, you should be able to prevent it by setting the zfs_vdev_aggregation_limit module option to 16384 or lower while I work up a patch. This will disable aggregation for larger blocks, which is where this deadlock can occur due to a zio buffer allocation.
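A sketch of how that can be applied as root; zfs_vdev_aggregation_limit is exposed under /sys/module/zfs/parameters on a standard ZoL install, and the modprobe.d filename below is just a convention:

# check the current value
cat /sys/module/zfs/parameters/zfs_vdev_aggregation_limit

# lower it on the running module
echo 16384 > /sys/module/zfs/parameters/zfs_vdev_aggregation_limit

# keep the setting across reboots
echo 'options zfs zfs_vdev_aggregation_limit=16384' >> /etc/modprobe.d/zfs.conf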

@rgmiller
Author

Great. I'll give it a try this evening. What should I do about the spl_taskq_thread_dynamic setting? It's currently at 0. I assume I should turn it back on again, yes?

behlendorf added a commit to behlendorf/zfs that referenced this issue Dec 16, 2015
This deadlock may manifest itself in slightly different ways but
at the core it is caused by a memory allocation blocking on file-
system reclaim in the zio pipeline.  This is normally impossible
because zio_execute() disables filesystem reclaim by setting
PF_FSTRANS on the thread.  However, kmem cache allocations may
still indirectly block on file system reclaim while holding the
critical vq->vq_lock as shown below.

To resolve this issue zio_buf_alloc_flags() is introduced, which allows
allocation flags to be passed.  This can then be used in
vdev_queue_aggregate() with KM_NOSLEEP when allocating the
aggregate IO buffer.  Since aggregating the IO is purely a
performance optimization we want this to either succeed or fail
quickly.  Trying too hard to allocate this memory under the
vq->vq_lock can negatively impact performance and result in
this deadlock.

* z_wr_iss
zio_vdev_io_start
  vdev_queue_io -> Takes vq->vq_lock
    vdev_queue_io_to_issue
      vdev_queue_aggregate
        zio_buf_alloc -> Waiting on spl_kmem_cache process

* z_wr_int
zio_vdev_io_done
  vdev_queue_io_done
    mutex_lock -> Waiting on vq->vq_lock held by z_wr_iss

* txg_sync
spa_sync
  dsl_pool_sync
    zio_wait -> Waiting on zio being handled by z_wr_int

* spl_kmem_cache
spl_vmalloc
  ...
  evict
    ...
    zfs_inactive
      dmu_tx_wait
        txg_wait_open -> Waiting on txg_sync

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#3808
Issue openzfs#3867
@behlendorf
Contributor

@rgmiller you should be able to leave the default. For 0.6.5.3 that's spl_taskq_thread_dynamic=0; if you're running master it's spl_taskq_thread_dynamic=1, but a patch has been applied there which resolves the root cause correctly. Or, if you could verify the patch in #4111 without modifying zfs_vdev_aggregation_limit, that would be great.
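A quick way to double-check what is actually loaded and where the tunables sit before and after testing the patch (a sketch; the version files assume the standard ZoL module packaging):

# loaded module versions
cat /sys/module/spl/version
cat /sys/module/zfs/version

# current values of the two tunables discussed in this thread
cat /sys/module/spl/parameters/spl_taskq_thread_dynamic
cat /sys/module/zfs/parameters/zfs_vdev_aggregation_limit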

behlendorf added a commit to behlendorf/zfs that referenced this issue Dec 17, 2015
@rgmiller
Author

Just a quick update: I installed patch #4111 this evening and it's running now. The overnight backup jobs are what normally caused problems, so I'll leave another update after it's run for a couple more days.

@rgmiller
Author

The overnight backup jobs have run 3 times now without problems, so it looks like the patch is solid.

@behlendorf
Contributor

@rgmiller excellent news, thanks for following up.

behlendorf added a commit that referenced this issue Dec 24, 2015
This deadlock may manifest itself in slightly different ways but
at the core it is caused by a memory allocation blocking on file-
system reclaim in the zio pipeline.  This is normally impossible
because zio_execute() disables filesystem reclaim by setting
PF_FSTRANS on the thread.  However, kmem cache allocations may
still indirectly block on file system reclaim while holding the
critical vq->vq_lock as shown below.

To resolve this issue zio_buf_alloc_flags() is introduced, which allows
allocation flags to be passed.  This can then be used in
vdev_queue_aggregate() with KM_NOSLEEP when allocating the
aggregate IO buffer.  Since aggregating the IO is purely a
performance optimization we want this to either succeed or fail
quickly.  Trying too hard to allocate this memory under the
vq->vq_lock can negatively impact performance and result in
this deadlock.

* z_wr_iss
zio_vdev_io_start
  vdev_queue_io -> Takes vq->vq_lock
    vdev_queue_io_to_issue
      vdev_queue_aggregate
        zio_buf_alloc -> Waiting on spl_kmem_cache process

* z_wr_int
zio_vdev_io_done
  vdev_queue_io_done
    mutex_lock -> Waiting on vq->vq_lock held by z_wr_iss

* txg_sync
spa_sync
  dsl_pool_sync
    zio_wait -> Waiting on zio being handled by z_wr_int

* spl_kmem_cache
spl_cache_grow_work
  kv_alloc
    spl_vmalloc
      ...
      evict
        zpl_evict_inode
          zfs_inactive
            dmu_tx_wait
              txg_wait_open -> Waiting on txg_sync

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Closes #3808
Closes #3867
@blballa

blballa commented Jan 4, 2016

First, I want to thank everyone for their hard work. The code is awesome.
I can confirm that this bug exists on 0.6.5.3 whenever we put ZFS under heavy load, such as rsync or large file copies over Samba. We went ahead and pulled head from git, and we can confirm that we are no longer getting kernel panics requiring a hard reboot of the servers. It might make sense to release a 0.6.5.4 so that official packages can be built for Ubuntu, since right now, if you upgrade from the 0.6.4 train (which was stable), you will most likely hit deadlocks. Let me know if you need any more information about our setup or logs.
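For anyone who wants to run from git until 0.6.5.4 is packaged, the out-of-tree build looked roughly like the steps below at the time. This is only a sketch: kernel headers, autotools, and the usual build dependencies are assumed to be installed, spl must be built and installed before zfs, and zfs's configure may need --with-spl pointed at the spl source tree.

git clone https://github.com/zfsonlinux/spl.git
git clone https://github.com/zfsonlinux/zfs.git

cd spl && ./autogen.sh && ./configure && make && sudo make install && cd ..
cd zfs && ./autogen.sh && ./configure && make && sudo make install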

@behlendorf
Contributor

Thanks for the confirmation, a 0.6.5.4 release is definitely in the works.

ryao pushed a commit to ryao/zfs that referenced this issue Jan 4, 2016
goulvenriou pushed a commit to Alyseo/zfs that referenced this issue Jan 17, 2016
goulvenriou pushed a commit to Alyseo/zfs that referenced this issue Feb 3, 2016
goulvenriou pushed a commit to Alyseo/zfs that referenced this issue Feb 4, 2016
@behlendorf behlendorf modified the milestones: 0.6.5.4, 0.7.0 Mar 23, 2016