Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

long hung tasks #796

Closed
mooinglemur opened this issue Jun 23, 2012 · 6 comments
Closed

long hung tasks #796

mooinglemur opened this issue Jun 23, 2012 · 6 comments
Milestone

Comments

@mooinglemur
Copy link

Filing bug at request of ryao.

Kernel 3.4.1-vs2.3.3.4 with 3.4 patch from https://bugs.gentoo.org/show_bug.cgi?id=416685

Using git sources of both spl and zfs.

I rebooted into this kernel, and my scrub continued. I then intiated a zfs destroy -r on a fs with a few dozen snapshots, and then rm -r a large tree in a different fs on the same pool. I got some hung tasks for several minutes, but it eventually unfroze and continued.

pool: bkup0
state: ONLINE
scan: scrub in progress since Sun Jun 17 00:50:01 2012
1.59T scanned out of 10.9T at 1.57M/s, (scan is slow, no estimated time)
0 repaired, 14.58% done
config:

    NAME                STATE     READ WRITE CKSUM
    bkup0               ONLINE       0     0     0
      raidz2-0          ONLINE       0     0     0
        dm-name-zfs0_1  ONLINE       0     0     0
        dm-name-zfs0_2  ONLINE       0     0     0
        dm-name-zfs0_3  ONLINE       0     0     0
        dm-name-zfs0_4  ONLINE       0     0     0
        dm-name-zfs0_5  ONLINE       0     0     0
        dm-name-zfs0_6  ONLINE       0     0     0
        dm-name-zfs0_7  ONLINE       0     0     0
        dm-name-zfs0_8  ONLINE       0     0     0

errors: No known data errors

pool: bkup1
state: ONLINE
scan: resilvered 324G in 42h22m with 0 errors on Fri Jun 15 20:11:17 2012
config:

    NAME                                             STATE     READ WRITE CKSUM
    bkup1                                            ONLINE       0     0     0
      raidz1-0                                       ONLINE       0     0     0
        scsi-SATA_ST31000340AS_5QJ0LDCZ-part1        ONLINE       0     0     0
        scsi-SATA_ST31000340NS_9QJ0DJ64-part1        ONLINE       0     0     0
        scsi-SATA_ST31000340NS_3QJ09MHL-part1        ONLINE       0     0     0
        scsi-SATA_ST31000333AS_9TE1HKZ4-part1        ONLINE       0     0     0
        scsi-SATA_ST31000340AS_9QJ1HVQ9-part1        ONLINE       0     0     0
    spares
      scsi-SATA_SAMSUNG_HD103UJS13PJ1LS625210-part1  AVAIL   

errors: No known data errors

[ 599.525371] INFO: task txg_sync:6809 blocked for more than 120 seconds.
[ 599.525375] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 599.525379] txg_sync D ffff88053fcd2400 0 6809 2 0x00000000
[ 599.525387] ffff8805008ef9e0 0000000000000046 ffff880521e82b60 0000000000012400
[ 599.525395] ffff8805008effd8 ffff8805008ee000 0000000000012400 0000000000004000
[ 599.525401] ffff8805008effd8 0000000000012400 ffff88052a208ec0 ffff880521e82b60
[ 599.525408] Call Trace:
[ 599.525419] [] ? _raw_spin_unlock_irqrestore+0x2f/0x40
[ 599.525427] [] ? try_to_wake_up+0xce/0x2a0
[ 599.525433] [] schedule+0x24/0x70
[ 599.525443] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 599.525446] [] ? wake_up_bit+0x40/0x40
[ 599.525449] [] __cv_wait+0xe/0x10 [spl]
[ 599.525457] [] zio_wait+0xf3/0x1b0 [zfs]
[ 599.525465] [] dsl_dataset_destroy_sync+0xa7c/0x1080 [zfs]
[ 599.525468] [] ? __cv_timedwait+0xc5/0x1a0 [spl]
[ 599.525477] [] dsl_sync_task_group_sync+0x123/0x3d0 [zfs]
[ 599.525485] [] dsl_pool_sync+0x20b/0x490 [zfs]
[ 599.525496] [] spa_sync+0x3a2/0x10b0 [zfs]
[ 599.525504] [] txg_init+0x4cb/0x9b0 [zfs]
[ 599.525513] [] ? txg_init+0x210/0x9b0 [zfs]
[ 599.525515] [] ? __thread_create+0x310/0x3a0 [spl]
[ 599.525518] [] __thread_create+0x383/0x3a0 [spl]
[ 599.525520] [] ? __thread_create+0x310/0x3a0 [spl]
[ 599.525522] [] kthread+0x96/0xa0
[ 599.525525] [] kernel_thread_helper+0x4/0x10
[ 599.525527] [] ? retint_restore_args+0xe/0xe
[ 599.525529] [] ? kthread_worker_fn+0x140/0x140
[ 599.525531] [] ? gs_change+0xb/0xb
[ 599.525548] INFO: task zfs:10166 blocked for more than 120 seconds.
[ 599.525549] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 599.525550] zfs D ffff88053fc52400 0 10166 10125 0x00000000
[ 599.525552] ffff8804d717fbe8 0000000000000086 ffff880521e824c0 0000000000012400
[ 599.525555] ffff8804d717ffd8 ffff8804d717e000 0000000000012400 0000000000004000
[ 599.525557] ffff8804d717ffd8 0000000000012400 ffff88052a1380c0 ffff880521e824c0
[ 599.525560] Call Trace:
[ 599.525562] [] ? spl_debug_get_subsys+0x51/0x460 [spl]
[ 599.525565] [] ? spl_debug_msg+0x413/0x970 [spl]
[ 599.525567] [] schedule+0x24/0x70
[ 599.525569] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 599.525571] [] ? wake_up_bit+0x40/0x40
[ 599.525573] [] __cv_wait+0xe/0x10 [spl]
[ 599.525581] [] txg_wait_synced+0xb3/0x190 [zfs]
[ 599.525590] [] dsl_sync_task_group_wait+0x109/0x240 [zfs]
[ 599.525597] [] ? dsl_dataset_disown+0x240/0x240 [zfs]
[ 599.525604] [] ? dsl_destroy_inconsistent+0x1c0/0x1c0 [zfs]
[ 599.525612] [] dsl_sync_task_do+0x54/0x80 [zfs]
[ 599.525620] [] dsl_dataset_destroy+0x132/0x490 [zfs]
[ 599.525627] [] ? dsl_dataset_tryown+0x49/0x120 [zfs]
[ 599.525633] [] dmu_objset_destroy+0x36/0x40 [zfs]
[ 599.525639] [] zfs_unmount_snap+0x253/0x480 [zfs]
[ 599.525646] [] pool_status_check+0x196/0x270 [zfs]
[ 599.525648] [] do_vfs_ioctl+0x96/0x500
[ 599.525650] [] ? alloc_fd+0x47/0x140
[ 599.525652] [] ? trace_hardirqs_off_thunk+0x3a/0x6c
[ 599.525654] [] sys_ioctl+0x4a/0x80
[ 599.525656] [] system_call_fastpath+0x16/0x1b
[ 599.525658] INFO: task rm:10209 blocked for more than 120 seconds.
[ 599.525659] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 599.525660] rm D ffff88053fc52400 0 10209 10168 0x00000000
[ 599.525662] ffff8804dfff5b48 0000000000000086 ffff880515a5b7e0 0000000000012400
[ 599.525664] ffff8804dfff5fd8 ffff8804dfff4000 0000000000012400 0000000000004000
[ 599.525667] ffff8804dfff5fd8 0000000000012400 ffff88052a1380c0 ffff880515a5b7e0
[ 599.525669] Call Trace:
[ 599.525671] [] ? _raw_spin_unlock_irqrestore+0x2f/0x40
[ 599.525673] [] ? try_to_wake_up+0xce/0x2a0
[ 599.525675] [] schedule+0x24/0x70
[ 599.525677] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 599.525679] [] ? wake_up_bit+0x40/0x40
[ 599.525681] [] __cv_wait+0xe/0x10 [spl]
[ 599.525689] [] txg_wait_open+0x83/0x110 [zfs]
[ 599.525696] [] dmu_tx_wait+0xed/0xf0 [zfs]
[ 599.525703] [] dmu_tx_assign+0x86/0xe90 [zfs]
[ 599.525708] [] ? dmu_buf_rele+0x2b/0x40 [zfs]
[ 599.525714] [] zfs_rmnode+0x127/0x340 [zfs]
[ 599.525720] [] zfs_zinactive+0x8b/0x100 [zfs]
[ 599.525725] [] zfs_inactive+0x66/0x210 [zfs]
[ 599.525729] [] zpl_vap_init+0x603/0x6c0 [zfs]
[ 599.525732] [] evict+0xa6/0x1b0
[ 599.525734] [] iput+0x103/0x210
[ 599.525735] [] do_unlinkat+0x157/0x1c0
[ 599.525738] [] ? sys_newfstatat+0x2e/0x40
[ 599.525740] [] ? trace_hardirqs_on_thunk+0x3a/0x3c
[ 599.525742] [] sys_unlinkat+0x1d/0x40
[ 599.525743] [] system_call_fastpath+0x16/0x1b
[ 719.262968] INFO: task txg_sync:6809 blocked for more than 120 seconds.
[ 719.262973] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 719.262977] txg_sync D ffff88053fcd2400 0 6809 2 0x00000000
[ 719.262984] ffff8805008ef9e0 0000000000000046 ffff880521e82b60 0000000000012400
[ 719.262992] ffff8805008effd8 ffff8805008ee000 0000000000012400 0000000000004000
[ 719.262999] ffff8805008effd8 0000000000012400 ffff88052a208ec0 ffff880521e82b60
[ 719.263006] Call Trace:
[ 719.263016] [] ? _raw_spin_unlock_irqrestore+0x2f/0x40
[ 719.263024] [] ? try_to_wake_up+0xce/0x2a0
[ 719.263030] [] schedule+0x24/0x70
[ 719.263040] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 719.263046] [] ? wake_up_bit+0x40/0x40
[ 719.263053] [] __cv_wait+0xe/0x10 [spl]
[ 719.263069] [] zio_wait+0xf3/0x1b0 [zfs]
[ 719.263091] [] dsl_dataset_destroy_sync+0xa7c/0x1080 [zfs]
[ 719.263099] [] ? __cv_timedwait+0xc5/0x1a0 [spl]
[ 719.263115] [] dsl_sync_task_group_sync+0x123/0x3d0 [zfs]
[ 719.263122] [] dsl_pool_sync+0x20b/0x490 [zfs]
[ 719.263131] [] spa_sync+0x3a2/0x10b0 [zfs]
[ 719.263139] [] txg_init+0x4cb/0x9b0 [zfs]
[ 719.263148] [] ? txg_init+0x210/0x9b0 [zfs]
[ 719.263150] [] ? __thread_create+0x310/0x3a0 [spl]
[ 719.263153] [] __thread_create+0x383/0x3a0 [spl]
[ 719.263155] [] ? __thread_create+0x310/0x3a0 [spl]
[ 719.263157] [] kthread+0x96/0xa0
[ 719.263160] [] kernel_thread_helper+0x4/0x10
[ 719.263162] [] ? retint_restore_args+0xe/0xe
[ 719.263164] [] ? kthread_worker_fn+0x140/0x140
[ 719.263166] [] ? gs_change+0xb/0xb
[ 719.263183] INFO: task zfs:10166 blocked for more than 120 seconds.
[ 719.263184] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 719.263185] zfs D ffff88053fc52400 0 10166 10125 0x00000000
[ 719.263187] ffff8804d717fbe8 0000000000000086 ffff880521e824c0 0000000000012400
[ 719.263190] ffff8804d717ffd8 ffff8804d717e000 0000000000012400 0000000000004000
[ 719.263192] ffff8804d717ffd8 0000000000012400 ffff88052a1380c0 ffff880521e824c0
[ 719.263195] Call Trace:
[ 719.263197] [] ? spl_debug_get_subsys+0x51/0x460 [spl]
[ 719.263200] [] ? spl_debug_msg+0x413/0x970 [spl]
[ 719.263201] [] schedule+0x24/0x70
[ 719.263204] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 719.263206] [] ? wake_up_bit+0x40/0x40
[ 719.263208] [] __cv_wait+0xe/0x10 [spl]
[ 719.263216] [] txg_wait_synced+0xb3/0x190 [zfs]
[ 719.263225] [] dsl_sync_task_group_wait+0x109/0x240 [zfs]
[ 719.263232] [] ? dsl_dataset_disown+0x240/0x240 [zfs]
[ 719.263239] [] ? dsl_destroy_inconsistent+0x1c0/0x1c0 [zfs]
[ 719.263247] [] dsl_sync_task_do+0x54/0x80 [zfs]
[ 719.263254] [] dsl_dataset_destroy+0x132/0x490 [zfs]
[ 719.263261] [] ? dsl_dataset_tryown+0x49/0x120 [zfs]
[ 719.263268] [] dmu_objset_destroy+0x36/0x40 [zfs]
[ 719.263274] [] zfs_unmount_snap+0x253/0x480 [zfs]
[ 719.263280] [] pool_status_check+0x196/0x270 [zfs]
[ 719.263283] [] do_vfs_ioctl+0x96/0x500
[ 719.263285] [] ? alloc_fd+0x47/0x140
[ 719.263288] [] ? trace_hardirqs_off_thunk+0x3a/0x6c
[ 719.263289] [] sys_ioctl+0x4a/0x80
[ 719.263291] [] system_call_fastpath+0x16/0x1b
[ 719.263293] INFO: task rm:10209 blocked for more than 120 seconds.
[ 719.263294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 719.263295] rm D ffff88053fc52400 0 10209 10168 0x00000000
[ 719.263297] ffff8804dfff5b48 0000000000000086 ffff880515a5b7e0 0000000000012400
[ 719.263300] ffff8804dfff5fd8 ffff8804dfff4000 0000000000012400 0000000000004000
[ 719.263302] ffff8804dfff5fd8 0000000000012400 ffff88052a1380c0 ffff880515a5b7e0
[ 719.263304] Call Trace:
[ 719.263306] [] ? _raw_spin_unlock_irqrestore+0x2f/0x40
[ 719.263308] [] ? try_to_wake_up+0xce/0x2a0
[ 719.263310] [] schedule+0x24/0x70
[ 719.263312] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 719.263314] [] ? wake_up_bit+0x40/0x40
[ 719.263316] [] __cv_wait+0xe/0x10 [spl]
[ 719.263324] [] txg_wait_open+0x83/0x110 [zfs]
[ 719.263331] [] dmu_tx_wait+0xed/0xf0 [zfs]
[ 719.263338] [] dmu_tx_assign+0x86/0xe90 [zfs]
[ 719.263343] [] ? dmu_buf_rele+0x2b/0x40 [zfs]
[ 719.263350] [] zfs_rmnode+0x127/0x340 [zfs]
[ 719.263355] [] zfs_zinactive+0x8b/0x100 [zfs]
[ 719.263360] [] zfs_inactive+0x66/0x210 [zfs]
[ 719.263365] [] zpl_vap_init+0x603/0x6c0 [zfs]
[ 719.263367] [] evict+0xa6/0x1b0
[ 719.263369] [] iput+0x103/0x210
[ 719.263371] [] do_unlinkat+0x157/0x1c0
[ 719.263374] [] ? sys_newfstatat+0x2e/0x40
[ 719.263375] [] ? trace_hardirqs_on_thunk+0x3a/0x3c
[ 719.263377] [] sys_unlinkat+0x1d/0x40
[ 719.263379] [] system_call_fastpath+0x16/0x1b
[ 839.000555] INFO: task txg_sync:6809 blocked for more than 120 seconds.
[ 839.000559] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 839.000563] txg_sync D ffff88053fcd2400 0 6809 2 0x00000000
[ 839.000570] ffff8805008ef9e0 0000000000000046 ffff880521e82b60 0000000000012400
[ 839.000580] ffff8805008effd8 ffff8805008ee000 0000000000012400 0000000000004000
[ 839.000590] ffff8805008effd8 0000000000012400 ffff88052a208ec0 ffff880521e82b60
[ 839.000598] Call Trace:
[ 839.000609] [] ? _raw_spin_unlock_irqrestore+0x2f/0x40
[ 839.000617] [] ? try_to_wake_up+0xce/0x2a0
[ 839.000622] [] schedule+0x24/0x70
[ 839.000633] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 839.000639] [] ? wake_up_bit+0x40/0x40
[ 839.000646] [] __cv_wait+0xe/0x10 [spl]
[ 839.000661] [] zio_wait+0xf3/0x1b0 [zfs]
[ 839.000685] [] dsl_dataset_destroy_sync+0xa7c/0x1080 [zfs]
[ 839.000688] [] ? __cv_timedwait+0xc5/0x1a0 [spl]
[ 839.000696] [] dsl_sync_task_group_sync+0x123/0x3d0 [zfs]
[ 839.000704] [] dsl_pool_sync+0x20b/0x490 [zfs]
[ 839.000713] [] spa_sync+0x3a2/0x10b0 [zfs]
[ 839.000722] [] txg_init+0x4cb/0x9b0 [zfs]
[ 839.000733] [] ? txg_init+0x210/0x9b0 [zfs]
[ 839.000736] [] ? __thread_create+0x310/0x3a0 [spl]
[ 839.000739] [] __thread_create+0x383/0x3a0 [spl]
[ 839.000741] [] ? __thread_create+0x310/0x3a0 [spl]
[ 839.000743] [] kthread+0x96/0xa0
[ 839.000746] [] kernel_thread_helper+0x4/0x10
[ 839.000748] [] ? retint_restore_args+0xe/0xe
[ 839.000749] [] ? kthread_worker_fn+0x140/0x140
[ 839.000751] [] ? gs_change+0xb/0xb
[ 839.000769] INFO: task zfs:10166 blocked for more than 120 seconds.
[ 839.000770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 839.000771] zfs D ffff88053fc52400 0 10166 10125 0x00000000
[ 839.000773] ffff8804d717fbe8 0000000000000086 ffff880521e824c0 0000000000012400
[ 839.000776] ffff8804d717ffd8 ffff8804d717e000 0000000000012400 0000000000004000
[ 839.000778] ffff8804d717ffd8 0000000000012400 ffff88052a1380c0 ffff880521e824c0
[ 839.000780] Call Trace:
[ 839.000783] [] ? spl_debug_get_subsys+0x51/0x460 [spl]
[ 839.000785] [] ? spl_debug_msg+0x413/0x970 [spl]
[ 839.000787] [] schedule+0x24/0x70
[ 839.000790] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 839.000791] [] ? wake_up_bit+0x40/0x40
[ 839.000794] [] __cv_wait+0xe/0x10 [spl]
[ 839.000802] [] txg_wait_synced+0xb3/0x190 [zfs]
[ 839.000810] [] dsl_sync_task_group_wait+0x109/0x240 [zfs]
[ 839.000817] [] ? dsl_dataset_disown+0x240/0x240 [zfs]
[ 839.000825] [] ? dsl_destroy_inconsistent+0x1c0/0x1c0 [zfs]
[ 839.000833] [] dsl_sync_task_do+0x54/0x80 [zfs]
[ 839.000840] [] dsl_dataset_destroy+0x132/0x490 [zfs]
[ 839.000847] [] ? dsl_dataset_tryown+0x49/0x120 [zfs]
[ 839.000853] [] dmu_objset_destroy+0x36/0x40 [zfs]
[ 839.000860] [] zfs_unmount_snap+0x253/0x480 [zfs]
[ 839.000866] [] pool_status_check+0x196/0x270 [zfs]
[ 839.000868] [] do_vfs_ioctl+0x96/0x500
[ 839.000870] [] ? alloc_fd+0x47/0x140
[ 839.000873] [] ? trace_hardirqs_off_thunk+0x3a/0x6c
[ 839.000875] [] sys_ioctl+0x4a/0x80
[ 839.000877] [] system_call_fastpath+0x16/0x1b
[ 839.000878] INFO: task rm:10209 blocked for more than 120 seconds.
[ 839.000879] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 839.000880] rm D ffff88053fc52400 0 10209 10168 0x00000000
[ 839.000882] ffff8804dfff5b48 0000000000000086 ffff880515a5b7e0 0000000000012400
[ 839.000885] ffff8804dfff5fd8 ffff8804dfff4000 0000000000012400 0000000000004000
[ 839.000887] ffff8804dfff5fd8 0000000000012400 ffff88052a1380c0 ffff880515a5b7e0
[ 839.000889] Call Trace:
[ 839.000891] [] ? _raw_spin_unlock_irqrestore+0x2f/0x40
[ 839.000893] [] ? try_to_wake_up+0xce/0x2a0
[ 839.000895] [] schedule+0x24/0x70
[ 839.000897] [] __cv_timedwait+0xa7/0x1a0 [spl]
[ 839.000899] [] ? wake_up_bit+0x40/0x40
[ 839.000902] [] __cv_wait+0xe/0x10 [spl]
[ 839.000910] [] txg_wait_open+0x83/0x110 [zfs]
[ 839.000917] [] dmu_tx_wait+0xed/0xf0 [zfs]
[ 839.000923] [] dmu_tx_assign+0x86/0xe90 [zfs]
[ 839.000928] [] ? dmu_buf_rele+0x2b/0x40 [zfs]
[ 839.000935] [] zfs_rmnode+0x127/0x340 [zfs]
[ 839.000940] [] zfs_zinactive+0x8b/0x100 [zfs]
[ 839.000946] [] zfs_inactive+0x66/0x210 [zfs]
[ 839.000950] [] zpl_vap_init+0x603/0x6c0 [zfs]
[ 839.000953] [] evict+0xa6/0x1b0
[ 839.000954] [] iput+0x103/0x210
[ 839.000956] [] do_unlinkat+0x157/0x1c0
[ 839.000959] [] ? sys_newfstatat+0x2e/0x40
[ 839.000961] [] ? trace_hardirqs_on_thunk+0x3a/0x3c
[ 839.000962] [] sys_unlinkat+0x1d/0x40
[ 839.000964] [] system_call_fastpath+0x16/0x1b
[ 839.000966] INFO: task zfs:10227 blocked for more than 120 seconds.
[ 839.000967] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 839.000968] zfs D ffff88053fc92400 0 10227 10211 0x00000000
[ 839.000970] ffff8804bbfc7bf0 0000000000000082 ffff8804f66690c0 0000000000012400
[ 839.000972] ffff8804bbfc7fd8 ffff8804bbfc6000 0000000000012400 0000000000004000
[ 839.000975] ffff8804bbfc7fd8 0000000000012400 ffff88052a1f7520 ffff8804f66690c0
[ 839.000977] Call Trace:
[ 839.000979] [] ? vsnprintf+0x356/0x610
[ 839.000981] [] ? spl_debug_get_subsys+0x51/0x460 [spl]
[ 839.000984] [] ? spl_debug_msg+0x413/0x970 [spl]
[ 839.000992] [] ? spa_meta_objset+0x19/0x40 [zfs]
[ 839.000994] [] schedule+0x24/0x70
[ 839.000996] [] rwsem_down_failed_common+0xc5/0x150
[ 839.000998] [] rwsem_down_read_failed+0x15/0x17
[ 839.000999] [] call_rwsem_down_read_failed+0x14/0x30
[ 839.001001] [] ? down_read+0x12/0x20
[ 839.001009] [] dsl_dir_open_spa+0xb3/0x540 [zfs]
[ 839.001017] [] ? spa_meta_objset+0x19/0x40 [zfs]
[ 839.001019] [] ? avl_find+0x5a/0xa0 [zavl]
[ 839.001026] [] dsl_dataset_hold+0x3b/0x290 [zfs]
[ 839.001029] [] ? __kmalloc+0x13e/0x1c0
[ 839.001035] [] dmu_objset_hold+0x1f/0x1a0 [zfs]
[ 839.001042] [] zfs_secpolicy_smb_acl+0x2f84/0x3b20 [zfs]
[ 839.001048] [] ? pool_status_check+0x61/0x270 [zfs]
[ 839.001054] [] pool_status_check+0x196/0x270 [zfs]
[ 839.001056] [] do_vfs_ioctl+0x96/0x500
[ 839.001058] [] ? trace_hardirqs_off_thunk+0x3a/0x6c
[ 839.001059] [] sys_ioctl+0x4a/0x80
[ 839.001061] [] system_call_fastpath+0x16/0x1b

@behlendorf
Copy link
Contributor

The stacks are just advisory but they seem to indicate the txg_sync thread was block on an I/O request for quite some time. Perhaps one of your drives had a hiccup.

@ryanolf
Copy link

ryanolf commented Jul 10, 2012

I also frequently see lock-ups (sometimes the system seems to have crashed...) with similar traces in the logs. This happens typically when under a reasonably heavy load, e.g. when resilvering/scrubbing and/or snapshotting while rsyncing. I thought maybe this was related to zfs memory issues, but do you think it could be an issue with my disks or eSATA card? I am finding scrubs fix errors on them from time to time. They are in a TowerRAID eSATA external enclosure driven by a eSATA raid card (RocketRaid 622- but just running them JBOD), so there are many places to go wrong. I'm running the latest stable Ubuntu ppa as of July 10. Our array is an 8TB raidz2 with a 32 GB usb stick cache device. The array is about 75% full with compression and dedup enabled. The system has 8GB RAM. I'd very much appreciate you helping me understand these stack traces so that I can do something on my end to help stop the lockups, and that they help you with your great work on zfs.

Looking at other bug reports, it looks like this implementation of zfs has a variety of issues with memory and excessive disk access while under load. Since scrubbing takes so long on reasonably sized arrays, it seems that the system should be able to do this and handle other IO without locking up. Is there perhaps an underlying issue with prioritizing finite resources? This may seem to go under "optimization" but because zfs systems can easily grow large, some amount of optimization may be required for basic functionality.

Here are my kernel logs from the most recent lockup. The system recovered from this one:

[64198.775024] INFO: task txg_sync:3038 blocked for more than 120 seconds.
[64198.775027] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64198.775030] txg_sync        D 0000000000000003     0  3038      2 0x00000000
[64198.775033]  ffff8801edf4bbd0 0000000000000046 ffff8801edf4bb70 ffffffff8105fb02
[64198.775037]  ffff8801edf4bfd8 ffff8801edf4bfd8 ffff8801edf4bfd8 0000000000013780
[64198.775040]  ffff880209dda000 ffff8801f1dd5bc0 ffff8801edf4bbe0 ffff8801cc6231e8
[64198.775044] Call Trace:
[64198.775051]  [<ffffffff8105fb02>] ? default_wake_function+0x12/0x20
[64198.775056]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64198.775070]  [<ffffffffa019dae8>] cv_wait_common+0x98/0x190 [spl]
[64198.775073]  [<ffffffff8108b030>] ? add_wait_queue+0x60/0x60
[64198.775079]  [<ffffffffa019dc13>] __cv_wait+0x13/0x20 [spl]
[64198.775106]  [<ffffffffa02e6a6b>] zio_wait+0xfb/0x170 [zfs]
[64198.775123]  [<ffffffffa028032d>] dsl_pool_sync+0x29d/0x450 [zfs]
[64198.775141]  [<ffffffffa0293e2e>] spa_sync+0x38e/0xa00 [zfs]
[64198.775161]  [<ffffffffa02a3ef6>] txg_sync_thread+0x286/0x450 [zfs]
[64198.775180]  [<ffffffffa02a3c70>] ? txg_init+0x250/0x250 [zfs]
[64198.775185]  [<ffffffffa01970b8>] thread_generic_wrapper+0x78/0x90 [spl]
[64198.775191]  [<ffffffffa0197040>] ? __thread_create+0x310/0x310 [spl]
[64198.775193]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64198.775197]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64198.775200]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64198.775202]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.771919] INFO: task l2arc_feed:1088 blocked for more than 120 seconds.
[64318.771923] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.771925] l2arc_feed      D 0000000000000003     0  1088      2 0x00000000
[64318.771929]  ffff8801fd0d5cf0 0000000000000046 ffff8801fd0d5cd0 ffffffff8104c2e8
[64318.771933]  ffff8801fd0d5fd8 ffff8801fd0d5fd8 ffff8801fd0d5fd8 0000000000013780
[64318.771936]  ffff88017a3f6000 ffff88020067c4d0 ffff8801fd0d5d00 ffff8801fc066e08
[64318.771939] Call Trace:
[64318.771946]  [<ffffffff8104c2e8>] ? __wake_up_common+0x58/0x90
[64318.771954]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.771964]  [<ffffffffa019dae8>] cv_wait_common+0x98/0x190 [spl]
[64318.771968]  [<ffffffff8108b030>] ? add_wait_queue+0x60/0x60
[64318.771974]  [<ffffffffa019dc13>] __cv_wait+0x13/0x20 [spl]
[64318.771998]  [<ffffffffa02e6a6b>] zio_wait+0xfb/0x170 [zfs]
[64318.772009]  [<ffffffffa024e637>] l2arc_feed_thread+0x627/0x850 [zfs]
[64318.772020]  [<ffffffffa024e010>] ? l2arc_evict+0x310/0x310 [zfs]
[64318.772026]  [<ffffffffa01970b8>] thread_generic_wrapper+0x78/0x90 [spl]
[64318.772031]  [<ffffffffa0197040>] ? __thread_create+0x310/0x310 [spl]
[64318.772033]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.772037]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.772040]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.772042]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.772058] INFO: task z_null_iss/0:2977 blocked for more than 120 seconds.
[64318.772059] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.772061] z_null_iss/0    D 0000000000000000     0  2977      2 0x00000000
[64318.772063]  ffff8801f0e01cd0 0000000000000046 ffff880210a65c00 ffff8801b0b5c6c0
[64318.772067]  ffff8801f0e01fd8 ffff8801f0e01fd8 ffff8801f0e01fd8 0000000000013780
[64318.772070]  ffff8801fbb40000 ffff8801f944dbc0 ffffffffa02f7110 ffff8801ff7a9ee0
[64318.772073] Call Trace:
[64318.772076]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.772079]  [<ffffffff816588c7>] __mutex_lock_slowpath+0xd7/0x150
[64318.772082]  [<ffffffff816584da>] mutex_lock+0x2a/0x50
[64318.772101]  [<ffffffffa02aebad>] vdev_queue_io+0x6d/0x120 [zfs]
[64318.772119]  [<ffffffffa02e6867>] zio_vdev_io_start+0x1e7/0x2f0 [zfs]
[64318.772136]  [<ffffffffa02e431f>] zio_execute+0x9f/0xf0 [zfs]
[64318.772142]  [<ffffffffa019822b>] taskq_thread+0x24b/0x5a0 [spl]
[64318.772145]  [<ffffffff810562da>] ? finish_task_switch+0x4a/0xf0
[64318.772148]  [<ffffffff8105faf0>] ? try_to_wake_up+0x200/0x200
[64318.772153]  [<ffffffffa0197fe0>] ? task_done+0x140/0x140 [spl]
[64318.772156]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.772158]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.772161]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.772163]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.772169] INFO: task txg_sync:3038 blocked for more than 120 seconds.
[64318.772171] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.772172] txg_sync        D 0000000000000003     0  3038      2 0x00000000
[64318.772175]  ffff8801edf4bbd0 0000000000000046 ffff8801edf4bb70 ffffffff8105fb02
[64318.772178]  ffff8801edf4bfd8 ffff8801edf4bfd8 ffff8801edf4bfd8 0000000000013780
[64318.772181]  ffff880209dda000 ffff8801f1dd5bc0 ffff8801edf4bbe0 ffff8801cc6231e8
[64318.772184] Call Trace:
[64318.772187]  [<ffffffff8105fb02>] ? default_wake_function+0x12/0x20
[64318.772189]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.772195]  [<ffffffffa019dae8>] cv_wait_common+0x98/0x190 [spl]
[64318.772198]  [<ffffffff8108b030>] ? add_wait_queue+0x60/0x60
[64318.772203]  [<ffffffffa019dc13>] __cv_wait+0x13/0x20 [spl]
[64318.772220]  [<ffffffffa02e6a6b>] zio_wait+0xfb/0x170 [zfs]
[64318.772237]  [<ffffffffa028032d>] dsl_pool_sync+0x29d/0x450 [zfs]
[64318.772255]  [<ffffffffa0293e2e>] spa_sync+0x38e/0xa00 [zfs]
[64318.772274]  [<ffffffffa02a3ef6>] txg_sync_thread+0x286/0x450 [zfs]
[64318.772292]  [<ffffffffa02a3c70>] ? txg_init+0x250/0x250 [zfs]
[64318.772297]  [<ffffffffa01970b8>] thread_generic_wrapper+0x78/0x90 [spl]
[64318.772302]  [<ffffffffa0197040>] ? __thread_create+0x310/0x310 [spl]
[64318.772304]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.772307]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.772310]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.772312]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.772328] INFO: task kworker/2:3:5687 blocked for more than 120 seconds.
[64318.772329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.772331] kworker/2:3     D 0000000000000002     0  5687      2 0x00000000
[64318.772334]  ffff8801ed3e5c50 0000000000000046 ffff8801ed3e5c10 0000000000000000
[64318.772337]  ffff8801ed3e5fd8 ffff8801ed3e5fd8 ffff8801ed3e5fd8 0000000000013780
[64318.772340]  ffff8800a8e5c4d0 ffff88006b6244d0 ffff8800bf574800 ffff8801f01544b0
[64318.772343] Call Trace:
[64318.772346]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.772349]  [<ffffffff816588c7>] __mutex_lock_slowpath+0xd7/0x150
[64318.772352]  [<ffffffff816584da>] mutex_lock+0x2a/0x50
[64318.772369]  [<ffffffffa02bdaa8>] zfsctl_unmount_snapshot+0x58/0xf0 [zfs]
[64318.772372]  [<ffffffff81057600>] ? dequeue_entity+0x120/0x200
[64318.772375]  [<ffffffff81311bad>] ? __rb_erase_color+0x11d/0x1c0
[64318.772379]  [<ffffffff810126e5>] ? __switch_to+0xf5/0x360
[64318.772381]  [<ffffffff8105613d>] ? set_next_entity+0xad/0xd0
[64318.772398]  [<ffffffffa02bdb40>] ? zfsctl_unmount_snapshot+0xf0/0xf0 [zfs]
[64318.772415]  [<ffffffffa02bdb6e>] zfsctl_expire_snapshot+0x2e/0x60 [zfs]
[64318.772418]  [<ffffffff81084f9a>] process_one_work+0x11a/0x480
[64318.772421]  [<ffffffff81085d44>] worker_thread+0x164/0x370
[64318.772424]  [<ffffffff81085be0>] ? manage_workers.isra.29+0x130/0x130
[64318.772427]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.772429]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.772432]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.772434]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.772436] INFO: task kworker/2:4:5695 blocked for more than 120 seconds.
[64318.772438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.772439] kworker/2:4     D 0000000000000002     0  5695      2 0x00000000
[64318.772442]  ffff88015380bb90 0000000000000046 ffff880215802900 0000000000000000
[64318.772445]  ffff88015380bfd8 ffff88015380bfd8 ffff88015380bfd8 0000000000013780
[64318.772448]  ffff88020030dbc0 ffff8800a8e5c4d0 ffff8801f534f458 7fffffffffffffff
[64318.772451] Call Trace:
[64318.772454]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.772457]  [<ffffffff81658105>] schedule_timeout+0x2a5/0x320
[64318.772459]  [<ffffffff81053a6c>] ? select_idle_sibling+0xcc/0x230
[64318.772462]  [<ffffffff816578ff>] wait_for_common+0xdf/0x180
[64318.772465]  [<ffffffff8105faf0>] ? try_to_wake_up+0x200/0x200
[64318.772468]  [<ffffffff81657a7d>] wait_for_completion+0x1d/0x20
[64318.772471]  [<ffffffff8108a76f>] kthread_create_on_node+0xaf/0x130
[64318.772473]  [<ffffffff81085be0>] ? manage_workers.isra.29+0x130/0x130
[64318.772476]  [<ffffffff81083874>] ? alloc_worker+0x24/0x70
[64318.772479]  [<ffffffff81085936>] create_worker+0x136/0x1b0
[64318.772482]  [<ffffffff81085a15>] maybe_create_worker+0x65/0x100
[64318.772484]  [<ffffffff81085af8>] manage_workers.isra.29+0x48/0x130
[64318.772487]  [<ffffffff81085ecb>] worker_thread+0x2eb/0x370
[64318.772489]  [<ffffffff81085be0>] ? manage_workers.isra.29+0x130/0x130
[64318.772492]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.772494]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.772497]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.772499]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.772501] INFO: task kworker/0:1:13440 blocked for more than 120 seconds.
[64318.772502] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.772504] kworker/0:1     D 0000000000000000     0 13440      2 0x00000000
[64318.772507]  ffff880132483ad0 0000000000000046 ffffffff8111a0a5 0000000000011220
[64318.772510]  ffff880132483fd8 ffff880132483fd8 ffff880132483fd8 0000000000013780
[64318.772513]  ffff8800a8e5ade0 ffff880073d2c4d0 ffff880073d2c4d0 7fffffffffffffff
[64318.772516] Call Trace:
[64318.772520]  [<ffffffff8111a0a5>] ? mempool_alloc_slab+0x15/0x20
[64318.772523]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.772525]  [<ffffffff81658105>] schedule_timeout+0x2a5/0x320
[64318.772529]  [<ffffffff8103dcf9>] ? default_spin_lock_flags+0x9/0x10
[64318.772531]  [<ffffffff8105fa7e>] ? try_to_wake_up+0x18e/0x200
[64318.772534]  [<ffffffff816578ff>] wait_for_common+0xdf/0x180
[64318.772537]  [<ffffffff8105faf0>] ? try_to_wake_up+0x200/0x200
[64318.772540]  [<ffffffff81657a7d>] wait_for_completion+0x1d/0x20
[64318.772542]  [<ffffffff81081939>] call_usermodehelper_exec+0x109/0x120
[64318.772559]  [<ffffffffa02bcf7d>] __zfsctl_unmount_snapshot+0xad/0x120 [zfs]
[64318.772576]  [<ffffffffa02bdade>] zfsctl_unmount_snapshot+0x8e/0xf0 [zfs]
[64318.772578]  [<ffffffff81057600>] ? dequeue_entity+0x120/0x200
[64318.772581]  [<ffffffff81311bad>] ? __rb_erase_color+0x11d/0x1c0
[64318.772584]  [<ffffffff810126e5>] ? __switch_to+0xf5/0x360
[64318.772586]  [<ffffffff8105613d>] ? set_next_entity+0xad/0xd0
[64318.772601]  [<ffffffffa02bdb40>] ? zfsctl_unmount_snapshot+0xf0/0xf0 [zfs]
[64318.772616]  [<ffffffffa02bdb6e>] zfsctl_expire_snapshot+0x2e/0x60 [zfs]
[64318.772619]  [<ffffffff81084f9a>] process_one_work+0x11a/0x480
[64318.772622]  [<ffffffff81085d44>] worker_thread+0x164/0x370
[64318.772625]  [<ffffffff81085be0>] ? manage_workers.isra.29+0x130/0x130
[64318.772627]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.772630]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.772633]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.772635]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.772637] INFO: task rsync:13631 blocked for more than 120 seconds.
[64318.772639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.772640] rsync           D 0000000000000002     0 13631  13630 0x00000000
[64318.772643]  ffff8801eeea56a8 0000000000000082 ffff88015899e540 ffff8801ffa30908
[64318.772646]  ffff8801eeea5fd8 ffff8801eeea5fd8 ffff8801eeea5fd8 0000000000013780
[64318.772649]  ffff88020067ade0 ffff8800a8e58000 ffff8801eeea56b8 ffff8801c295aa28
[64318.772653] Call Trace:
[64318.772655]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.772661]  [<ffffffffa019dae8>] cv_wait_common+0x98/0x190 [spl]
[64318.772663]  [<ffffffff8108b030>] ? add_wait_queue+0x60/0x60
[64318.772669]  [<ffffffffa019dc13>] __cv_wait+0x13/0x20 [spl]
[64318.772686]  [<ffffffffa02e6a6b>] zio_wait+0xfb/0x170 [zfs]
[64318.772698]  [<ffffffffa0256797>] dbuf_read+0x337/0x7b0 [zfs]
[64318.772711]  [<ffffffffa025d490>] dmu_buf_hold+0xf0/0x1a0 [zfs]
[64318.772728]  [<ffffffffa02b1bb9>] zap_idx_to_blk+0xa9/0x140 [zfs]
[64318.772741]  [<ffffffffa025c3c0>] ? ddt_zap_update+0xb0/0xb0 [zfs]
[64318.772758]  [<ffffffffa02b33ea>] fzap_prefetch+0x3a/0x70 [zfs]
[64318.772775]  [<ffffffffa02b84e6>] zap_prefetch_uint64+0x76/0xa0 [zfs]
[64318.772788]  [<ffffffffa025c3d3>] ddt_zap_prefetch+0x13/0x20 [zfs]
[64318.772800]  [<ffffffffa025b14c>] ddt_prefetch+0xbc/0xc0 [zfs]
[64318.772812]  [<ffffffffa02569f3>] ? dbuf_read+0x593/0x7b0 [zfs]
[64318.772817]  [<ffffffffa01939fb>] ? kmem_free_debug+0x4b/0x150 [spl]
[64318.772829]  [<ffffffffa0257919>] ? dbuf_hold_impl+0x99/0xc0 [zfs]
[64318.772840]  [<ffffffffa0256d69>] ? dbuf_rele_and_unlock+0x159/0x200 [zfs]
[64318.772843]  [<ffffffff816584cd>] ? mutex_lock+0x1d/0x50
[64318.772853]  [<ffffffffa024d076>] ? add_reference.isra.9+0xe6/0x120 [zfs]
[64318.772857]  [<ffffffff816584cd>] ? mutex_lock+0x1d/0x50
[64318.772868]  [<ffffffffa02569f3>] ? dbuf_read+0x593/0x7b0 [zfs]
[64318.772871]  [<ffffffff816584cd>] ? mutex_lock+0x1d/0x50
[64318.772874]  [<ffffffff816584cd>] ? mutex_lock+0x1d/0x50
[64318.772891]  [<ffffffffa029fa7d>] ? spa_config_exit+0x7d/0xc0 [zfs]
[64318.772906]  [<ffffffffa0272cb0>] dsl_dataset_block_freeable+0x40/0x60 [zfs]
[64318.772919]  [<ffffffffa026a4f4>] dmu_tx_hold_free+0x464/0x7c0 [zfs]
[64318.772932]  [<ffffffffa025ce37>] dmu_free_long_range_impl+0x177/0x270 [zfs]
[64318.772945]  [<ffffffffa025ec1b>] dmu_free_long_range+0x4b/0x70 [zfs]
[64318.772962]  [<ffffffffa02bf0e0>] zfs_rmnode+0x60/0x340 [zfs]
[64318.772980]  [<ffffffffa02de8f6>] zfs_zinactive+0x86/0xd0 [zfs]
[64318.772997]  [<ffffffffa02d998b>] zfs_inactive+0x6b/0x1e0 [zfs]
[64318.773000]  [<ffffffff811256be>] ? truncate_pagecache+0x5e/0x70
[64318.773017]  [<ffffffffa02ecba8>] zpl_evict_inode+0x28/0x30 [zfs]
[64318.773021]  [<ffffffff811927b8>] evict+0xb8/0x1c0
[64318.773023]  [<ffffffff811929a8>] iput_final+0xe8/0x210
[64318.773025]  [<ffffffff81192b0e>] iput+0x3e/0x50
[64318.773028]  [<ffffffff81187533>] do_unlinkat+0x153/0x1d0
[64318.773031]  [<ffffffff81177fc0>] ? vfs_write+0x110/0x180
[64318.773033]  [<ffffffff8117828a>] ? sys_write+0x4a/0x90
[64318.773036]  [<ffffffff81188106>] sys_unlink+0x16/0x20
[64318.773038]  [<ffffffff81661fc2>] system_call_fastpath+0x16/0x1b
[64318.773040] INFO: task kworker/0:0:13635 blocked for more than 120 seconds.
[64318.773042] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.773043] kworker/0:0     D 0000000000000000     0 13635      2 0x00000000
[64318.773046]  ffff8801e8d97c50 0000000000000046 ffff8801e8d97c10 0000000000000000
[64318.773049]  ffff8801e8d97fd8 ffff8801e8d97fd8 ffff8801e8d97fd8 0000000000013780
[64318.773053]  ffff8801252fade0 ffff8800a8e5ade0 ffff8801a6060200 ffff8801f01544b0
[64318.773056] Call Trace:
[64318.773058]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.773061]  [<ffffffff816588c7>] __mutex_lock_slowpath+0xd7/0x150
[64318.773064]  [<ffffffff816584da>] mutex_lock+0x2a/0x50
[64318.773081]  [<ffffffffa02bdaa8>] zfsctl_unmount_snapshot+0x58/0xf0 [zfs]
[64318.773083]  [<ffffffff81057600>] ? dequeue_entity+0x120/0x200
[64318.773086]  [<ffffffff810126e5>] ? __switch_to+0xf5/0x360
[64318.773088]  [<ffffffff8105613d>] ? set_next_entity+0xad/0xd0
[64318.773104]  [<ffffffffa02bdb40>] ? zfsctl_unmount_snapshot+0xf0/0xf0 [zfs]
[64318.773119]  [<ffffffffa02bdb6e>] zfsctl_expire_snapshot+0x2e/0x60 [zfs]
[64318.773121]  [<ffffffff81084f9a>] process_one_work+0x11a/0x480
[64318.773124]  [<ffffffff81085d44>] worker_thread+0x164/0x370
[64318.773127]  [<ffffffff81085be0>] ? manage_workers.isra.29+0x130/0x130
[64318.773129]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.773132]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.773134]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.773137]  [<ffffffff81664130>] ? gs_change+0x13/0x13
[64318.773139] INFO: task kworker/0:2:20829 blocked for more than 120 seconds.
[64318.773140] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[64318.773142] kworker/0:2     D 0000000000000000     0 20829      2 0x00000000
[64318.773144]  ffff88015ab1fb90 0000000000000046 ffff880034abd870 000000000bd28000
[64318.773148]  ffff88015ab1ffd8 ffff88015ab1ffd8 ffff88015ab1ffd8 0000000000013780
[64318.773151]  ffff880201fdc4d0 ffff8801252fade0 ffff8801a0094b58 7fffffffffffffff
[64318.773154] Call Trace:
[64318.773156]  [<ffffffff81657abf>] schedule+0x3f/0x60
[64318.773159]  [<ffffffff81658105>] schedule_timeout+0x2a5/0x320
[64318.773161]  [<ffffffff81053a6c>] ? select_idle_sibling+0xcc/0x230
[64318.773164]  [<ffffffff816578ff>] wait_for_common+0xdf/0x180
[64318.773167]  [<ffffffff8105faf0>] ? try_to_wake_up+0x200/0x200
[64318.773170]  [<ffffffff81657a7d>] wait_for_completion+0x1d/0x20
[64318.773172]  [<ffffffff8108a76f>] kthread_create_on_node+0xaf/0x130
[64318.773175]  [<ffffffff81085be0>] ? manage_workers.isra.29+0x130/0x130
[64318.773178]  [<ffffffff81083874>] ? alloc_worker+0x24/0x70
[64318.773180]  [<ffffffff81085936>] create_worker+0x136/0x1b0
[64318.773183]  [<ffffffff81085a15>] maybe_create_worker+0x65/0x100
[64318.773186]  [<ffffffff81085af8>] manage_workers.isra.29+0x48/0x130
[64318.773189]  [<ffffffff81085ecb>] worker_thread+0x2eb/0x370
[64318.773191]  [<ffffffff81085be0>] ? manage_workers.isra.29+0x130/0x130
[64318.773194]  [<ffffffff8108a59c>] kthread+0x8c/0xa0
[64318.773196]  [<ffffffff81664134>] kernel_thread_helper+0x4/0x10
[64318.773199]  [<ffffffff8108a510>] ? flush_kthread_worker+0xa0/0xa0
[64318.773201]  [<ffffffff81664130>] ? gs_change+0x13/0x13

@ryao
Copy link
Contributor

ryao commented Oct 11, 2012

@mooinglemur Is this still an issue in 0.6.0-rc11?

@dswartz
Copy link
Contributor

dswartz commented Oct 18, 2012

I just had this happen this morning. Running the most recent zfsonlinux on ubuntu12. It completely b0rked my ESXi host which had a datastore mounted on the zfsonlinux VM. I had to reboot the whole host from the previous SAN (openindiana [a solaris variant]), since my wife uses a virtualized windows XP for telecommuting, and the WAF took a big hit due to this. I was migrating datasets between disks when this happened.

@elan
Copy link

elan commented Nov 3, 2012

3.2.0-32-generic #51-Ubuntu SMP, zfs-0.6.0-rc11

I just saw this today, seemed to make an AFP copy hang for a while, but no ill effects besides that observed:

Nov  1 21:06:26 whopper kernel: [30451.753316] umount          D ffffffff81806200     0 21237  21236 0x00000000
Nov  1 21:06:26 whopper kernel: [30451.753322]  ffff880219145d28 0000000000000082 ffff880219145cd8 ffffffffa012d069
Nov  1 21:06:26 whopper kernel: [30451.753327]  ffff880219145fd8 ffff880219145fd8 ffff880219145fd8 00000000000137c0
Nov  1 21:06:26 whopper kernel: [30451.753332]  ffff8804155e9700 ffff880151510000 ffff880219145d38 ffff8804150c8500
Nov  1 21:06:26 whopper kernel: [30451.753337] Call Trace:
Nov  1 21:06:26 whopper kernel: [30451.753368]  [<ffffffffa012d069>] ? dbuf_rele_and_unlock+0x159/0x200 [zfs]
Nov  1 21:06:26 whopper kernel: [30451.753375]  [<ffffffff81658f2f>] schedule+0x3f/0x60
Nov  1 21:06:26 whopper kernel: [30451.753386]  [<ffffffffa006bead>] __taskq_wait_id+0x7d/0x150 [spl]
Nov  1 21:06:26 whopper kernel: [30451.753391]  [<ffffffff811b76b4>] ? fsnotify_clear_marks_by_inode+0x94/0xf0
Nov  1 21:06:26 whopper kernel: [30451.753397]  [<ffffffff8108ab80>] ? add_wait_queue+0x60/0x60
Nov  1 21:06:26 whopper kernel: [30451.753405]  [<ffffffffa006cff3>] __taskq_wait+0x53/0xf0 [spl]
Nov  1 21:06:26 whopper kernel: [30451.753436]  [<ffffffffa0166679>] ? rrw_enter+0x69/0x190 [zfs]
Nov  1 21:06:26 whopper kernel: [30451.753470]  [<ffffffffa01abbfc>] zfs_sb_teardown+0x5c/0x3a0 [zfs]
Nov  1 21:06:26 whopper kernel: [30451.753475]  [<ffffffff81193033>] ? dispose_list+0x23/0x50
Nov  1 21:06:26 whopper kernel: [30451.753505]  [<ffffffffa01abfb0>] zfs_umount+0x30/0xc0 [zfs]
Nov  1 21:06:26 whopper kernel: [30451.753535]  [<ffffffffa01c710e>] zpl_put_super+0xe/0x10 [zfs]
Nov  1 21:06:26 whopper kernel: [30451.753540]  [<ffffffff8117a0e2>] generic_shutdown_super+0x62/0xe0
Nov  1 21:06:26 whopper kernel: [30451.753544]  [<ffffffff8117a1f6>] kill_anon_super+0x16/0x30
Nov  1 21:06:26 whopper kernel: [30451.753572]  [<ffffffffa01c6f0e>] zpl_kill_sb+0x1e/0x30 [zfs]
Nov  1 21:06:26 whopper kernel: [30451.753576]  [<ffffffff8117a83c>] deactivate_locked_super+0x3c/0xa0
Nov  1 21:06:26 whopper kernel: [30451.753580]  [<ffffffff8117b0be>] deactivate_super+0x4e/0x70
Nov  1 21:06:26 whopper kernel: [30451.753584]  [<ffffffff8119773d>] mntput_no_expire+0x9d/0xf0
Nov  1 21:06:26 whopper kernel: [30451.753588]  [<ffffffff81198a6b>] sys_umount+0x5b/0xd0
Nov  1 21:06:26 whopper kernel: [30451.753592]  [<ffffffff81663442>] system_call_fastpath+0x16/0x1b
Nov  1 21:08:26 whopper kernel: [30571.636117] INFO: task umount:21237 blocked for more than 120 seconds.
Nov  1 21:08:26 whopper kernel: [30571.636490] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov  1 21:08:26 whopper kernel: [30571.657955] umount          D ffffffff81806200     0 21237  21236 0x00000000
Nov  1 21:08:26 whopper kernel: [30571.657965]  ffff880219145d28 0000000000000082 ffff880219145cd8 ffffffffa012d069
Nov  1 21:08:26 whopper kernel: [30571.657978]  ffff880219145fd8 ffff880219145fd8 ffff880219145fd8 00000000000137c0
Nov  1 21:08:26 whopper kernel: [30571.657990]  ffff8804155e9700 ffff880151510000 ffff880219145d38 ffff8804150c8500
Nov  1 21:08:26 whopper kernel: [30571.658003] Call Trace:
Nov  1 21:08:26 whopper kernel: [30571.658044]  [<ffffffffa012d069>] ? dbuf_rele_and_unlock+0x159/0x200 [zfs]
Nov  1 21:08:26 whopper kernel: [30571.658058]  [<ffffffff81658f2f>] schedule+0x3f/0x60
Nov  1 21:08:26 whopper kernel: [30571.658075]  [<ffffffffa006bead>] __taskq_wait_id+0x7d/0x150 [spl]
Nov  1 21:08:26 whopper kernel: [30571.658083]  [<ffffffff811b76b4>] ? fsnotify_clear_marks_by_inode+0x94/0xf0
Nov  1 21:08:26 whopper kernel: [30571.658091]  [<ffffffff8108ab80>] ? add_wait_queue+0x60/0x60
Nov  1 21:08:26 whopper kernel: [30571.658102]  [<ffffffffa006cff3>] __taskq_wait+0x53/0xf0 [spl]
Nov  1 21:08:26 whopper kernel: [30571.658137]  [<ffffffffa0166679>] ? rrw_enter+0x69/0x190 [zfs]
Nov  1 21:08:26 whopper kernel: [30571.658174]  [<ffffffffa01abbfc>] zfs_sb_teardown+0x5c/0x3a0 [zfs]
Nov  1 21:08:26 whopper kernel: [30571.658181]  [<ffffffff81193033>] ? dispose_list+0x23/0x50
Nov  1 21:08:26 whopper kernel: [30571.658214]  [<ffffffffa01abfb0>] zfs_umount+0x30/0xc0 [zfs]
Nov  1 21:08:26 whopper kernel: [30571.658247]  [<ffffffffa01c710e>] zpl_put_super+0xe/0x10 [zfs]
Nov  1 21:08:26 whopper kernel: [30571.658254]  [<ffffffff8117a0e2>] generic_shutdown_super+0x62/0xe0
Nov  1 21:08:26 whopper kernel: [30571.658261]  [<ffffffff8117a1f6>] kill_anon_super+0x16/0x30
Nov  1 21:08:26 whopper kernel: [30571.658292]  [<ffffffffa01c6f0e>] zpl_kill_sb+0x1e/0x30 [zfs]
Nov  1 21:08:26 whopper kernel: [30571.658299]  [<ffffffff8117a83c>] deactivate_locked_super+0x3c/0xa0
Nov  1 21:08:26 whopper kernel: [30571.658305]  [<ffffffff8117b0be>] deactivate_super+0x4e/0x70
Nov  1 21:08:26 whopper kernel: [30571.658312]  [<ffffffff8119773d>] mntput_no_expire+0x9d/0xf0
Nov  1 21:08:26 whopper kernel: [30571.658319]  [<ffffffff81198a6b>] sys_umount+0x5b/0xd0
Nov  1 21:08:26 whopper kernel: [30571.658325]  [<ffffffff81663442>] system_call_fastpath+0x16/0x1b
Nov  1 21:10:26 whopper kernel: [30691.539760] INFO: task umount:21237 blocked for more than 120 seconds.
Nov  1 21:10:26 whopper kernel: [30691.551243] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov  1 21:10:26 whopper kernel: [30691.573174] umount          D ffffffff81806200     0 21237  21236 0x00000000
Nov  1 21:10:26 whopper kernel: [30691.573180]  ffff880219145d28 0000000000000082 ffff880219145cd8 ffffffffa012d069
Nov  1 21:10:26 whopper kernel: [30691.573186]  ffff880219145fd8 ffff880219145fd8 ffff880219145fd8 00000000000137c0
Nov  1 21:10:26 whopper kernel: [30691.573190]  ffff8804155e9700 ffff880151510000 ffff880219145d38 ffff8804150c8500
Nov  1 21:10:26 whopper kernel: [30691.573195] Call Trace:
Nov  1 21:10:26 whopper kernel: [30691.573228]  [<ffffffffa012d069>] ? dbuf_rele_and_unlock+0x159/0x200 [zfs]
Nov  1 21:10:26 whopper kernel: [30691.573239]  [<ffffffff81658f2f>] schedule+0x3f/0x60
Nov  1 21:10:26 whopper kernel: [30691.573253]  [<ffffffffa006bead>] __taskq_wait_id+0x7d/0x150 [spl]
Nov  1 21:10:26 whopper kernel: [30691.573260]  [<ffffffff811b76b4>] ? fsnotify_clear_marks_by_inode+0x94/0xf0
Nov  1 21:10:26 whopper kernel: [30691.573269]  [<ffffffff8108ab80>] ? add_wait_queue+0x60/0x60
Nov  1 21:10:26 whopper kernel: [30691.573280]  [<ffffffffa006cff3>] __taskq_wait+0x53/0xf0 [spl]
Nov  1 21:10:26 whopper kernel: [30691.573314]  [<ffffffffa0166679>] ? rrw_enter+0x69/0x190 [zfs]
Nov  1 21:10:26 whopper kernel: [30691.573350]  [<ffffffffa01abbfc>] zfs_sb_teardown+0x5c/0x3a0 [zfs]
Nov  1 21:10:26 whopper kernel: [30691.573357]  [<ffffffff81193033>] ? dispose_list+0x23/0x50
Nov  1 21:10:26 whopper kernel: [30691.573390]  [<ffffffffa01abfb0>] zfs_umount+0x30/0xc0 [zfs]
Nov  1 21:10:26 whopper kernel: [30691.573424]  [<ffffffffa01c710e>] zpl_put_super+0xe/0x10 [zfs]
Nov  1 21:10:26 whopper kernel: [30691.573431]  [<ffffffff8117a0e2>] generic_shutdown_super+0x62/0xe0
Nov  1 21:10:26 whopper kernel: [30691.573438]  [<ffffffff8117a1f6>] kill_anon_super+0x16/0x30
Nov  1 21:10:26 whopper kernel: [30691.573469]  [<ffffffffa01c6f0e>] zpl_kill_sb+0x1e/0x30 [zfs]
Nov  1 21:10:26 whopper kernel: [30691.573476]  [<ffffffff8117a83c>] deactivate_locked_super+0x3c/0xa0
Nov  1 21:10:26 whopper kernel: [30691.573482]  [<ffffffff8117b0be>] deactivate_super+0x4e/0x70
Nov  1 21:10:26 whopper kernel: [30691.573489]  [<ffffffff8119773d>] mntput_no_expire+0x9d/0xf0
Nov  1 21:10:26 whopper kernel: [30691.573496]  [<ffffffff81198a6b>] sys_umount+0x5b/0xd0
Nov  1 21:10:26 whopper kernel: [30691.573503]  [<ffffffff81663442>] system_call_fastpath+0x16/0x1b

@behlendorf
Copy link
Contributor

It looks like there are a couple unrelated issues mentioned here.

The original problem of zio_wait() being blocked for 120 seconds has been addressed in rc12. That message just indicates a particularly slow IO to the devices, not that anything has done wrong. As such it's been converted in to a zfs event and the console message suppressed to avoid confusion.

The latter issues look slightly different but there's not a ton of data to go on. So I'm going to close this issue and we can open a new one if these issues persist with -rc12 and latter.

pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023
…penzfs#796)

= Description

This commit allows us to add an object store bucket as a vdev in an
existing pooll and it is the first part of the DOSE Migration project.

= Note: Forcing Addition From `zpool add`

Attempting to add an object-store vdev without `-f` yields the following
error message:
```
$ sudo zpool add -o object-endpoint=etc.. testpool s3 cloudburst-data-2
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses disk and new vdev is objectstore
```

This is done on purpose for now. Adding an objects-store vdev to a pool
is an irreversible operation and should be handled with caution.

= Note: Syncing Labels & The Uberblock

When starting from a block-based pool and we add an object-store vdev
there is a point where we have an object-store vdev in our config but
that vdev is not accepting allocations and therefore we can sync the
config to it. That point is the exact TXG where the vdev is added and
we need to sync its config changes to the labels & uberblock of our
block-based vdevs.

For this reason, I adjusted all the codepaths under `vdev_config_sync()`
to be able to handle the update of the labels and uberblock of the local
devices even when there is an object-store vdev. This way, if the next
TXG fails we have the new vdev somewhere on our config. For all TXGs
from that point on, we always sync the object store's config first. This
is also the config that we always look at first when opening the pool.

With the above changes in `vdev_config_sync()` changes the behavior of
existing pure (e.g. non-hybrid) object-store pools to occasionally
update the labels of their slog devices (e.g. every time we dirty the
pool's config). This should not really have any negative effect in
existing pure object-store pools. On the contrary it should keep
their labels up to date and potentially fix any extreme corner cases
in pool import.

= Note: ZIL allocations

When the pool is in a hybrid state (e.g. backed by both an object store
and block devices) with no slog devices, we could make zil allocations
fall back to the embedded slog or normal class. I left that
functionality as future work.  This is not a prerequisite for DOSE
migration as customers are expected to add zettacache(+ slog) devices as
the first part of migration and therefore their VMs will always have at
least one slog device when the object store vdev is added.

= Note: Storage Pool Checkpoint

Pure object-based pools (e.g. not hybrid ones) do the checkpoint
rewinding process in the object agent. This is a different mechanism
from the storage pool checkpoint in block-based pools. Until we have
the need to make those two mechanism work well with each other we
avoid any migrations to the object store while a zpool checkpoint is
in effect. See `spa_ld_checkpoint_rewind()` usage in
`spa_load_impl()` for more info.

= Note: Ordering of import paths

To import a hybrid pool we need to specify two import paths: (1)
the path of the local block devices (e.g. `/dev/..etc`) and (2)
the name of the bucket in the object store. Unfortunately given
how `zpool_find_import_agent()` is implemented, importing hybrid
pools only works if we specify (2) first and then (1) but not the
opposite. Doing the opposite results in the zpool command hanging
(again this is because of the current XXX in the aforementioned
function).

= Note: Testing Lossing Power Mid-Addition of the Object Store vdev

I tested that by manually introducing panics like so:
```
diff --git a/module/zfs/spa.c b/module/zfs/spa.c
index 5b55bb275..a82fab841 100644
--- a/module/zfs/spa.c
+++ b/module/zfs/spa.c
@@ -6969,7 +6969,9 @@ spa_vdev_add(spa_t *spa, nvlist_t *nvroot)
         * if we lose power at any point in this sequence, ...
         * steps will be completed the next time we load ...
         */
+       ASSERT(B_FALSE); // <--- panic before config sync
        (void) spa_vdev_exit(spa, vd, txg, 0);
+       ASSERT(B_FALSE); // <--- panic before config sync

        mutex_enter(&spa_namespace_lock);
        spa_config_update(spa, SPA_CONFIG_UPDATE_POOL);
```

Importing the pool after the first panic we come back without
the object store vdev as expected. Importing the pool after the
second panic, we come back with the object store but don't
allocate from it nor change the spa_pool_type until we create
its metaslabs on disk.

= Next Steps

I'm planning to implement the augmented device removal logic which
evacuates data from the block devices to the object-store. Once that is
done, I'm planning to work on avoiding the insertion of all the
evacuated data in the zettacache.

Signed-off-by: Serapheim Dimitropoulos <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants