
Process stuck at zfs_fsync #10440

Closed
sebu06 opened this issue Jun 11, 2020 · 15 comments
Labels
Status: Stale (no recent activity for issue) · Type: Defect (incorrect behavior, e.g. crash, hang)

Comments

sebu06 commented Jun 11, 2020

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  Buster
Linux Kernel          5.4.41-1-pve
Architecture          x86_64
ZFS Version           0.8.4-pve1
SPL Version           0.8.4-pve1

I'm using the tool backy2 to back up a Ceph cluster to a raidz ZFS volume. The backy2 process uses a SQLite database stored in a dataset on the backup pool. Sometimes during the backup, the process gets stuck (D state). The stack trace shows that it is stuck in a ZFS task:

root@zds:/proc/2958595$ cat stack
[<0>] cv_wait_common+0x104/0x130 [spl]
[<0>] __cv_wait+0x15/0x20 [spl]
[<0>] zil_commit_impl+0x241/0xd90 [zfs]
[<0>] zil_commit+0x3d/0x60 [zfs]
[<0>] zfs_fsync+0x77/0xe0 [zfs]
[<0>] zpl_fsync+0x68/0xa0 [zfs]
[<0>] vfs_fsync_range+0x48/0x80
[<0>] do_fsync+0x3d/0x70
[<0>] __x64_sys_fdatasync+0x17/0x20
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

The problem does not happen every day, but it occurs regularly. It is quite annoying because it requires a reboot of the cluster node. The cluster is a Proxmox 6 cluster. The same setup worked for a long time on a Proxmox 5 cluster (kernel 4.15.18-29-pve, ZFS 0.7.13-pve1~bpo2, Debian Stretch), and the problems only started after the upgrade.

behlendorf added the Type: Defect (incorrect behavior, e.g. crash, hang) label Jun 11, 2020

sebu06 commented Jun 11, 2020

By the way, there is a process stuck at the moment which I did not kill, so if any additional data can be obtained from it, I'm happy to help.

sebu06 commented Jun 11, 2020

thanks, I'll try recompiling with that fix and let you know if it occurs again

@behlendorf
Contributor

This looks very similar to openzfsonosx/openzfs#3. There's a tiny proposed fix in openzfsonosx/openzfs@adeb452. This code was added in the 0.8.x releases, so that's consistent with your observation that it didn't happen with the earlier release.

snajpa commented Aug 21, 2020

This bug is still relevant for me with the current ZFS master (as of today).

lockdep reveals this; it is probably relevant, and I'm investigating further:

[  140.025805] ======================================================
[  140.025806] WARNING: possible circular locking dependency detected
[  140.025808] 5.9.0-rc1+ #51 Tainted: G           O     
[  140.025810] ------------------------------------------------------
[  140.025811] systemd-journal/3974 is trying to acquire lock:
[  140.025813] ffff94ba81261070 (&zilog->zl_lock){+.+.}-{3:3}, at: zil_alloc_lwb+0x1df/0x3e0 [zfs]
[  140.025970] 
               but task is already holding lock:
[  140.025971] ffff94ba9ba504f8 (&zcw->zcw_lock){+.+.}-{3:3}, at: zil_commit_impl+0x52b/0x1850 [zfs]
[  140.026100] 
               which lock already depends on the new lock.

[  140.026100] 
               the existing dependency chain (in reverse order) is:
[  140.026101] 
               -> #1 (&zcw->zcw_lock){+.+.}-{3:3}:
[  140.026110]        __mutex_lock+0xac/0x9e0
[  140.026113]        mutex_lock_nested+0x1b/0x20
[  140.026171]        zil_commit_waiter_link_lwb+0x51/0x1a0 [zfs]
[  140.026271]        zil_commit_impl+0x12b0/0x1850 [zfs]
[  140.026366]        zil_commit+0x43/0x60 [zfs]
[  140.026478]        zpl_writepages+0xf8/0x1a0 [zfs]
[  140.026540]        do_writepages+0x43/0xf0
[  140.026544]        __filemap_fdatawrite_range+0xd5/0x110
[  140.026546]        filemap_write_and_wait_range+0x4b/0xb0
[  140.026601]        zpl_fsync+0x4d/0xb0 [zfs]
[  140.026648]        vfs_fsync_range+0x49/0x80
[  140.026650]        do_fsync+0x3d/0x70
[  140.026653]        __x64_sys_fsync+0x14/0x20
[  140.026658]        do_syscall_64+0x38/0x90
[  140.026662]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  140.026663] 
               -> #0 (&zilog->zl_lock){+.+.}-{3:3}:
[  140.026670]        __lock_acquire+0x1298/0x2460
[  140.026673]        lock_acquire+0xab/0x380
[  140.026675]        __mutex_lock+0xac/0x9e0
[  140.026676]        mutex_lock_nested+0x1b/0x20
[  140.026719]        zil_alloc_lwb+0x1df/0x3e0 [zfs]
[  140.026803]        zil_lwb_write_issue+0x265/0x3f0 [zfs]
[  140.026886]        zil_commit_impl+0x577/0x1850 [zfs]
[  140.026969]        zil_commit+0x43/0x60 [zfs]
[  140.027051]        zpl_writepages+0xf8/0x1a0 [zfs]
[  140.027116]        do_writepages+0x43/0xf0
[  140.027119]        __filemap_fdatawrite_range+0xd5/0x110
[  140.027122]        filemap_write_and_wait_range+0x4b/0xb0
[  140.027172]        zpl_fsync+0x4d/0xb0 [zfs]
[  140.027215]        vfs_fsync_range+0x49/0x80
[  140.027217]        do_fsync+0x3d/0x70
[  140.027220]        __x64_sys_fsync+0x14/0x20
[  140.027223]        do_syscall_64+0x38/0x90
[  140.027226]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  140.027227] 
               other info that might help us debug this:

[  140.027228]  Possible unsafe locking scenario:

[  140.027228]        CPU0                    CPU1
[  140.027229]        ----                    ----
[  140.027229]   lock(&zcw->zcw_lock);
[  140.027231]                                lock(&zilog->zl_lock);
[  140.027232]                                lock(&zcw->zcw_lock);
[  140.027233]   lock(&zilog->zl_lock);
[  140.027234] 
                *** DEADLOCK ***

[  140.027236] 2 locks held by systemd-journal/3974:
[  140.027236]  #0: ffff94ba812612b8 (&zilog->zl_issuer_lock){+.+.}-{3:3}, at: zil_commit_impl+0x4dc/0x1850 [zfs]
[  140.027342]  #1: ffff94ba9ba504f8 (&zcw->zcw_lock){+.+.}-{3:3}, at: zil_commit_impl+0x52b/0x1850 [zfs]
[  140.027452] 
               stack backtrace:
[  140.027455] CPU: 13 PID: 25 Comm: systemd-journal Tainted: G           O      5.9.0-rc1+ #51
[  140.027457] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[  140.027459] Call Trace:
[  140.027467]  dump_stack+0x9f/0xda
[  140.027472]  print_circular_bug.isra.0.cold+0x13d/0x142
[  140.027476]  check_noncircular+0x169/0x180
[  140.027483]  __lock_acquire+0x1298/0x2460
[  140.027489]  lock_acquire+0xab/0x380
[  140.027545]  ? zil_alloc_lwb+0x1df/0x3e0 [zfs]
[  140.027609]  __mutex_lock+0xac/0x9e0
[  140.027657]  ? zil_alloc_lwb+0x1df/0x3e0 [zfs]
[  140.027742]  ? zil_alloc_lwb+0x1df/0x3e0 [zfs]
[  140.027792]  ? spl_kmem_cache_alloc+0xc2/0x840 [spl]
[  140.027798]  mutex_lock_nested+0x1b/0x20
[  140.027840]  ? zio_buf_alloc+0x30/0x60 [zfs]
[  140.027882]  ? mutex_lock_nested+0x1b/0x20
[  140.027936]  zil_alloc_lwb+0x1df/0x3e0 [zfs]
[  140.028021]  zil_lwb_write_issue+0x265/0x3f0 [zfs]
[  140.028113]  zil_commit_impl+0x577/0x1850 [zfs]
[  140.028199]  zil_commit+0x43/0x60 [zfs]
[  140.028281]  zpl_writepages+0xf8/0x1a0 [zfs]
[  140.028325]  do_writepages+0x43/0xf0
[  140.028327]  ? _raw_spin_unlock+0x23/0x30
[  140.028330]  __filemap_fdatawrite_range+0xd5/0x110
[  140.028333]  filemap_write_and_wait_range+0x4b/0xb0
[  140.028376]  zpl_fsync+0x4d/0xb0 [zfs]
[  140.028418]  vfs_fsync_range+0x49/0x80
[  140.028421]  do_fsync+0x3d/0x70
[  140.028424]  __x64_sys_fsync+0x14/0x20
[  140.028428]  do_syscall_64+0x38/0x90
[  140.028430]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  140.028434] RIP: 0033:0x7f48b7837214
[  140.028436] Code: 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 f1 96 2d 00 8b 00 85 c0 75 13 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3c f3 c3 66 90 53 89 fb 48 83 ec 10 e8 b4 94
[  140.028438] RSP: 002b:00007ffc919c1ef8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[  140.028440] RAX: ffffffffffffffda RBX: 000056389b01cb30 RCX: 00007f48b7837214
[  140.028441] RDX: 00007f48b2f36000 RSI: 000056389b01cd20 RDI: 0000000000000012
[  140.028442] RBP: 00007ffc919c2210 R08: 0005ad650b223649 R09: 000000005f3feab5
[  140.028444] R10: 00007ffc919f6000 R11: 0000000000000246 R12: 00000000ffffffff
[  140.028445] R13: 0000000000000000 R14: 000056389b01cb30 R15: 00007ffc919c21d0

snajpa commented Aug 21, 2020

Proposed fix, testing:

diff --git a/module/zfs/zil.c b/module/zfs/zil.c
index 9dc20ba14..cf4d6d9dc 100644
--- a/module/zfs/zil.c
+++ b/module/zfs/zil.c
@@ -2513,8 +2513,8 @@ zil_commit_waiter_timeout(zilog_t *zilog, zil_commit_waiter_t *zcw)
         * elsewhere.
         */
        mutex_exit(&zcw->zcw_lock);
-       mutex_enter(&zilog->zl_issuer_lock);
        mutex_enter(&zcw->zcw_lock);
+       mutex_enter(&zilog->zl_issuer_lock);
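
For context, the cycle in the lockdep report above is a plain ABBA inversion between zcw->zcw_lock and zilog->zl_lock: one path takes the two locks in one order, another path takes them in the opposite order. The sketch below is illustrative only (plain pthreads with made-up path names, not OpenZFS or SPL code), but it shows the shape of the problem and why any fix has to make every path agree on a single acquisition order:

/* Illustrative only: an ABBA inversion of the kind lockdep reports above,
 * using plain pthreads rather than the ZFS SPL mutex API. path_a mirrors
 * a path that takes zcw_lock before zl_lock; path_b takes them in the
 * opposite order. If both threads each grab their first lock at the same
 * time, neither can make progress: the "Possible unsafe locking scenario". */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t zcw_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for zcw->zcw_lock */
static pthread_mutex_t zl_lock  = PTHREAD_MUTEX_INITIALIZER; /* stand-in for zilog->zl_lock */

static void *path_a(void *arg)          /* order: zcw_lock -> zl_lock */
{
	for (int i = 0; i < 1000000; i++) {
		pthread_mutex_lock(&zcw_lock);
		pthread_mutex_lock(&zl_lock);
		pthread_mutex_unlock(&zl_lock);
		pthread_mutex_unlock(&zcw_lock);
	}
	return NULL;
}

static void *path_b(void *arg)          /* inverted order: zl_lock -> zcw_lock */
{
	for (int i = 0; i < 1000000; i++) {
		pthread_mutex_lock(&zl_lock);
		pthread_mutex_lock(&zcw_lock);
		pthread_mutex_unlock(&zcw_lock);
		pthread_mutex_unlock(&zl_lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;
	pthread_create(&a, NULL, path_a, NULL);
	pthread_create(&b, NULL, path_b, NULL);
	pthread_join(a, NULL);          /* likely never returns: ABBA deadlock */
	pthread_join(b, NULL);
	puts("no deadlock hit this run");
	return 0;
}

Built with gcc -pthread, this usually hangs within seconds; changing path_b to take the locks in the same order as path_a lets it finish, which is the property a consistent lock ordering in zil.c is meant to restore.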

snajpa commented Aug 21, 2020

Oh, it's not going to be that simple :-D

snajpa commented Aug 21, 2020

After noinlining all functions in zil.c:

[   59.868232] ======================================================
[   59.868233] WARNING: possible circular locking dependency detected
[   59.868236] 5.9.0-rc1+ #51 Tainted: G           O     
[   59.868237] ------------------------------------------------------
[   59.868239] systemd-journal/3685 is trying to acquire lock:
[   59.868241] ffff989c8b1bc070 (&zilog->zl_lock){+.+.}-{3:3}, at: zil_alloc_lwb+0x1f8/0x4a0 [zfs]
[   59.868405] 
               but task is already holding lock:
[   59.868406] ffff989c6da5b1f8 (&zcw->zcw_lock){+.+.}-{3:3}, at: zil_commit_waiter_timeout+0x1ca/0x670 [zfs]
[   59.868542] 
               which lock already depends on the new lock.

[   59.868543] 
               the existing dependency chain (in reverse order) is:
[   59.868544] 
               -> #1 (&zcw->zcw_lock){+.+.}-{3:3}:
[   59.868552]        __mutex_lock+0xac/0x9e0
[   59.868555]        mutex_lock_nested+0x1b/0x20
[   59.868622]        zil_commit_waiter_skip+0x58/0x210 [zfs]
[   59.868765]        zil_prune_commit_list+0x1c9/0x250 [zfs]
[   59.868895]        zil_commit_writer.isra.0+0x118/0x280 [zfs]
[   59.869023]        zil_commit_impl+0x4a/0x150 [zfs]
[   59.869151]        zil_commit+0x5f/0x170 [zfs]
[   59.869278]        zfs_fsync+0x7c/0x110 [zfs]
[   59.869405]        zpl_fsync+0x93/0x120 [zfs]
[   59.869475]        vfs_fsync_range+0x49/0x80
[   59.869478]        do_fsync+0x3d/0x70
[   59.869482]        __x64_sys_fsync+0x14/0x20
[   59.869488]        do_syscall_64+0x38/0x90
[   59.869493]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   59.869495] 
               -> #0 (&zilog->zl_lock){+.+.}-{3:3}:
[   59.869503]        __lock_acquire+0x1298/0x2460
[   59.869507]        lock_acquire+0xab/0x380
[   59.869510]        __mutex_lock+0xac/0x9e0
[   59.869512]        mutex_lock_nested+0x1b/0x20
[   59.869577]        zil_alloc_lwb+0x1f8/0x4a0 [zfs]
[   59.869715]        zil_lwb_write_issue+0x2e8/0x580 [zfs]
[   59.869843]        zil_commit_waiter_timeout+0x236/0x670 [zfs]
[   59.869972]        zil_commit_waiter+0x2e9/0x4d0 [zfs]
[   59.870100]        zil_commit_impl+0x55/0x150 [zfs]
[   59.870227]        zil_commit+0x5f/0x170 [zfs]
[   59.870354]        zpl_writepages+0xf8/0x1a0 [zfs]
[   59.870423]        do_writepages+0x43/0xf0
[   59.870427]        __filemap_fdatawrite_range+0xd5/0x110
[   59.870430]        filemap_write_and_wait_range+0x4b/0xb0
[   59.870495]        zpl_fsync+0x4d/0x120 [zfs]
[   59.870561]        vfs_fsync_range+0x49/0x80
[   59.870565]        do_fsync+0x3d/0x70
[   59.870568]        __x64_sys_fsync+0x14/0x20
[   59.870571]        do_syscall_64+0x38/0x90
[   59.870575]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   59.870577] 
               other info that might help us debug this:

[   59.870578]  Possible unsafe locking scenario:

[   59.870579]        CPU0                    CPU1
[   59.870580]        ----                    ----
[   59.870580]   lock(&zcw->zcw_lock);
[   59.870583]                                lock(&zilog->zl_lock);
[   59.870585]                                lock(&zcw->zcw_lock);
[   59.870587]   lock(&zilog->zl_lock);
[   59.870589] 
                *** DEADLOCK ***

[   59.870591] 2 locks held by systemd-journal/3685:
[   59.870592]  #0: ffff989c8b1bc2b8 (&zilog->zl_issuer_lock){+.+.}-{3:3}, at: zil_commit_waiter_timeout+0x15c/0x670 [zfs]
[   59.870750]  #1: ffff989c6da5b1f8 (&zcw->zcw_lock){+.+.}-{3:3}, at: zil_commit_waiter_timeout+0x1ca/0x670 [zfs]
[   59.870881] 
               stack backtrace:
[   59.870884] CPU: 11 PID: 22 Comm: systemd-journal Tainted: G           O      5.9.0-rc1+ #51
[   59.870887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   59.870889] Call Trace:
[   59.870896]  dump_stack+0x9f/0xda
[   59.870900]  print_circular_bug.isra.0.cold+0x13d/0x142
[   59.870904]  check_noncircular+0x169/0x180
[   59.870911]  __lock_acquire+0x1298/0x2460
[   59.870917]  lock_acquire+0xab/0x380
[   59.870983]  ? zil_alloc_lwb+0x1f8/0x4a0 [zfs]
[   59.871052]  __mutex_lock+0xac/0x9e0
[   59.871117]  ? zil_alloc_lwb+0x1f8/0x4a0 [zfs]
[   59.871204]  ? spl_kmem_cache_alloc+0xda/0xd50 [spl]
[   59.871286]  ? zil_alloc_lwb+0x1f8/0x4a0 [zfs]
[   59.871355]  mutex_lock_nested+0x1b/0x20
[   59.871357]  ? mutex_lock_nested+0x1b/0x20
[   59.871423]  zil_alloc_lwb+0x1f8/0x4a0 [zfs]
[   59.871553]  zil_lwb_write_issue+0x2e8/0x580 [zfs]
[   59.871715]  zil_commit_waiter_timeout+0x236/0x670 [zfs]
[   59.871844]  zil_commit_waiter+0x2e9/0x4d0 [zfs]
[   59.871984]  zil_commit_impl+0x55/0x150 [zfs]
[   59.872109]  zil_commit+0x5f/0x170 [zfs]
[   59.872257]  zpl_writepages+0xf8/0x1a0 [zfs]
[   59.872323]  do_writepages+0x43/0xf0
[   59.872327]  ? _raw_spin_unlock+0x23/0x30
[   59.872331]  __filemap_fdatawrite_range+0xd5/0x110
[   59.872336]  filemap_write_and_wait_range+0x4b/0xb0
[   59.872415]  zpl_fsync+0x4d/0x120 [zfs]
[   59.872482]  vfs_fsync_range+0x49/0x80
[   59.872487]  do_fsync+0x3d/0x70
[   59.872491]  __x64_sys_fsync+0x14/0x20
[   59.872495]  do_syscall_64+0x38/0x90
[   59.872499]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   59.872503] RIP: 0033:0x7fe5bf0a6214
[   59.872507] Code: 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 f1 96 2d 00 8b 00 85 c0 75 13 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3c f3 c3 66 90 53 89 fb 48 83 ec 10 e8 b4 94
[   59.872509] RSP: 002b:00007ffde604ec88 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[   59.872513] RAX: ffffffffffffffda RBX: 00005585ac058640 RCX: 00007fe5bf0a6214
[   59.872515] RDX: 00007fe5ba7a5000 RSI: 0000000000000003 RDI: 0000000000000015
[   59.872517] RBP: 00007fe5becb4ea0 R08: 0000000000000005 R09: 0000000000000002
[   59.872518] R10: 00005585ac048010 R11: 0000000000000246 R12: 0000000000000002
[   59.872520] R13: 00000000fffffff0 R14: 00005585ac058640 R15: 00007ffde604f0d0

snajpa added a commit to vpsfreecz/zfs that referenced this issue Sep 3, 2020
There are three locks at play when handling ZIL records. The first is
zl_issuer_lock, which is supposed to be held whenever we're issuing a
new record to the ZIL. Then there is zl_lock, which actually protects
the data in the zilog struct. And finally there is zcw_lock, which is
used to protect the lists of ZIL call-back waiters.

The locking order is supposed to be:

zl_issuer_lock -> zcw_lock -> zl_lock, as implied by their usage in
the zil_alloc_lwb and zil_commit_waiter_timeout functions.

The function zil_commit_waiter_link_lwb goes against this: it expects
to be entered with zl_lock already held and only then takes zcw_lock.

This patch straightens the locking in zil_commit_waiter_link_lwb to
take zl_lock on its own, correcting the locking order.

Following is the message from the Linux lockdep mechanism describing
the potential deadlock - in our production, it was a matter of a few
hours to hit this one...

(redacted to fit 72 cols)

systemd-journal/5561 is trying to acquire lock:
(&zilog->zl_lock){+.+.}-{4:4}, at: zil_alloc_lwb+0x1df/0x3e0

but task is already holding lock:
(&zcw->zcw_lock){+.+.}-{4:4}, at: zil_commit_impl+0x52b/0x1850

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&zcw->zcw_lock){+.+.}-{4:4}:
       __mutex_lock+0xac/0x9e0
       mutex_lock_nested+0x1b/0x20
       zil_commit_waiter_link_lwb+0x51/0x1a0 [zfs]
       zil_commit_impl+0x12b0/0x1850 [zfs]
       zil_commit+0x43/0x60 [zfs]
       zpl_writepages+0xf8/0x1a0 [zfs]
       do_writepages+0x43/0xf0
       __filemap_fdatawrite_range+0xd5/0x110
       filemap_write_and_wait_range+0x4b/0xb0
       zpl_fsync+0x4d/0xb0 [zfs]
       vfs_fsync_range+0x49/0x80
       do_fsync+0x3d/0x70
       __x64_sys_fsync+0x14/0x20
       do_syscall_64+0x38/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (&zilog->zl_lock){+.+.}-{4:4}:
       __lock_acquire+0x12b6/0x2480
       lock_acquire+0xab/0x380
       __mutex_lock+0xac/0x9e0
       mutex_lock_nested+0x1b/0x20
       zil_alloc_lwb+0x1df/0x3e0 [zfs]
       zil_lwb_write_issue+0x265/0x3f0 [zfs]
       zil_commit_impl+0x577/0x1850 [zfs]
       zil_commit+0x43/0x60 [zfs]
       zpl_writepages+0xf8/0x1a0 [zfs]
       do_writepages+0x43/0xf0
       __filemap_fdatawrite_range+0xd5/0x110
       filemap_write_and_wait_range+0x4b/0xb0
       zpl_fsync+0x4d/0xb0 [zfs]
       vfs_fsync_range+0x49/0x80
       do_fsync+0x3d/0x70
       __x64_sys_fsync+0x14/0x20
       do_syscall_64+0x38/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&zcw->zcw_lock);
                               lock(&zilog->zl_lock);
                               lock(&zcw->zcw_lock);
  lock(&zilog->zl_lock);

 *** DEADLOCK ***

2 locks held by systemd-journal/5561:
 #0: (&zilog->zl_issuer_lock){+.+.}-{4:4}, at: zil_commit_impl+0x4dc...
 #1: (&zcw->zcw_lock){+.+.}-{4:4}, at: zil_commit_impl+0x52b/0x1850

Fixes openzfs#7038
Fixes openzfs#10440

Signed-off-by: Pavel Snajdr <[email protected]>
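
As an aside, the hierarchy documented in the commit message above (zl_issuer_lock -> zcw_lock -> zl_lock) is exactly the kind of invariant lockdep verifies in the kernel. A rough user-space analogue, with names and ranks made up purely for illustration, might look like this:

/* Sketch only, not OpenZFS code: a documented lock hierarchy expressed as
 * ranks, so that acquiring locks out of order trips an assertion in user
 * space, much like lockdep flags the inversion inside the kernel. */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

enum lock_rank { RANK_NONE = 0, RANK_ISSUER = 1, RANK_ZCW = 2, RANK_ZL = 3 };

struct ranked_lock {
	pthread_mutex_t mtx;
	enum lock_rank  rank;
};

/* Highest rank currently held by this thread; locks must be taken in
 * strictly increasing rank order. */
static __thread enum lock_rank held_rank = RANK_NONE;

static void ranked_lock(struct ranked_lock *l)
{
	assert(l->rank > held_rank);    /* an ordering violation trips here */
	pthread_mutex_lock(&l->mtx);
	held_rank = l->rank;
}

static void ranked_unlock(struct ranked_lock *l, enum lock_rank outer)
{
	pthread_mutex_unlock(&l->mtx);
	held_rank = outer;              /* rank of the lock still held, if any */
}

int main(void)
{
	struct ranked_lock issuer = { .rank = RANK_ISSUER };
	struct ranked_lock zcw    = { .rank = RANK_ZCW };
	struct ranked_lock zl     = { .rank = RANK_ZL };

	pthread_mutex_init(&issuer.mtx, NULL);
	pthread_mutex_init(&zcw.mtx, NULL);
	pthread_mutex_init(&zl.mtx, NULL);

	/* Compliant order, matching the hierarchy documented above. */
	ranked_lock(&issuer);
	ranked_lock(&zcw);
	ranked_lock(&zl);
	ranked_unlock(&zl, RANK_ZCW);
	ranked_unlock(&zcw, RANK_ISSUER);
	ranked_unlock(&issuer, RANK_NONE);

	puts("lock hierarchy respected");
	return 0;
}
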
snajpa mentioned this issue Sep 3, 2020
snajpa added a commit to vpsfreecz/zfs that referenced this issue Sep 3, 2020
snajpa added a commit to vpsfreecz/zfs that referenced this issue Sep 3, 2020

snajpa commented Jan 26, 2021

It took a while, but the problem has manifested again, running with a ~2-month-old git HEAD:

[[email protected]:~]# for pid in `ct ps -o pid 18205`; do pid=$(ps aux | grep journald | grep $pid | awk '{ print $2; }'); if [ "$pid" != "" ]; then ps aux | grep $pid; cat /proc/$pid/stack; fi; done
5071175+   12166  0.0  0.0  27432  4504 ?        Ds   11:51   0:00 /lib/systemd/systemd-journald
90308608  121662  0.0  0.0 217132   616 pts/3    Ss+   2020   0:00 agetty --login-program /nix/store/1l5yw8hyfpvcn2a5lzds5nvgaz519n67-shadow-4.8/bin/login --noclear --keep-baud pts/3 115200,38400,9600 vt220
root     3166910  0.0  0.0   6348  2268 pts/194  S+   13:07   0:00 grep 12166
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
5071175+  784154  0.0  0.0  27432  4456 ?        Ds   10:27   0:00 /lib/systemd/systemd-journald
root     3169606  0.0  0.0   6348  2272 pts/194  S+   13:08   0:00 grep 784154
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
5071175+  857385  0.0  0.0  27432  4692 ?        Ds   12:12   0:00 /lib/systemd/systemd-journald
root     3171751  0.0  0.0   6348  2236 pts/194  S+   13:08   0:00 grep 857385
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
5071175+ 1651466  0.0  0.0  27432  4460 ?        Ds   10:48   0:00 /lib/systemd/systemd-journald
root     3174521  0.0  0.0   6348  2176 pts/194  S+   13:08   0:00 grep 1651466
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
5071175+ 1703898  0.0  0.0  27432  4680 ?        Ds   12:34   0:00 /lib/systemd/systemd-journald
root     3174676  0.0  0.0   6348  2180 pts/194  S+   13:08   0:00 grep 1703898
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
5071175+ 2510625  0.0  0.0  27432  4432 ?        Ds   11:09   0:00 /lib/systemd/systemd-journald
root     3187083  0.0  0.0   6348  2272 pts/194  S+   13:08   0:00 grep 2510625
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
5071175+ 2545314  0.0  0.0  27432  4340 ?        Ds   09:24   0:00 /lib/systemd/systemd-journald
root     3188253  0.0  0.0   6348  2264 pts/194  S+   13:08   0:00 grep 2545314
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zpl_writepages+0xdd/0x180 [zfs]
[<0>] do_writepages+0x43/0xd0
[<0>] __filemap_fdatawrite_range+0xce/0x110
[<0>] filemap_write_and_wait_range+0x26/0x70
[<0>] zpl_fsync+0x37/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
5071175+ 2641300  0.0  0.0  27432  4612 ?        Ds   12:55   0:00 /lib/systemd/systemd-journald
root     3189963  0.0  0.0   6348  2276 pts/194  S+   13:08   0:00 grep 2641300
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
root     3191574  0.0  0.0   6348  2264 pts/194  S+   13:08   0:00 grep 3293685
5071175+ 3293685  0.0  0.0  27432  4440 ?        Ds   11:30   0:00 /lib/systemd/systemd-journald
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
root     3191923  0.0  0.0   6348  2272 pts/194  S+   13:08   0:00 grep 3337412
5071175+ 3337412  0.0  0.0  27432  4432 ?        Ds   09:45   0:00 /lib/systemd/systemd-journald
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
root     3195197  0.0  0.0   6348  2236 pts/194  S+   13:08   0:00 grep 4063111
5071175+ 4063111  0.0  0.0 213280 10808 ?        Dsl   2020  21:40 /lib/systemd/systemd-journald
[<0>] do_coredump+0x33f/0x1070
[<0>] get_signal+0x166/0x8f0
[<0>] arch_do_signal+0x30/0x700
[<0>] exit_to_user_mode_prepare+0xf9/0x160
[<0>] syscall_exit_to_user_mode+0x2c/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
root     3199445  0.0  0.0   6348  2268 pts/194  S+   13:08   0:00 grep 4188468
5071175+ 4188468  0.0  0.0  27432  4440 ?        Ds   10:06   0:00 /lib/systemd/systemd-journald
[<0>] zil_commit_impl+0x147/0xe00 [zfs]
[<0>] zfs_fsync+0x72/0x100 [zfs]
[<0>] zpl_fsync+0x67/0xa0 [zfs]
[<0>] do_fsync+0x38/0x70
[<0>] __x64_sys_fsync+0x10/0x20
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

snajpa commented Jan 26, 2021

Well, our issue seems to be only relevant when core-dumping - #10875 (comment)

AndyLavr commented Nov 21, 2021

Well, our issue seems to be only relevant when core-dumping - #10875 (comment)

The problem is real and it does exist - linux-next 20211118 + current ZFS:

You can catch this deadlock when debugging; use this config:

# The config is based on running daily CI for enterprise Linux distros to
# seek regressions on linux-next builds on different bare-metal and virtual
# platforms. It can be used for example,
#
# $ make ARCH=amd64 defconfig debug.config
#
# Keep alphabetically sorted inside each section.
#
# printk and dmesg options
#
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DYNAMIC_DEBUG=y
CONFIG_PRINTK_CALLER=y
CONFIG_PRINTK_TIME=y
CONFIG_SYMBOLIC_ERRNAME=y
#
# Compile-time checks and compiler options
#
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_FRAME_WARN=2048
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
#
# Generic Kernel Debugging Instruments
#
# CONFIG_UBSAN_ALIGNMENT is not set
# CONFIG_UBSAN_DIV_ZERO is not set
# CONFIG_UBSAN_TRAP is not set
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_FS_ALLOW_ALL=y
CONFIG_DEBUG_IRQFLAGS=y
CONFIG_UBSAN=y
CONFIG_UBSAN_BOOL=y
CONFIG_UBSAN_BOUNDS=y
CONFIG_UBSAN_ENUM=y
CONFIG_UBSAN_SHIFT=y
CONFIG_UBSAN_UNREACHABLE=y
#
# Memory Debugging
#
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
# CONFIG_DEBUG_RODATA_TEST is not set
# CONFIG_DEBUG_WX is not set
# CONFIG_KFENCE is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_SLUB_STATS is not set
CONFIG_PAGE_EXTENSION=y
CONFIG_PAGE_OWNER=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
CONFIG_DEBUG_OBJECTS_WORK=y
CONFIG_DEBUG_PER_CPU_MAPS=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUG_VIRTUAL=y
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_VM_PGFLAGS=y
CONFIG_DEBUG_VM_RB=y
CONFIG_DEBUG_VM_VMACACHE=y
CONFIG_GENERIC_PTDUMP=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_VMALLOC=y
CONFIG_PTDUMP_DEBUGFS=y
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_SLUB_DEBUG_ON=y
#
# Debug Oops, Lockups and Hangs
#
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_TIMEOUT=0
CONFIG_SOFTLOCKUP_DETECTOR=y
#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_PROVE_RAW_LOCK_NESTING is not set
CONFIG_PROVE_LOCKING=y
#
# Debug kernel data structures
#
CONFIG_BUG_ON_DATA_CORRUPTION=y
#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
CONFIG_PROVE_RCU_LIST=y
#
# Tracers
#
CONFIG_BRANCH_PROFILE_NONE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y

[  +0,010730] ======================================================
[  +0,000003] WARNING: possible circular locking dependency detected
[  +0,000003] 5.16.0-generic #20211119 Not tainted
[  +0,000004] ------------------------------------------------------
[  +0,000001] systemd-random-/883 is trying to acquire lock:
[  +0,000005] ffff88812d2ec068 (&zilog->zl_lock){+.+.}-{3:3}, at: zil_alloc_lwb+0x4a3/0xd20 [zfs]
[  +0,000321] 
              but task is already holding lock:
[  +0,000002] ffff888131d5d0a0 (&zcw->zcw_lock){+.+.}-{3:3}, at: zil_commit_waiter_timeout+0x26d/0xbe0 [zfs]
[  +0,000299] 
              which lock already depends on the new lock.

[  +0,000003] 
              the existing dependency chain (in reverse order) is:
[  +0,000002] 
              -> #1 (&zcw->zcw_lock){+.+.}-{3:3}:
[  +0,000012]        __lock_acquire+0xaaf/0x1750
[  +0,000010]        lock_acquire.part.0+0x105/0x350
[  +0,000006]        lock_acquire+0xf1/0x2b0
[  +0,000005]        __mutex_lock+0x151/0x13f0
[  +0,000006]        mutex_lock_nested+0x1b/0x20
[  +0,000006]        zil_commit_waiter_skip+0x6f/0x440 [zfs]
[  +0,000289]        zil_commit_impl+0xf80/0x20e0 [zfs]
[  +0,000279]        zil_commit+0xbc/0x120 [zfs]
[  +0,000284]        zfs_fsync+0x1db/0x3c0 [zfs]
[  +0,000286]        zpl_fsync+0x17e/0x290 [zfs]
[  +0,000282]        vfs_fsync_range+0xf6/0x220
[  +0,000007]        __x64_sys_fsync+0x56/0x90
[  +0,000005]        do_syscall_64+0x5c/0xc0
[  +0,000008]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0,000006] 
              -> #0 (&zilog->zl_lock){+.+.}-{3:3}:
[  +0,000012]        check_prev_add+0x162/0x20b0
[  +0,000007]        validate_chain+0xa9c/0xfc0
[  +0,000005]        __lock_acquire+0xaaf/0x1750
[  +0,000006]        lock_acquire.part.0+0x105/0x350
[  +0,000005]        lock_acquire+0xf1/0x2b0
[  +0,000005]        __mutex_lock+0x151/0x13f0
[  +0,000005]        mutex_lock_nested+0x1b/0x20
[  +0,000005]        zil_alloc_lwb+0x4a3/0xd20 [zfs]
[  +0,000282]        zil_lwb_write_issue+0x830/0x12c0 [zfs]
[  +0,000286]        zil_commit_waiter_timeout+0x403/0xbe0 [zfs]
[  +0,000289]        zil_commit_impl+0x8c3/0x20e0 [zfs]
[  +0,000287]        zil_commit+0xbc/0x120 [zfs]
[  +0,000290]        zfs_fsync+0x1db/0x3c0 [zfs]
[  +0,000284]        zpl_fsync+0x17e/0x290 [zfs]
[  +0,000283]        vfs_fsync_range+0xf6/0x220
[  +0,000007]        __x64_sys_fsync+0x56/0x90
[  +0,000005]        do_syscall_64+0x5c/0xc0
[  +0,000005]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0,000006] 
              other info that might help us debug this:

[  +0,000003]  Possible unsafe locking scenario:

[  +0,000002]        CPU0                    CPU1
[  +0,000001]        ----                    ----
[  +0,000002]   lock(&zcw->zcw_lock);
[  +0,000007]                                lock(&zilog->zl_lock);
[  +0,000005]                                lock(&zcw->zcw_lock);
[  +0,000006]   lock(&zilog->zl_lock);
[  +0,000005] 
               *** DEADLOCK ***

[  +0,000002] 2 locks held by systemd-random-/883:
[  +0,000004]  #0: ffff88812d2ec288 (&zilog->zl_issuer_lock){+.+.}-{3:3}, at: zil_commit_waiter_timeout+0x1b5/0xbe0 [zfs]
[  +0,000300]  #1: ffff888131d5d0a0 (&zcw->zcw_lock){+.+.}-{3:3}, at: zil_commit_waiter_timeout+0x26d/0xbe0 [zfs]
[  +0,000298] 
              stack backtrace:
[  +0,000003] CPU: 7 PID: 883 Comm: systemd-random- Not tainted 5.16.0-generic #20211119
[  +0,000007] Hardware name: Dell Inc. Precision M6600/04YY4M, BIOS A18 09/14/2018
[  +0,000004] Call Trace:
[  +0,000003]  <TASK>
[  +0,000004]  show_stack+0x52/0x58
[  +0,000008]  dump_stack_lvl+0x74/0xa6
[  +0,000007]  dump_stack+0x10/0x12
[  +0,000005]  print_circular_bug.cold+0x15d/0x16c
[  +0,000012]  check_noncircular+0x268/0x310
[  +0,000008]  ? print_circular_bug+0x1d0/0x1d0
[  +0,000007]  ? lock_acquire.part.0+0x105/0x350
[  +0,000006]  ? _raw_spin_lock_irqsave+0xe/0x10
[  +0,000009]  ? check_prev_add+0x20b0/0x20b0
[  +0,000013]  check_prev_add+0x162/0x20b0
[  +0,000005]  ? add_chain_cache+0x26a/0x860
[  +0,000006]  ? call_rcu_zapped+0xb0/0xb0
[  +0,000012]  validate_chain+0xa9c/0xfc0
[  +0,000011]  ? check_prev_add+0x20b0/0x20b0
[  +0,000013]  __lock_acquire+0xaaf/0x1750
[  +0,000015]  lock_acquire.part.0+0x105/0x350
[  +0,000007]  ? zil_alloc_lwb+0x4a3/0xd20 [zfs]
[  +0,000288]  ? lock_acquire.part.0+0x105/0x350
[  +0,000007]  ? _raw_spin_lock_irqsave+0xe/0x10
[  +0,000006]  ? rcu_read_unlock+0x50/0x50
[  +0,000010]  ? __kasan_check_read+0x11/0x20
[  +0,000013]  lock_acquire+0xf1/0x2b0
[  +0,000006]  ? zil_alloc_lwb+0x4a3/0xd20 [zfs]
[  +0,000291]  __mutex_lock+0x151/0x13f0
[  +0,000006]  ? zil_alloc_lwb+0x4a3/0xd20 [zfs]
[  +0,000290]  ? zil_alloc_lwb+0x4a3/0xd20 [zfs]
[  +0,000289]  ? _raw_spin_unlock_irqrestore+0xe/0x10
[  +0,000007]  ? create_object.isra.0+0x5a7/0xae0
[  +0,000009]  ? mutex_lock_io_nested+0x1240/0x1240
[  +0,000009]  ? rcu_read_lock_sched_held+0x46/0x80
[  +0,000008]  ? kmem_cache_alloc+0x326/0x3f0
[  +0,000012]  ? spl_kmem_cache_alloc+0x2a1/0x3d0 [spl]
[  +0,000030]  mutex_lock_nested+0x1b/0x20
[  +0,000005]  ? zio_buf_alloc+0x4f/0xa0 [zfs]
[  +0,000286]  ? mutex_lock_nested+0x1b/0x20
[  +0,000007]  zil_alloc_lwb+0x4a3/0xd20 [zfs]
[  +0,000289]  ? dsl_dataset_dirty+0x122/0x390 [zfs]
[  +0,232318]  zil_lwb_write_issue+0x830/0x12c0 [zfs]
[  +0,001872] systemd[1]: Started Journal Service.
[  -0,000061]  ? zil_lwb_add_block+0x8e0/0x8e0 [zfs]
[  +0,003970]  zil_commit_waiter_timeout+0x403/0xbe0 [zfs]
[  +0,001912]  zil_commit_impl+0x8c3/0x20e0 [zfs]
[  +0,000175]  ? __mutex_unlock_slowpath+0x160/0x3c0
[  +0,000012]  ? zil_itx_assign+0x1180/0x1180 [zfs]
[  +0,000165]  ? _raw_spin_unlock+0x23/0x40
[  +0,000005]  ? rrw_enter_read_impl+0x3e9/0x830 [zfs]
[  +0,000160]  zil_commit+0xbc/0x120 [zfs]
[  +0,000163]  zfs_fsync+0x1db/0x3c0 [zfs]
[  +0,000171]  zpl_fsync+0x17e/0x290 [zfs]
[  +0,000163]  ? zpl_iter_write+0x5e0/0x5e0 [zfs]
[  +0,000152]  vfs_fsync_range+0xf6/0x220
[  +0,000005]  ? __fget_light+0x57/0x240
[  +0,000003]  ? syscall_enter_from_user_mode+0x21/0x60
[  +0,000006]  __x64_sys_fsync+0x56/0x90
[  +0,000004]  do_syscall_64+0x5c/0xc0
[  +0,000005]  ? lockdep_hardirqs_on_prepare.part.0+0x1be/0x370
[  +0,000006]  ? irqentry_exit_to_user_mode+0x29/0x30
[  +0,000003]  ? irqentry_exit+0x63/0x90
[  +0,000003]  ? exc_page_fault+0x86/0xf0
[  +0,000003]  ? asm_exc_page_fault+0x8/0x30
[  +0,000005]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0,000004] RIP: 0033:0x7f86c54468e7
[  +0,000005] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 5f f7 ff
[  +0,000003] RSP: 002b:00007ffecef47df8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[  +0,000005] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f86c54468e7
[  +0,000003] RDX: 0000000000000200 RSI: 0000000000000200 RDI: 0000000000000004
[  +0,000002] RBP: 0000000000000004 R08: 000055be174c1a00 R09: 0000000000000000
[  +0,000002] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000005
[  +0,000002] R13: 0000000000000000 R14: 000055be174c1cf0 R15: 0000000000000200
[  +0,000007]  </TASK>

I will report the result after testing this fix:
Fix zfs_fsync deadlock

@AndyLavr

The situation changed after the patch: the TRAP disappeared, but the deadlock warning remained.

[   46.278264] ======================================================
[   46.278268] WARNING: possible circular locking dependency detected
[   46.278270] 5.16.0-generic #20211121 Not tainted
[   46.278272] ------------------------------------------------------
[   46.278273] z_ioctl_int/427 is trying to acquire lock:
[   46.278276] ffff88814a9ecd88 (&zcw->zcw_lock){+.+.}-{3:3}, at: zil_lwb_flush_vdevs_done+0x67e/0x1270 [zfs]
[   46.302996] 
               but task is already holding lock:
[   46.303001] ffff88812d6f1068 (&zilog->zl_lock){+.+.}-{3:3}, at: zil_lwb_flush_vdevs_done+0x204/0x1270 [zfs]
[   46.315210] 
               which lock already depends on the new lock.

[   46.315222] 
               the existing dependency chain (in reverse order) is:
[   46.315223] 
               -> #1 (&zilog->zl_lock){+.+.}-{3:3}:
[   46.315232]        __lock_acquire+0xaaf/0x1750
[   46.315241]        lock_acquire.part.0+0x105/0x350
[   46.315244]        lock_acquire+0xf1/0x2b0
[   46.315248]        __mutex_lock+0x151/0x13f0
[   46.346712]        mutex_lock_nested+0x1b/0x20
[   46.346729]        zil_commit_waiter_link_lwb+0x137/0x930 [zfs]
[   46.352404]        zil_lwb_commit+0x8a7/0xc30 [zfs]
[   46.352595]        zil_process_commit_list+0x3df/0x1700 [zfs]
[   46.352758]        zil_commit_impl+0x1755/0x22b0 [zfs]
[   46.352919]        zil_commit+0xbc/0x120 [zfs]
[   46.353081]        zfs_fsync+0x1db/0x3c0 [zfs]
[   46.353242]        zpl_fsync+0x17e/0x290 [zfs]
[   46.353411]        vfs_fsync_range+0xf6/0x220
[   46.353416]        __x64_sys_fsync+0x56/0x90
[   46.353419]        do_syscall_64+0x5c/0xc0
[   46.353424]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[   46.353429] 
               -> #0 (&zcw->zcw_lock){+.+.}-{3:3}:
[   46.353438]        check_prev_add+0x162/0x20b0
[   46.353442]        validate_chain+0xa9c/0xfc0
[   46.353446]        __lock_acquire+0xaaf/0x1750
[   46.353449]        lock_acquire.part.0+0x105/0x350
[   46.353452]        lock_acquire+0xf1/0x2b0
[   46.353455]        __mutex_lock+0x151/0x13f0
[   46.353458]        mutex_lock_nested+0x1b/0x20
[   46.353461]        zil_lwb_flush_vdevs_done+0x67e/0x1270 [zfs]
[   46.353623]        zio_done+0xcbb/0x5600 [zfs]
[   46.353785]        zio_execute+0x1c5/0x450 [zfs]
[   46.353947]        taskq_thread+0x9de/0x15d0 [spl]
[   46.353963]        kthread+0x35d/0x420
[   46.353966]        ret_from_fork+0x22/0x30
[   46.353971] 
               other info that might help us debug this:

[   46.353973]  Possible unsafe locking scenario:

[   46.353974]        CPU0                    CPU1
[   46.353975]        ----                    ----
[   46.353976]   lock(&zilog->zl_lock);
[   46.353980]                                lock(&zcw->zcw_lock);
[   46.353983]                                lock(&zilog->zl_lock);
[   46.353986]   lock(&zcw->zcw_lock);
[   46.353989] 
                *** DEADLOCK ***

[   46.353990] 1 lock held by z_ioctl_int/427:
[   46.353993]  #0: ffff88812d6f1068 (&zilog->zl_lock){+.+.}-{3:3}, at: zil_lwb_flush_vdevs_done+0x204/0x1270 [zfs]
[   46.354163] 
               stack backtrace:
[   46.354165] CPU: 1 PID: 427 Comm: z_ioctl_int Not tainted 5.16.0-generic #20211121
[   46.354169] Hardware name: Dell Inc. Precision M6600/04YY4M, BIOS A18 09/14/2018
[   46.354172] Call Trace:
[   46.354174]  <TASK>
[   46.354176]  show_stack+0x52/0x58
[   46.354182]  dump_stack_lvl+0x74/0xa6
[   46.354187]  dump_stack+0x10/0x12
[   46.354190]  print_circular_bug.cold+0x15d/0x16c
[   46.354197]  check_noncircular+0x268/0x310
[   46.354202]  ? print_circular_bug+0x1d0/0x1d0
[   46.354208]  ? ret_from_fork+0x22/0x30
[   46.354215]  check_prev_add+0x162/0x20b0
[   46.354219]  ? add_chain_cache+0x26a/0x860
[   46.354223]  ? call_rcu_zapped+0xb0/0xb0
[   46.354229]  validate_chain+0xa9c/0xfc0
[   46.354235]  ? check_prev_add+0x20b0/0x20b0
[   46.354242]  __lock_acquire+0xaaf/0x1750
[   46.354246]  ? zio_data_buf_free+0x52/0xa0 [zfs]
[   46.354410]  ? zil_lwb_flush_vdevs_done+0x59d/0x1270 [zfs]
[   46.354572]  ? zio_done+0xcbb/0x5600 [zfs]
[   46.354752]  lock_acquire.part.0+0x105/0x350
[   46.354757]  ? zil_lwb_flush_vdevs_done+0x67e/0x1270 [zfs]
[   46.354922]  ? rcu_read_unlock+0x50/0x50
[   46.354926]  ? _raw_spin_unlock_irqrestore+0xe/0x10
[   46.354932]  ? mark_held_locks+0xa4/0xf0
[   46.354936]  ? __kasan_slab_free+0xde/0x110
[   46.354942]  lock_acquire+0xf1/0x2b0
[   46.354946]  ? zil_lwb_flush_vdevs_done+0x67e/0x1270 [zfs]
[   46.355112]  __mutex_lock+0x151/0x13f0
[   46.355116]  ? zil_lwb_flush_vdevs_done+0x67e/0x1270 [zfs]
[   46.355279]  ? __kasan_slab_free+0xde/0x110
[   46.355284]  ? zil_lwb_flush_vdevs_done+0x67e/0x1270 [zfs]
[   46.355447]  ? spl_kmem_cache_free+0x3ad/0x6a0 [spl]
[   46.355463]  ? mutex_lock_io_nested+0x1240/0x1240
[   46.355466]  ? kmem_cache_free+0x1bd/0x3d0
[   46.355472]  ? lockdep_hardirqs_on_prepare.part.0+0x1be/0x370
[   46.355479]  ? spl_kmem_cache_free+0x3f0/0x6a0 [spl]
[   46.355495]  mutex_lock_nested+0x1b/0x20
[   46.355498]  ? mutex_lock_nested+0x1b/0x20
[   46.355502]  zil_lwb_flush_vdevs_done+0x67e/0x1270 [zfs]
[   46.511317]  ? zil_create+0x670/0x670 [zfs]
[   46.513140]  ? zio_wait_for_children+0x1cb/0x4d0 [zfs]
[   46.513348] systemd[1]: Started Journal Service.
[   46.514977]  zio_done+0xcbb/0x5600 [zfs]
[   46.518333]  zio_execute+0x1c5/0x450 [zfs]
[   46.520161]  taskq_thread+0x9de/0x15d0 [spl]
[   46.520182]  ? taskq_thread_spawn+0x170/0x170 [spl]
[   46.520197]  ? wake_up_q+0x100/0x100
[   46.520203]  ? lock_acquire.part.0+0x105/0x350
[   46.520208]  ? _raw_spin_lock_irqsave+0xe/0x10
[   46.520213]  ? find_held_lock+0x35/0x130
[   46.520225]  ? zio_gang_tree_free+0xf0/0xf0 [zfs]
[   46.532048]  ? __kasan_check_read+0x11/0x20
[   46.532055]  ? __kthread_parkme+0xcb/0x1e0
[   46.532067]  ? __kasan_check_read+0x11/0x20
[   46.532070]  ? taskq_thread_spawn+0x170/0x170 [spl]
[   46.532086]  kthread+0x35d/0x420
[   46.540627]  ? set_kthread_struct+0x110/0x110
[   46.540632]  ret_from_fork+0x22/0x30
[   46.540650]  </TASK>

snajpa commented Nov 21, 2021

@AndyLavr I'm not sure the issues are actually related; it's been a while, and all I recall now is a good salad from the mix of info I had at the time :D

Are you experiencing this with any other linux revisions than the pretty recent -next?

Btw, do you think you'd be able to distill a reproducer from the workload you have?

AndyLavr commented Nov 21, 2021

@snajpa

Are you experiencing this with any other linux revisions than the pretty recent -next?

I was debugging another issue and accidentally saw this. I only use -next.

Btw, do you think you'd be able to distill a reproducer from the workload you have?

This is one message during boot. I will send you dmesg by email.

snajpa commented Nov 21, 2021

Ah, OK, I see, that's a lockdep warning about a potential deadlock; if you're not really hitting it, I guess you can ignore that one :)

I haven't seen the actual deadlock occur again, so it must have been resolved for quite some time now. :)

stale bot commented Nov 22, 2022

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

The stale bot added the Status: Stale label Nov 22, 2022
The stale bot closed this as completed Mar 19, 2023