Messy locking in zfs_inode_update #4578

Closed
lorddoskias opened this issue Apr 29, 2016 · 3 comments

lorddoskias (Contributor) commented Apr 29, 2016

I'm creating this issue so we can continue the discussion of the problems observed in zfs_inode_update in #4538. Here is what I've discovered:

  1. In zfs_mkdir, zfs_inode_update(dzp) is called to update the state of the parent in which we are creating the directory, but zfs_inode_update(zp) is also called to update the state of the newly created inode. I think this second call is redundant, since zfs_inode_update_new(zp) is already called in zfs_mknode (see the sketch after this list).
  2. The same pattern can be observed in zfs_create after the out: label. Is there really a point in calling zfs_inode_update for the newly created zp?
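
To make point 1 concrete, here is a rough, from-memory sketch of the tail of zfs_mkdir (not verbatim ZFS source, error handling omitted):

	/*
	 * Sketch only: zp is the znode created by zfs_mknode() a few lines
	 * earlier, dzp is the parent directory's znode.
	 */
	zfs_inode_update(dzp);	/* parent changed: link count, mtime/ctime */
	zfs_inode_update(zp);	/* arguably redundant: zfs_inode_update_new(zp)
				 * already ran inside zfs_mknode() */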

Furthermore, there are cases where zfs_inode_update is rightfully called, yet i_mutex is not held, so we would have to take it ourselves. Example:

[   22.987395] VERIFY(mutex_is_locked(&zp->z_inode.i_mutex)) failed
[   22.987930] PANIC at zfs_znode.c:582:zfs_inode_update()
[   22.988500] Showing stack for process 1122
[   22.988858] CPU: 0 PID: 1122 Comm: stat Tainted: G           OE   4.6.0-rc5-nbor #4
[   22.989509] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   22.990301]  0000000000000000 ffff880038a8b9d8 ffffffff813c31e3 ffffffffa022d472
[   22.990958]  0000000000000246 ffff880038a8b9e8 ffffffffa0006ab2 ffff880038a8bb70
[   22.991634]  ffffffffa0006b7b 0000000000000001 0000000000000028 ffff880038a8bb80
[   22.992289] Call Trace:
[   22.992512]  [<ffffffff813c31e3>] dump_stack+0x85/0xc2
[   22.992941]  [<ffffffffa0006ab2>] spl_dumpstack+0x42/0x50 [spl]
[   22.993425]  [<ffffffffa0006b7b>] spl_panic+0xbb/0xf0 [spl]
[   22.993866]  [<ffffffff810a2cb6>] ? trace_hardirqs_on_caller+0x16/0x1c0
[   22.994382]  [<ffffffff810a2cb6>] ? trace_hardirqs_on_caller+0x16/0x1c0
[   22.994902]  [<ffffffff810a2e6d>] ? trace_hardirqs_on+0xd/0x10
[   22.995363]  [<ffffffffa000117a>] ? spl_kmem_free+0x2a/0x40 [spl]
[   22.995890]  [<ffffffffa0185129>] ? zfs_dirent_unlock+0x219/0x340 [zfs]
[   22.996454]  [<ffffffffa0185c86>] ? zfs_dirlook+0x136/0x3d0 [zfs]
[   22.996959]  [<ffffffffa01ab2cb>] zfs_inode_update+0x4b/0x60 [zfs]
[   22.997469]  [<ffffffffa019dcd0>] zfs_lookup+0x2c0/0x2e0 [zfs]
[   22.997957]  [<ffffffffa01c8b5e>] zpl_lookup+0xae/0x2b0 [zfs]
[   22.998410]  [<ffffffff811c0cdd>] lookup_real+0x1d/0x60
[   22.998820]  [<ffffffff811c171f>] lookup_slow+0x5f/0xf0
[   22.999230]  [<ffffffff811c49ed>] walk_component+0x1bd/0x280
[   22.999677]  [<ffffffff811c2bb5>] ? path_init+0x555/0x750
[   23.000100]  [<ffffffff811c509d>] path_lookupat+0x5d/0x110
[   23.000530]  [<ffffffff811c6b8a>] filename_lookup+0x9a/0x110
[   23.000975]  [<ffffffff8107b9e9>] ? __might_sleep+0x49/0x80
[   23.001460]  [<ffffffff8119a9b1>] ? kmem_cache_alloc+0x1f1/0x2d0
[   23.001942]  [<ffffffff8117a1a8>] ? handle_mm_fault+0xd18/0x1ac0
[   23.002412]  [<ffffffff811c6863>] ? getname_flags+0x53/0x1a0
[   23.002856]  [<ffffffff811c6cb6>] user_path_at_empty+0x36/0x40
[   23.003310]  [<ffffffff811bbaf3>] vfs_fstatat+0x53/0xa0
[   23.003732]  [<ffffffff8109eb1f>] ? up_read+0x1f/0x40
[   23.004127]  [<ffffffff811bc017>] SyS_newlstat+0x27/0x40
[   23.004543]  [<ffffffff810a2cb6>] ? trace_hardirqs_on_caller+0x16/0x1c0
[   23.005059]  [<ffffffff8100201b>] ? trace_hardirqs_on_thunk+0x1b/0x1d
[   23.005562]  [<ffffffff815fed40>] entry_SYSCALL_64_fastpath+0x23/0xc1
[   23.006063]  [<ffffffff8109ff7f>] ? trace_hardirqs_off_caller+0x1f/0xc0

Or

[   54.521005]  [<ffffffff813c31e3>] dump_stack+0x85/0xc2
[   54.521409]  [<ffffffffa0006ab2>] spl_dumpstack+0x42/0x50 [spl]
[   54.521867]  [<ffffffffa0006b7b>] spl_panic+0xbb/0xf0 [spl]
[   54.522300]  [<ffffffff810a2cb6>] ? trace_hardirqs_on_caller+0x16/0x1c0
[   54.522824]  [<ffffffff815fc93b>] ? __mutex_unlock_slowpath+0xeb/0x1a0
[   54.523319]  [<ffffffff810a2cb6>] ? trace_hardirqs_on_caller+0x16/0x1c0
[   54.523845]  [<ffffffff810a2e6d>] ? trace_hardirqs_on+0xd/0x10
[   54.524283]  [<ffffffffa000117a>] ? spl_kmem_free+0x2a/0x40 [spl]
[   54.524778]  [<ffffffffa01ab2fb>] zfs_inode_update+0x4b/0x60 [zfs]
[   54.525269]  [<ffffffffa01a4862>] zfs_write+0x2a2/0xf50 [zfs]
[   54.525715]  [<ffffffff8109ff7f>] ? trace_hardirqs_off_caller+0x1f/0xc0
[   54.526256]  [<ffffffffa01c6cef>] zpl_write_common_iovec+0x7f/0x100 [zfs]
[   54.526793]  [<ffffffffa01c6f4d>] zpl_write+0x7d/0xa0 [zfs]
[   54.527223]  [<ffffffff811b5d28>] __vfs_write+0x28/0xf0
[   54.527617]  [<ffffffff8109ecfe>] ? update_fast_ctr+0x1e/0x40
[   54.528050]  [<ffffffff8109eda7>] ? percpu_down_read+0x57/0xa0
[   54.528489]  [<ffffffff811ba437>] ? __sb_start_write+0xb7/0xf0
[   54.528989]  [<ffffffff811ba437>] ? __sb_start_write+0xb7/0xf0
[   54.529458]  [<ffffffff811b60cb>] vfs_write+0xbb/0x1f0
[   54.529874]  [<ffffffff811b7e39>] SyS_write+0x49/0xb0
[   54.530269]  [<ffffffff815fed40>] entry_SYSCALL_64_fastpath+0x23/0xc1
[   54.530766]  [<ffffffff8109ff7f>] ? trace_hardirqs_off_caller+0x1f/0xc0

I guess we really need to do a mutex_trylock/mutex_lock in zfs_inode_update, or identify all such call sites and take the mutex before calling into zfs_inode_update.
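
A minimal sketch of what the trylock approach might look like (purely illustrative, not a tested patch; note that mutex_trylock cannot tell "the caller already holds i_mutex" apart from "some other task holds it", which is part of why this is messy):

void
zfs_inode_update(znode_t *zp)
{
	struct inode *ip = ZTOI(zp);
	int locked;

	/*
	 * Take i_mutex only when it is not already held.  If the caller
	 * (typically the VFS) holds it, trylock fails and we rely on that
	 * caller's lock; if some other task holds it, we silently skip
	 * locking, which is exactly the kind of messiness removing the
	 * function would avoid.
	 */
	locked = mutex_trylock(&ip->i_mutex);

	/* ... copy the znode fields into the generic inode as before ... */

	if (locked)
		mutex_unlock(&ip->i_mutex);
}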

tuxoko (Contributor) commented Apr 29, 2016

The correct way to do this is to remove the function entirely. When the Linux VFS wants to update fields in the inode, it grabs the required lock itself. We really shouldn't be messing around with the inode fields and i_mutex.
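
As a simplified illustration of that calling convention (not actual kernel source): on the setattr path, for example, the VFS caller of notify_change() already holds i_mutex, so the filesystem's ->setattr can update i_size, i_mtime, etc. under that lock without taking it again.

	/* Simplified illustration only -- the lock is taken by the VFS,
	 * not by the filesystem, before it calls into ->setattr. */
	mutex_lock(&inode->i_mutex);
	error = notify_change(dentry, &attr, NULL);	/* ends up in fs ->setattr */
	mutex_unlock(&inode->i_mutex);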

behlendorf (Contributor) commented

Removing it would be best and is what I always intended; see issue #227. It has always been an ugly solution.

lorddoskias (Contributor, Author) commented

I agree with you, but until that time comes (I intend to work on this) I think the locking needs to be fixed.
