Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZFS causes XFS warnings due to PF_FSTRANS #4529

Closed
lorddoskias opened this issue Apr 15, 2016 · 15 comments
Closed

ZFS causes XFS warnings due to PF_FSTRANS #4529

lorddoskias opened this issue Apr 15, 2016 · 15 comments

Comments

@lorddoskias
Copy link
Contributor

Hello I observed the following this morning on one server running zfs inside a loop file, hosted on XFS:

------------[ cut here ]------------
WARNING: CPU: 23 PID: 10852 at fs/xfs/xfs_aops.c:982 xfs_vm_writepage+0x58b/0x5d0 [xfs]()
Modules linked in: dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio xt_multiport xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables zfs(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) zunicode(PO) xfs ext2 dm_mirror dm_region_hash dm_log ib_umad ib_ipoib ib_cm ib_sa ib_mad ib_core ib_addr ipv6 sb_edac edac_core i2c_i801 lpc_ich mfd_core shpchp ioatdma ses enclosure ixgbe dca mdio ipmi_devintf ipmi_si ipmi_msghandler tcp_scalable
CPU: 23 PID: 10852 Comm: z_ioctl_iss Tainted: P           O    4.4.1-clouder2 #69
Hardware name: Supermicro PIO-628U-TR4T+-ST031/X10DRU-i+, BIOS 1.0c 03/23/2015
 00000000000003d6 ffff880828bd3728 ffffffff81339b7f 0000000000002a64
 0000000000000000 00000000000003d6 0000000000000000 ffff880828bd3768
 ffffffff810a1ed5 ffff880800000000 ffff88082c9875f8 ffffea003f57b080
Call Trace:
 [<ffffffff81339b7f>] dump_stack+0x4f/0x80
 [<ffffffff810a1ed5>] warn_slowpath_common+0x95/0xe0
 [<ffffffff810a1f3a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa0231fdb>] xfs_vm_writepage+0x58b/0x5d0 [xfs]
 [<ffffffff81169438>] ? find_get_pages_tag+0xe8/0x1a0
 [<ffffffff811a6e90>] ? page_mapped_in_vma+0x60/0x60
 [<ffffffff81176b1d>] ? clear_page_dirty_for_io+0xfd/0x1e0
 [<ffffffff81173ed7>] __writepage+0x17/0x40
 [<ffffffff81176f81>] write_cache_pages+0x251/0x530
 [<ffffffff81173ec0>] ? set_page_dirty+0x70/0x70
 [<ffffffff811772b1>] generic_writepages+0x51/0x80
 [<ffffffffa0230cb0>] xfs_vm_writepages+0x60/0x80 [xfs]
 [<ffffffff81177300>] do_writepages+0x20/0x30
 [<ffffffff8116a5f5>] __filemap_fdatawrite_range+0xb5/0x100
 [<ffffffff8116a6cb>] filemap_write_and_wait_range+0x8b/0xd0
 [<ffffffffa0235bb4>] xfs_free_file_space+0xf4/0x520 [xfs]
 [<ffffffff8167bdd4>] ? down_write+0x24/0x60
 [<ffffffffa024a638>] ? xfs_ilock+0x88/0x110 [xfs]
 [<ffffffffa023cbce>] xfs_file_fallocate+0x19e/0x2c0 [xfs]
 [<ffffffff816792d6>] ? __schedule+0x2c6/0x9d0
 [<ffffffffa036c6fc>] vn_space+0x3c/0x40 [spl]
 [<ffffffffa0434817>] vdev_file_io_start+0x207/0x260 [zfs]
 [<ffffffffa047170d>] zio_vdev_io_start+0xad/0x2d0 [zfs]
 [<ffffffffa0474942>] zio_execute+0x82/0xe0 [zfs]
 [<ffffffffa036ba7d>] taskq_thread+0x28d/0x5a0 [spl]
 [<ffffffff810cb5c0>] ? try_to_wake_up+0x3b0/0x3b0
 [<ffffffffa036b7f0>] ? param_set_taskq_kick+0x110/0x110 [spl]
 [<ffffffffa036b7f0>] ? param_set_taskq_kick+0x110/0x110 [spl]
 [<ffffffff810c1777>] kthread+0xd7/0xf0
 [<ffffffff810ca3ee>] ? schedule_tail+0x1e/0xd0
 [<ffffffff810c16a0>] ? kthread_freezable_should_stop+0x80/0x80
 [<ffffffff8167de2f>] ret_from_fork+0x3f/0x70
 [<ffffffff810c16a0>] ? kthread_freezable_should_stop+0x80/0x80
---[ end trace 69af8feadd474012 ]---

Line 982 of xfs_aops is this:

        /*
         * Given that we do not allow direct reclaim to call us, we should
         * never be called while in a filesystem transaction.
         */
        if (WARN_ON_ONCE(current->flags & PF_FSTRANS))
                goto redirty;

so PF_FSTRANS is set in taskq_thread by spl_fstrans_mark(). I can see that there is already code which handles this case for fsync in vn_fsync:

/*
         * May enter XFS which generates a warning when PF_FSTRANS is set.
         * To avoid this the flag is cleared over vfs_sync() and then reset.
         */
        fstrans = spl_fstrans_check();
        if (fstrans)
                current->flags &= ~(PF_FSTRANS);

        error = -spl_filp_fsync(vp->v_file, datasync);

Maybe the same check should be implemented invn_space as well? Or given that XFS is the only filesystem in the upstream kernel that uses the PF_FSTRANS maybe there can be a more general check which disables the flag if the underlying filesystem is somehow detected to be XFS? Though I admittedly don't know how easy/portable that would be.

@behlendorf
Copy link
Contributor

Maybe the same check should be implemented invn_space as well?

Yes, we should definitely add a similar check to vn_space(). And as you suggested we could fairly easily limit this to xfs filesystems by checkingstrcmp(vp->v_file->f_dentry->d_inode->i_sb->s_type->name, "xfs"). This particular data structure hasn't changed in many many years so it's not a huge risk for future portability issues, but obviously there are no guarantees.

@lorddoskias
Copy link
Contributor Author

Ok so I had a look around the code and here is what I think should/could happen. First and foremost I think this warning can be invoked by any operations which might cause writeback. So that's why I think the correct way forward would be to disable it globally just for XFS. I think the best way would be to pass an argument to spl_fstrans_mark which would then make the correct decision. Now, the pertinent question is does spl_fstrans_mark has access to the struct inode (either directly or indirectly) from every context it's called. Also what's the difference between the vn_* operations and zpl_* ones e.g. vn_fsync and zpl_fsync ? Maybe only the vn_* operations should be patched?

@lorddoskias
Copy link
Contributor Author

@behlendorf I created spl/#542. Which does the easy fix in vn_space, but without further info I'm afraid I cannot create the xfs-encompassing one.

@dweeezil
Copy link
Contributor

@lorddoskias Do I gather you're using the TRIM patches and that's when you get the warning? The reason I ask is that without the TRIM patch stack, VOP_SPACE (a.k.a. vn_space) is not used. I've got a slightly different approach from openzfs/spl#542 in dweeezil/spl@9ceb48f which I like a bit better (we could also use it in vn_fsync()).

@lorddoskias
Copy link
Contributor Author

lorddoskias commented Apr 18, 2016

@dweeezil Unfortunately I cannot say with 100% certainty wheter this happens while zpool trim is running. I might create a test bed and see if this is the case. What I can assure is that the trim patches are applied indeed.

So I like your approach, however, one thing which I wasn't sure about was whether in every context that spl_fstrans_mark is used it is known whether it is safe to enable/disable the PF_FSTRANS flag. I see your patch doesn't really solve this issue, it just gives a "neat" way of either setting or unsetting it. And as I said before - this warning can be triggered from any context which might force writeback on the underlying filesystem, I doubt it it is just in those 2 functions.

@dweeezil
Copy link
Contributor

@lorddoskias I figured we could use spl_fstrans_set() in the DKIOCFREE handler in vdev_file_io_start(), possibly with a hack to only do it for XFS but maybe for all filesystems.

@lorddoskias
Copy link
Contributor Author

@dweeezil so do you intend on submitting your changes as well as adding the call to spl_fstrans_set added in the vn_fsync/vn_space functions?

@behlendorf what's your opinion, which approach do you intend? Admittedly @dweeezil's approach is more systematic and cleaner than mine but I'd like to see the issue resolved before long.

@dweeezil
Copy link
Contributor

@lorddoskias I was thinking about this again while working on #3656 yesterday. I think a better solution for the time being, and more congruent with the existing hack in vdev_file_io_start() is to extend the same hack to the DKIOCFREE handler. To that end, I'll layer another patch to the stack in #3656 to deal with it.

@dweeezil
Copy link
Contributor

In retrospect, it might not be such a great idea to run the vn_space operations asynchronously and that the 9ceb48f approach may be better and could also be retrofitted into the DKIOCFLUSHWRITECACHE handler. I'll post some pull requests today.

@dweeezil
Copy link
Contributor

Both #3656 and openzfs/spl#543 have been updated to, respectively, use and add spl_fstrans_set().

@behlendorf
Copy link
Contributor

My inclination would be to go with @lorddoskias proposed patch to vn_space() openzfs/spl@aa14008. This is a Linux specific issue which all the callers of vn_space() really shouldn't have to worry about. By making the fix here, similar to vn_sync() we can keep this out of the ZFS code proper.

@dweeezil I'm not sure I understand the motivation behind adding spl_fstrans_set() for this. Could you explain you're alternate approach and why you prefer it?

@dweeezil
Copy link
Contributor

@behlendorf I see the issue now: The Linux-specific hack wasn't removed from the handler in vdev_file_io_start() when the PF_FSTRANS support was added to vn_fsync(). My rationale was that if a Linux-specific hack was OK in one case of vdev_file_io_start(), then it was OK in another case (DKIOCFREE) of the same function. This type of thing clearly belongs in the wrappers in SPL.

@behlendorf
Copy link
Contributor

Ahh, I see. Yes, we should also verify we can remove the hack from vdev_file_io_start().

@dweeezil
Copy link
Contributor

Both openzfs/spl#543 and #3656 have been updated appropriately.

@dweeezil
Copy link
Contributor

The hack left over in vdev_file_io_start() should be investigated separately and removed in its own PR.

tuxoko pushed a commit to tuxoko/spl that referenced this issue Jun 14, 2016
The problem described in 2a5d574 also applies to XFS's file or inode
fallocate method.  Both paths may trigger writeback and expose this
issue, see the full stack below.

When layered on XFS a warning will be emitted under CentOS7 when entering
either the file or inode fallocate method with PF_FSTRANS already set.
To avoid triggering this error PF_FSTRANS is cleared and then reset
in vn_space().

WARNING: at fs/xfs/xfs_aops.c:982 xfs_vm_writepage+0x58b/0x5d0

Call Trace:
 [<ffffffff810a1ed5>] warn_slowpath_common+0x95/0xe0
 [<ffffffff810a1f3a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa0231fdb>] xfs_vm_writepage+0x58b/0x5d0 [xfs]
 [<ffffffff81173ed7>] __writepage+0x17/0x40
 [<ffffffff81176f81>] write_cache_pages+0x251/0x530
 [<ffffffff811772b1>] generic_writepages+0x51/0x80
 [<ffffffffa0230cb0>] xfs_vm_writepages+0x60/0x80 [xfs]
 [<ffffffff81177300>] do_writepages+0x20/0x30
 [<ffffffff8116a5f5>] __filemap_fdatawrite_range+0xb5/0x100
 [<ffffffff8116a6cb>] filemap_write_and_wait_range+0x8b/0xd0
 [<ffffffffa0235bb4>] xfs_free_file_space+0xf4/0x520 [xfs]
 [<ffffffffa023cbce>] xfs_file_fallocate+0x19e/0x2c0 [xfs]
 [<ffffffffa036c6fc>] vn_space+0x3c/0x40 [spl]
 [<ffffffffa0434817>] vdev_file_io_start+0x207/0x260 [zfs]
 [<ffffffffa047170d>] zio_vdev_io_start+0xad/0x2d0 [zfs]
 [<ffffffffa0474942>] zio_execute+0x82/0xe0 [zfs]
 [<ffffffffa036ba7d>] taskq_thread+0x28d/0x5a0 [spl]
 [<ffffffff810c1777>] kthread+0xd7/0xf0
 [<ffffffff8167de2f>] ret_from_fork+0x3f/0x70

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Closes openzfs/zfs#4529
tonyhutter pushed a commit to tonyhutter/spl that referenced this issue May 23, 2017
The problem described in 2a5d574 also applies to XFS's file or inode
fallocate method.  Both paths may trigger writeback and expose this
issue, see the full stack below.

When layered on XFS a warning will be emitted under CentOS7 when entering
either the file or inode fallocate method with PF_FSTRANS already set.
To avoid triggering this error PF_FSTRANS is cleared and then reset
in vn_space().

WARNING: at fs/xfs/xfs_aops.c:982 xfs_vm_writepage+0x58b/0x5d0

Call Trace:
 [<ffffffff810a1ed5>] warn_slowpath_common+0x95/0xe0
 [<ffffffff810a1f3a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa0231fdb>] xfs_vm_writepage+0x58b/0x5d0 [xfs]
 [<ffffffff81173ed7>] __writepage+0x17/0x40
 [<ffffffff81176f81>] write_cache_pages+0x251/0x530
 [<ffffffff811772b1>] generic_writepages+0x51/0x80
 [<ffffffffa0230cb0>] xfs_vm_writepages+0x60/0x80 [xfs]
 [<ffffffff81177300>] do_writepages+0x20/0x30
 [<ffffffff8116a5f5>] __filemap_fdatawrite_range+0xb5/0x100
 [<ffffffff8116a6cb>] filemap_write_and_wait_range+0x8b/0xd0
 [<ffffffffa0235bb4>] xfs_free_file_space+0xf4/0x520 [xfs]
 [<ffffffffa023cbce>] xfs_file_fallocate+0x19e/0x2c0 [xfs]
 [<ffffffffa036c6fc>] vn_space+0x3c/0x40 [spl]
 [<ffffffffa0434817>] vdev_file_io_start+0x207/0x260 [zfs]
 [<ffffffffa047170d>] zio_vdev_io_start+0xad/0x2d0 [zfs]
 [<ffffffffa0474942>] zio_execute+0x82/0xe0 [zfs]
 [<ffffffffa036ba7d>] taskq_thread+0x28d/0x5a0 [spl]
 [<ffffffff810c1777>] kthread+0xd7/0xf0
 [<ffffffff8167de2f>] ret_from_fork+0x3f/0x70

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Closes openzfs/zfs#4529
tonyhutter pushed a commit to openzfs/spl that referenced this issue Jun 9, 2017
The problem described in 2a5d574 also applies to XFS's file or inode
fallocate method.  Both paths may trigger writeback and expose this
issue, see the full stack below.

When layered on XFS a warning will be emitted under CentOS7 when entering
either the file or inode fallocate method with PF_FSTRANS already set.
To avoid triggering this error PF_FSTRANS is cleared and then reset
in vn_space().

WARNING: at fs/xfs/xfs_aops.c:982 xfs_vm_writepage+0x58b/0x5d0

Call Trace:
 [<ffffffff810a1ed5>] warn_slowpath_common+0x95/0xe0
 [<ffffffff810a1f3a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa0231fdb>] xfs_vm_writepage+0x58b/0x5d0 [xfs]
 [<ffffffff81173ed7>] __writepage+0x17/0x40
 [<ffffffff81176f81>] write_cache_pages+0x251/0x530
 [<ffffffff811772b1>] generic_writepages+0x51/0x80
 [<ffffffffa0230cb0>] xfs_vm_writepages+0x60/0x80 [xfs]
 [<ffffffff81177300>] do_writepages+0x20/0x30
 [<ffffffff8116a5f5>] __filemap_fdatawrite_range+0xb5/0x100
 [<ffffffff8116a6cb>] filemap_write_and_wait_range+0x8b/0xd0
 [<ffffffffa0235bb4>] xfs_free_file_space+0xf4/0x520 [xfs]
 [<ffffffffa023cbce>] xfs_file_fallocate+0x19e/0x2c0 [xfs]
 [<ffffffffa036c6fc>] vn_space+0x3c/0x40 [spl]
 [<ffffffffa0434817>] vdev_file_io_start+0x207/0x260 [zfs]
 [<ffffffffa047170d>] zio_vdev_io_start+0xad/0x2d0 [zfs]
 [<ffffffffa0474942>] zio_execute+0x82/0xe0 [zfs]
 [<ffffffffa036ba7d>] taskq_thread+0x28d/0x5a0 [spl]
 [<ffffffff810c1777>] kthread+0xd7/0xf0
 [<ffffffff8167de2f>] ret_from_fork+0x3f/0x70

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Closes openzfs/zfs#4529
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants