forked from openzfs/zfs
fix for https://github.com/zfsonlinux/zfs/issues/581 #1
Closed
Conversation
Thanks for the patch, I'll look it over tomorrow.
Ah, I didn't see that you had seen it yet. I thought I had messed up by putting it into your repo, so I just added it to zfsonlinux/zfs. Sorry in case that causes any trouble.
No trouble, but yes, people typically submit against zfsonlinux.
behlendorf
pushed a commit
that referenced
this pull request
Feb 13, 2014
ZoL commit 1421c89 unintentionally changed the disk format in a forward-compatible, but not backward-compatible, way. This was accomplished by adding an entry to zbookmark_t, which is included in a couple of on-disk structures. That led to the creation of pools with incorrect dsl_scan_phys_t objects that could only be imported by versions of ZoL containing that commit. Such pools cannot be imported by other versions of ZFS or past versions of ZoL.

The additional field has been removed by the previous commit. However, affected pools must be imported and scrubbed using a version of ZoL with this commit applied. This will return the pools to a state in which they may be imported by other implementations.

The 'zpool import' or 'zpool status' command can be used to determine if a pool is impacted. A message similar to one of the following means your pool must be scrubbed to restore compatibility.

$ zpool import
pool: zol-0.6.2-173
id: 1165955789558693437
state: ONLINE
status: Errata #1 detected.
action: The pool can be imported using its name or numeric identifier, however there is a compatibility issue which should be corrected by running 'zpool scrub'
see: http://zfsonlinux.org/msg/ZFS-8000-ER
config: ...

$ zpool status
pool: zol-0.6.2-173
state: ONLINE
scan: pool compatibility issue detected.
see: openzfs#2094
action: To correct the issue run 'zpool scrub'.
config: ...

If there was an async destroy in progress, 'zpool import' will prevent the pool from being imported. Further advice on how to proceed will be provided by the error message as follows.

$ zpool import
pool: zol-0.6.2-173
id: 1165955789558693437
state: ONLINE
status: Errata #1 detected.
action: The pool can not be imported with this version of ZFS due to an active asynchronous destroy. Revert to an earlier version and allow the destroy to complete before updating.
see: http://zfsonlinux.org/msg/ZFS-8000-ER
config: ...

Pools affected by the damaged dsl_scan_phys_t can be detected prior to an upgrade by running the following command as root:

zdb -dddd poolname 1 | grep -P '^\t\tscan = ' | sed -e 's;scan = ;;' | wc -w

Note that `poolname` must be replaced with the name of the pool you wish to check. A value of 25 indicates the dsl_scan_phys_t has been damaged. A value of 24 indicates that the dsl_scan_phys_t is normal. A value of 0 indicates that there has never been a scrub run on the pool.

The regression caused by the change to zbookmark_t never made it into a tagged release or any Gentoo backports. Only those using HEAD were affected.

However, this patch has a few limitations. There is no way to detect a damaged dsl_scan_phys_t object when it has occurred on 32-bit systems due to integer rounding that wrote incorrect information but avoided the overflow on them. Correcting such issues requires triggering a scrub. In addition, bptree_entry_phys_t objects are also affected. These objects only exist during an asynchronous destroy, and automating repair of damaged bptree_entry_phys_t objects is non-trivial. Any pools that have been imported by an affected version of ZoL must have all asynchronous destroy operations finish before export and subsequent import by a version containing this commit. Failure to do that will prevent pool import. The presence of any background destroys on any imported pools can be checked by running `zpool get freeing` as root. This will display a non-zero value for any pool with an active asynchronous destroy.

Lastly, it is expected that no user data has been lost as a result of this erratum.

Original-patch-by: Tim Chase <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#2094
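To make the 24-versus-25-word check above concrete, here is a minimal, self-contained sketch of the mechanism; the struct and field names are illustrative stand-ins, not the actual ZoL definitions:

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for zbookmark_t: four 64-bit words in the original layout. */
typedef struct sketch_bookmark {
	uint64_t objset, object, level, blkid;
	/* uint64_t accidental_extra; */  /* the field removed by the previous commit */
} sketch_bookmark_t;

/* Stand-in for dsl_scan_phys_t: other members plus the embedded bookmark. */
typedef struct sketch_scan_phys {
	uint64_t other_members[20];
	sketch_bookmark_t scn_bookmark;
} sketch_scan_phys_t;

int main(void) {
	/* Prints 24; with 'accidental_extra' uncommented it prints 25, which is
	 * the word count the zdb pipeline above keys on. */
	printf("%zu\n", sizeof (sketch_scan_phys_t) / sizeof (uint64_t));
	return 0;
}
```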
behlendorf
pushed a commit
that referenced
this pull request
Feb 21, 2014
ZoL commit 1421c89 unintentionally changed the disk format in a forward-compatible, but not backward-compatible, way. This was accomplished by adding an entry to zbookmark_t, which is included in a couple of on-disk structures. That led to the creation of pools with incorrect dsl_scan_phys_t objects that could only be imported by versions of ZoL containing that commit. Such pools cannot be imported by other versions of ZFS or past versions of ZoL.

The additional field has been removed by the previous commit. However, affected pools must be imported and scrubbed using a version of ZoL with this commit applied. This will return the pools to a state in which they may be imported by other implementations.

The 'zpool import' or 'zpool status' command can be used to determine if a pool is impacted. A message similar to one of the following means your pool must be scrubbed to restore compatibility.

$ zpool import
pool: zol-0.6.2-173
id: 1165955789558693437
state: ONLINE
status: Errata #1 detected.
action: The pool can be imported using its name or numeric identifier, however there is a compatibility issue which should be corrected by running 'zpool scrub'
see: http://zfsonlinux.org/msg/ZFS-8000-ER
config: ...

$ zpool status
pool: zol-0.6.2-173
state: ONLINE
scan: pool compatibility issue detected.
see: openzfs#2094
action: To correct the issue run 'zpool scrub'.
config: ...

If there was an async destroy in progress, 'zpool import' will prevent the pool from being imported. Further advice on how to proceed will be provided by the error message as follows.

$ zpool import
pool: zol-0.6.2-173
id: 1165955789558693437
state: ONLINE
status: Errata #2 detected.
action: The pool can not be imported with this version of ZFS due to an active asynchronous destroy. Revert to an earlier version and allow the destroy to complete before updating.
see: http://zfsonlinux.org/msg/ZFS-8000-ER
config: ...

Pools affected by the damaged dsl_scan_phys_t can be detected prior to an upgrade by running the following command as root:

zdb -dddd poolname 1 | grep -P '^\t\tscan = ' | sed -e 's;scan = ;;' | wc -w

Note that `poolname` must be replaced with the name of the pool you wish to check. A value of 25 indicates the dsl_scan_phys_t has been damaged. A value of 24 indicates that the dsl_scan_phys_t is normal. A value of 0 indicates that there has never been a scrub run on the pool.

The regression caused by the change to zbookmark_t never made it into a tagged release, Gentoo backports, Ubuntu, Debian, Fedora, or EPEL stable repositories. Only those using the HEAD version directly from GitHub after the 0.6.2 tag but before the 0.6.3 tag are affected.

This patch does have one limitation that should be mentioned. It will not detect errata #2 on a pool unless errata #1 is also present. It is expected that this will not be a significant problem, because pools impacted by errata #2 have a high probability of also being impacted by errata #1. End users can ensure they do not hit this unlikely case by waiting for all asynchronous destroy operations to complete before updating ZoL. The presence of any background destroys on any imported pools can be checked by running `zpool get freeing` as root. This will display a non-zero value for any pool with an active asynchronous destroy.

Lastly, it is expected that no user data has been lost as a result of this erratum.

Original-patch-by: Tim Chase <[email protected]>
Reworked-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#2094
behlendorf
pushed a commit
that referenced
this pull request
Apr 4, 2014
openzfs#180 occurred because of a race between inode eviction and zfs_zget(). openzfs/zfs@36df284 tried to address it by making a call to the VFS to learn whether an inode is being evicted. If it was being evicted the operation was retried after dropping and reacquiring the relevant resources. Unfortunately, this introduced another deadlock. INFO: task kworker/u24:6:891 blocked for more than 120 seconds. Tainted: P O 3.13.6 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u24:6 D ffff88107fcd2e80 0 891 2 0x00000000 Workqueue: writeback bdi_writeback_workfn (flush-zfs-5) ffff8810370ff950 0000000000000002 ffff88103853d940 0000000000012e80 ffff8810370fffd8 0000000000012e80 ffff88103853d940 ffff880f5c8be098 ffff88107ffb6950 ffff8810370ff980 ffff88103a9a5b78 0000000000000000 Call Trace: [<ffffffff813dd1d4>] schedule+0x24/0x70 [<ffffffff8115fc09>] __wait_on_freeing_inode+0x99/0xc0 [<ffffffff8115fdd8>] find_inode_fast+0x78/0xb0 [<ffffffff811608c5>] ilookup+0x65/0xd0 [<ffffffffa035c5ab>] zfs_zget+0xdb/0x260 [zfs] [<ffffffffa03589d6>] zfs_get_data+0x46/0x340 [zfs] [<ffffffffa035fee1>] zil_add_block+0xa31/0xc00 [zfs] [<ffffffffa0360642>] zil_commit+0x12/0x20 [zfs] [<ffffffffa036a6e4>] zpl_putpage+0x174/0x840 [zfs] [<ffffffff811071ec>] do_writepages+0x1c/0x40 [<ffffffff8116df2b>] __writeback_single_inode+0x3b/0x2b0 [<ffffffff8116ecf7>] writeback_sb_inodes+0x247/0x420 [<ffffffff8116f5f3>] wb_writeback+0xe3/0x320 [<ffffffff81170b8e>] bdi_writeback_workfn+0xfe/0x490 [<ffffffff8106072c>] process_one_work+0x16c/0x490 [<ffffffff810613f3>] worker_thread+0x113/0x390 [<ffffffff81066edf>] kthread+0xdf/0x100 This patch implements the original fix in a slightly different manner in order to avoid both deadlocks. Instead of relying on a call to ilookup() which can block in __wait_on_freeing_inode() the return value from igrab() is used. This gives us the information that ilookup() provided without the risk of a deadlock. Alternately, this race could be closed by registering an sops->drop_inode() callback. The callback would need to detect the active SA hold thereby informing the VFS that this inode should not be evicted. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#180
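As a rough illustration of the pattern this commit describes (assuming Linux VFS headers; this is not the actual `zfs_zget()` code), the eviction check reduces to testing the return value of `igrab()`:

```c
#include <linux/fs.h>

/* Hedged sketch of the retry pattern; only igrab()/iput() are real APIs here. */
static int zget_sketch(struct inode *ip)
{
	if (igrab(ip) == NULL) {
		/*
		 * The VFS is already freeing this inode.  Returning EAGAIN lets
		 * the caller drop its locks and retry, instead of blocking in
		 * ilookup() -> __wait_on_freeing_inode() as the ilookup()-based
		 * approach did.
		 */
		return -EAGAIN;
	}

	/* A reference is now held, so the inode cannot be evicted under us. */
	iput(ip);	/* dropped here only to keep the sketch balanced */
	return 0;
}
```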
behlendorf
pushed a commit
that referenced
this pull request
May 4, 2015
The params to the functions are uint64_t, but the offsets to memcpy / bcopy are calculated using 32-bit ints. This patch changes them to also be uint64_t so there isn't an overflow. PaX's Size Overflow caught this when formatting a zvol.

Gentoo bug: #546490

PAX: offset: 1ffffb000 db->db_offset: 1ffffa000 db->db_size: 2000 size: 5000 PAX: size overflow detected in function dmu_read /var/tmp/portage/sys-fs/zfs-kmod-0.6.3-r1/work/zfs-zfs-0.6.3/module/zfs/../../module/zfs/dmu.c:781 cicus.366_146 max, count: 15 CPU: 1 PID: 2236 Comm: zvol/10 Tainted: P O 3.17.7-hardened-r1 #1 Call Trace: [<ffffffffa0382ee8>] ? dsl_dataset_get_holds+0x9d58/0x343ce [zfs] [<ffffffff81a59c88>] dump_stack+0x4e/0x7a [<ffffffffa0393c2a>] ? dsl_dataset_get_holds+0x1aa9a/0x343ce [zfs] [<ffffffff81206696>] report_size_overflow+0x36/0x40 [<ffffffffa02dba2b>] dmu_read+0x52b/0x920 [zfs] [<ffffffffa0373ad1>] zrl_is_locked+0x7d1/0x1ce0 [zfs] [<ffffffffa0364cd2>] zil_clean+0x9d2/0xc00 [zfs] [<ffffffffa0364f21>] zil_commit+0x21/0x30 [zfs] [<ffffffffa0373fe1>] zrl_is_locked+0xce1/0x1ce0 [zfs] [<ffffffff81a5e2c7>] ? __schedule+0x547/0xbc0 [<ffffffffa01582e6>] taskq_cancel_id+0x2a6/0x5b0 [spl] [<ffffffff81103eb0>] ? wake_up_state+0x20/0x20 [<ffffffffa0158150>] ? taskq_cancel_id+0x110/0x5b0 [spl] [<ffffffff810f7ff4>] kthread+0xc4/0xe0 [<ffffffff810f7f30>] ? kthread_create_on_node+0x170/0x170 [<ffffffff81a62fa4>] ret_from_fork+0x74/0xa0 [<ffffffff810f7f30>] ? kthread_create_on_node+0x170/0x170

Signed-off-by: Jason Zaman <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#3333
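For reference, here is a tiny userspace illustration of the overflow class being fixed, using the constants from the PaX report above (this is not the `dmu.c` code itself): doing the offset arithmetic in 32 bits silently drops the high bits once the offset passes 4 GiB, while the same expression in `uint64_t` stays correct.

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
	uint64_t offset  = 0x1ffffb000ULL;	/* "PAX: offset" from the report, > 4 GiB */
	uint64_t db_size = 0x2000;

	uint32_t narrow = (uint32_t)offset + (uint32_t)db_size;	/* truncated operands */
	uint64_t wide   = offset + db_size;				/* full-width arithmetic */

	printf("32-bit arithmetic: 0x%x\n", narrow);
	printf("64-bit arithmetic: 0x%llx\n", (unsigned long long)wide);
	return 0;
}
```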
behlendorf
added a commit
that referenced
this pull request
Jul 18, 2016
This reverts commit 35a76a0 to resolve the following panic which was introduced. RIP [<ffffffffa0241d11>] fletcher_4_sse2_fini+0x11/0xb0 [zcommon] Pid: 813, comm: modprobe Tainted: P D -- ------------ 2.6.32-642.3.1.el6.x86_64 #1 Call Trace: [<ffffffff81546c21>] ? panic+0xa7/0x179 [<ffffffff8154baa4>] ? oops_end+0xe4/0x100 [<ffffffff8101102b>] ? die+0x5b/0x90 [<ffffffff8154b572>] ? do_general_protection+0x152/0x160 [<ffffffff8100c4ae>] ? xen_hvm_callback_vector+0xe/0x20 [<ffffffff8154ace5>] ? general_protection+0x25/0x30 [<ffffffffa0241d00>] ? fletcher_4_sse2_fini+0x0/0xb0 [zcommon] [<ffffffffa0241d11>] ? fletcher_4_sse2_fini+0x11/0xb0 [zcommon] [<ffffffff81007bc1>] ? xen_clocksource_read+0x21/0x30 [<ffffffff81007ca9>] ? xen_clocksource_get_cycles+0x9/0x10 [<ffffffff810b1b65>] ? getrawmonotonic+0x35/0xc0 [<ffffffffa0241077>] ? fletcher_4_init+0x227/0x260 [zcommon] [<ffffffff812a8490>] ? kvasprintf+0x70/0x90 [<ffffffffa024d000>] ? zcommon_init+0x0/0xd [zcommon] [<ffffffffa024d009>] ? zcommon_init+0x9/0xd [zcommon] [<ffffffff810020d0>] ? do_one_initcall+0xc0/0x280 [<ffffffff810c8371>] ? sys_init_module+0xe1/0x250 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b Disable zimport testing against master where this flaw exists: TEST_ZIMPORT_VERSIONS="installed" Signed-off-by: Brian Behlendorf <[email protected]>
behlendorf
pushed a commit
that referenced
this pull request
Jul 28, 2016
DMU_MAX_ACCESS should be cast to a uint64_t otherwise the multiplication of DMU_MAX_ACCESS with spa_asize_inflation will be 32 bit and may lead to an overflow. Currently DMU_MAX_ACCESS is 64 * 1024 * 1024, so spa_asize_inflation being 64 or more will lead to an overflow. Found by static analysis with CoverityScan 0.8.5 CID 150942 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN) overflow_before_widen: Potentially overflowing expression 67108864 * spa_asize_inflation with type int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type uint64_t (64 bits, unsigned). Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#4889
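A self-contained illustration of the "overflow before widen" pattern Coverity flagged (the macro here is a stand-in for the real `DMU_MAX_ACCESS`): the multiplication happens in 32-bit arithmetic and only the already-wrapped result is widened, whereas casting one operand first keeps the whole expression 64-bit, which is what the fix does.

```c
#include <stdint.h>
#include <stdio.h>

#define DMU_MAX_ACCESS_SKETCH	(64 * 1024 * 1024)	/* 67108864, a plain int */

int main(void) {
	unsigned int spa_asize_inflation = 64;	/* large enough to wrap 32 bits */

	/* Multiply in 32 bits, then widen: the product has already wrapped to 0.
	 * (The real code multiplies signed ints, where this overflow is UB.) */
	uint64_t wrapped = DMU_MAX_ACCESS_SKETCH * spa_asize_inflation;

	/* Widen first, then multiply: the arithmetic stays 64-bit throughout. */
	uint64_t correct = (uint64_t)DMU_MAX_ACCESS_SKETCH * spa_asize_inflation;

	printf("multiply then widen: %llu\n", (unsigned long long)wrapped);
	printf("widen then multiply: %llu\n", (unsigned long long)correct);
	return 0;
}
```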
behlendorf
pushed a commit
that referenced
this pull request
Jul 30, 2016
Leaks reported by using AddressSanitizer, GCC 6.1.0 Direct leak of 4097 byte(s) in 1 object(s) allocated from: #1 0x414f73 in process_options cmd/ztest/ztest.c:721 Direct leak of 5440 byte(s) in 17 object(s) allocated from: #1 0x41bfd5 in umem_alloc ../../lib/libspl/include/umem.h:88 #2 0x41bfd5 in ztest_zap_parallel cmd/ztest/ztest.c:4659 #3 0x4163a8 in ztest_execute cmd/ztest/ztest.c:5907 Signed-off-by: Gvozden Neskovic <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#4896
behlendorf
pushed a commit
that referenced
this pull request
May 21, 2018
It is just plain unsafe to peek inside in-kernel mutex structure and make assumptions about what kernel does with those internal fields like owner. Kernel is all too happy to stop doing the expected things like tracing lock owner once you load a tainted module like spl/zfs that is not GPL. As such you will get instant assertion failures like this: VERIFY3(((*(volatile typeof((&((&zo->zo_lock)->m_mutex))->owner) *)& ((&((&zo->zo_lock)->m_mutex))->owner))) == ((void *)0)) failed (ffff88030be28500 == (null)) PANIC at zfs_onexit.c:104:zfs_onexit_destroy() Showing stack for process 3626 CPU: 0 PID: 3626 Comm: mkfs.lustre Tainted: P OE ------------ 3.10.0-debug #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: dump_stack+0x19/0x1b spl_dumpstack+0x44/0x50 [spl] spl_panic+0xbf/0xf0 [spl] zfs_onexit_destroy+0x17c/0x280 [zfs] zfsdev_release+0x48/0xd0 [zfs] Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Oleg Drokin <[email protected]> Closes openzfs#632 Closes openzfs#633
behlendorf
pushed a commit
that referenced
this pull request
May 21, 2018
It is just plain unsafe to peek inside in-kernel mutex structure and make assumptions about what kernel does with those internal fields like owner. Kernel is all too happy to stop doing the expected things like tracing lock owner once you load a tainted module like spl/zfs that is not GPL. As such you will get instant assertion failures like this: VERIFY3(((*(volatile typeof((&((&zo->zo_lock)->m_mutex))->owner) *)& ((&((&zo->zo_lock)->m_mutex))->owner))) == ((void *)0)) failed (ffff88030be28500 == (null)) PANIC at zfs_onexit.c:104:zfs_onexit_destroy() Showing stack for process 3626 CPU: 0 PID: 3626 Comm: mkfs.lustre Tainted: P OE ------------ 3.10.0-debug #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: dump_stack+0x19/0x1b spl_dumpstack+0x44/0x50 [spl] spl_panic+0xbf/0xf0 [spl] zfs_onexit_destroy+0x17c/0x280 [zfs] zfsdev_release+0x48/0xd0 [zfs] Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Gvozden Neskovic <[email protected]> Signed-off-by: Oleg Drokin <[email protected]> Closes openzfs#639 Closes openzfs#632
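A hedged sketch of the safer alternative described above; the type and macro names are illustrative, not the actual SPL definitions. Instead of dereferencing the kernel's `struct mutex` internals, the wrapper records the owner itself and asserts on that:

```c
#include <linux/mutex.h>
#include <linux/sched.h>

typedef struct sketch_mutex {
	struct mutex		sm_mutex;
	struct task_struct	*sm_owner;	/* maintained here, not by the kernel */
} sketch_mutex_t;

static inline void sketch_mutex_enter(sketch_mutex_t *mp)
{
	mutex_lock(&mp->sm_mutex);
	mp->sm_owner = current;
}

static inline void sketch_mutex_exit(sketch_mutex_t *mp)
{
	mp->sm_owner = NULL;
	mutex_unlock(&mp->sm_mutex);
}

/* Safe to assert on, regardless of what the kernel does with m_mutex.owner. */
#define SKETCH_MUTEX_HELD(mp)	((mp)->sm_owner == current)
```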
behlendorf
pushed a commit
that referenced
this pull request
Oct 17, 2018
The bug's time sequence:

1. Thread #1, in `zfs_write`, assigns txg "n".
2. In the same process, thread #2 takes an mmap page fault (which means `mm_sem` is held); `zfs_dirty_inode` fails to open a txg and waits for the previous txg "n" to complete.
3. Thread #1 calls `uiomove` to write; however, a page fault occurs in `uiomove`, which needs `mm_sem`, but `mm_sem` is held by thread #2, so thread #1 is stuck and cannot complete, and txg "n" will never complete.

So thread #1 and thread #2 are deadlocked.

Reviewed-by: Chunwei Chen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Grady Wong <[email protected]>
Closes openzfs#7939
behlendorf
pushed a commit
that referenced
this pull request
Feb 21, 2019
Trying to mount a dataset from a readonly pool could inadvertently start the user accounting upgrade task, leading to the following failure: VERIFY3(tx->tx_threads == 2) failed (0 == 2) PANIC at txg.c:680:txg_wait_synced() Showing stack for process 2541 CPU: 2 PID: 2541 Comm: z_upgrade Tainted: P O 3.16.0-4-amd64 #1 Debian 3.16.51-3 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: [<0>] ? dump_stack+0x5d/0x78 [<0>] ? spl_panic+0xc9/0x110 [spl] [<0>] ? dnode_next_offset+0x1d4/0x2c0 [zfs] [<0>] ? dmu_object_next+0x77/0x130 [zfs] [<0>] ? dnode_rele_and_unlock+0x4d/0x120 [zfs] [<0>] ? txg_wait_synced+0x91/0x220 [zfs] [<0>] ? dmu_objset_id_quota_upgrade_cb+0x10f/0x140 [zfs] [<0>] ? dmu_objset_upgrade_task_cb+0xe3/0x170 [zfs] [<0>] ? taskq_thread+0x2cc/0x5d0 [spl] [<0>] ? wake_up_state+0x10/0x10 [<0>] ? taskq_thread_should_stop.part.3+0x70/0x70 [spl] [<0>] ? kthread+0xbd/0xe0 [<0>] ? kthread_create_on_node+0x180/0x180 [<0>] ? ret_from_fork+0x58/0x90 [<0>] ? kthread_create_on_node+0x180/0x180 This patch updates both functions responsible for checking if we can perform user accounting to verify the pool is not readonly. Reviewed-by: Alek Pinchuk <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes openzfs#8424
behlendorf
pushed a commit
that referenced
this pull request
Feb 23, 2019
While ZFS allows renaming of in-use ZVOLs at the DSL level without issues, the ZVOL layer does not correctly update the renamed dataset if the device node is open (zv->zv_open_count > 0): trying to access the stale dataset name, for instance during a zfs receive, will cause the following failure:

VERIFY3(zv->zv_objset->os_dsl_dataset->ds_owner == zv) failed ((null) == ffff8800dbb6fc00) PANIC at zvol.c:1255:zvol_resume() Showing stack for process 1390 CPU: 0 PID: 1390 Comm: zfs Tainted: P O 3.16.0-4-amd64 #1 Debian 3.16.51-3 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000000 ffffffff8151ea00 ffffffffa0758a80 ffff88028aefba30 ffffffffa0417219 ffff880037179220 ffffffff00000030 ffff88028aefba40 ffff88028aefb9e0 2833594649524556 6f5f767a3e2d767a 6f3e2d7465736a62 Call Trace: [<0>] ? dump_stack+0x5d/0x78 [<0>] ? spl_panic+0xc9/0x110 [spl] [<0>] ? mutex_lock+0xe/0x2a [<0>] ? zfs_refcount_remove_many+0x1ad/0x250 [zfs] [<0>] ? rrw_exit+0xc8/0x2e0 [zfs] [<0>] ? mutex_lock+0xe/0x2a [<0>] ? dmu_objset_from_ds+0x9a/0x250 [zfs] [<0>] ? dmu_objset_hold_flags+0x71/0xc0 [zfs] [<0>] ? zvol_resume+0x178/0x280 [zfs] [<0>] ? zfs_ioc_recv_impl+0x88b/0xf80 [zfs] [<0>] ? zfs_refcount_remove_many+0x1ad/0x250 [zfs] [<0>] ? zfs_ioc_recv+0x1c2/0x2a0 [zfs] [<0>] ? dmu_buf_get_user+0x13/0x20 [zfs] [<0>] ? __alloc_pages_nodemask+0x166/0xb50 [<0>] ? zfsdev_ioctl+0x896/0x9c0 [zfs] [<0>] ? handle_mm_fault+0x464/0x1140 [<0>] ? do_vfs_ioctl+0x2cf/0x4b0 [<0>] ? __do_page_fault+0x177/0x410 [<0>] ? SyS_ioctl+0x81/0xa0 [<0>] ? async_page_fault+0x28/0x30 [<0>] ? system_call_fast_compare_end+0x10/0x15

Reviewed by: Tom Caputi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes openzfs#6263
Closes openzfs#8371
behlendorf
pushed a commit
that referenced
this pull request
Mar 12, 2019
Booting debug kernel found an inconsistent lock dependency between dataset's ds_lock and its directory's dd_lock. [ 32.215336] ====================================================== [ 32.221859] WARNING: possible circular locking dependency detected [ 32.221861] 4.14.90+ openzfs#8 Tainted: G O [ 32.221862] ------------------------------------------------------ [ 32.221863] dynamic_kernel_/4667 is trying to acquire lock: [ 32.221864] (&ds->ds_lock){+.+.}, at: [<ffffffffc10a4bde>] dsl_dataset_check_quota+0x9e/0x8a0 [zfs] [ 32.221941] but task is already holding lock: [ 32.221941] (&dd->dd_lock){+.+.}, at: [<ffffffffc10cd8e9>] dsl_dir_tempreserve_space+0x3b9/0x1290 [zfs] [ 32.221983] which lock already depends on the new lock. [ 32.221983] the existing dependency chain (in reverse order) is: [ 32.221984] -> #1 (&dd->dd_lock){+.+.}: [ 32.221992] __mutex_lock+0xef/0x14c0 [ 32.222049] dsl_dir_namelen+0xd4/0x2d0 [zfs] [ 32.222093] dsl_dataset_namelen+0x2f1/0x430 [zfs] [ 32.222142] verify_dataset_name_len+0xd/0x40 [zfs] [ 32.222184] dmu_objset_find_dp_impl+0x5f5/0xef0 [zfs] [ 32.222226] dmu_objset_find_dp_cb+0x40/0x60 [zfs] [ 32.222235] taskq_thread+0x969/0x1460 [spl] [ 32.222238] kthread+0x2fb/0x400 [ 32.222241] ret_from_fork+0x3a/0x50 [ 32.222241] -> #0 (&ds->ds_lock){+.+.}: [ 32.222246] lock_acquire+0x14f/0x390 [ 32.222248] __mutex_lock+0xef/0x14c0 [ 32.222291] dsl_dataset_check_quota+0x9e/0x8a0 [zfs] [ 32.222355] dsl_dir_tempreserve_space+0x5d2/0x1290 [zfs] [ 32.222392] dmu_tx_assign+0xa61/0xdb0 [zfs] [ 32.222436] zfs_create+0x4e6/0x11d0 [zfs] [ 32.222481] zpl_create+0x194/0x340 [zfs] [ 32.222484] lookup_open+0xa86/0x16f0 [ 32.222486] path_openat+0xe56/0x2490 [ 32.222488] do_filp_open+0x17f/0x260 [ 32.222490] do_sys_open+0x195/0x310 [ 32.222491] SyS_open+0xbf/0xf0 [ 32.222494] do_syscall_64+0x191/0x4f0 [ 32.222496] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 32.222497] other info that might help us debug this: [ 32.222497] Possible unsafe locking scenario: [ 32.222498] CPU0 CPU1 [ 32.222498] ---- ---- [ 32.222499] lock(&dd->dd_lock); [ 32.222500] lock(&ds->ds_lock); [ 32.222502] lock(&dd->dd_lock); [ 32.222503] lock(&ds->ds_lock); [ 32.222504] *** DEADLOCK *** [ 32.222505] 3 locks held by dynamic_kernel_/4667: [ 32.222506] #0: (sb_writers#9){.+.+}, at: [<ffffffffaf68933c>] mnt_want_write+0x3c/0xa0 [ 32.222511] #1: (&type->i_mutex_dir_key#8){++++}, at: [<ffffffffaf652cde>] path_openat+0xe2e/0x2490 [ 32.222515] #2: (&dd->dd_lock){+.+.}, at: [<ffffffffc10cd8e9>] dsl_dir_tempreserve_space+0x3b9/0x1290 [zfs] The issue is caused by dsl_dataset_namelen() holding ds_lock, followed by acquiring dd_lock on ds->ds_dir in dsl_dir_namelen(). However, ds->ds_dir should not be protected by ds_lock, so releasing it before call to dsl_dir_namelen() prevents the lockdep issue Reviewed-by: Alek Pinchuk <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chris Dunlop <[email protected]> Signed-off-by: Michael Zhivich <[email protected]> Closes openzfs#8413
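A small userspace analogue of the reordering this commit describes (the lock names are stand-ins, not the ZFS mutexes): releasing the first lock before taking the second removes the inverted `dd_lock`/`ds_lock` ordering that lockdep flags above.

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t ds_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t dd_lock = PTHREAD_MUTEX_INITIALIZER;

static void dataset_namelen_sketch(void)
{
	pthread_mutex_lock(&ds_lock);
	/* ... read only the fields that genuinely need ds_lock ... */
	pthread_mutex_unlock(&ds_lock);		/* released *before* touching dd_lock */

	pthread_mutex_lock(&dd_lock);		/* the dsl_dir_namelen() equivalent */
	/* ... compute the directory part of the name length ... */
	pthread_mutex_unlock(&dd_lock);
}

int main(void)
{
	dataset_namelen_sketch();
	puts("ds_lock is never held while acquiring dd_lock");
	return 0;
}
```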
behlendorf
pushed a commit
that referenced
this pull request
Jul 25, 2019
lockdep reports a possible recursive lock in dbuf_destroy. It is true that dbuf_destroy is acquiring the dn_dbufs_mtx on one dnode while holding it on another dnode. However, it is impossible for these to be the same dnode because, among other things,dbuf_destroy checks MUTEX_HELD before acquiring the mutex. This fix defines a class NESTED_SINGLE == 1 and changes that lock to call mutex_enter_nested with a subclass of NESTED_SINGLE. In order to make the userspace code compile, include/sys/zfs_context.h now defines mutex_enter_nested and NESTED_SINGLE. This is the lockdep report: [ 122.950921] ============================================ [ 122.950921] WARNING: possible recursive locking detected [ 122.950921] 4.19.29-4.19.0-debug-d69edad5368c1166 #1 Tainted: G O [ 122.950921] -------------------------------------------- [ 122.950921] dbu_evict/1457 is trying to acquire lock: [ 122.950921] 0000000083e9cbcf (&dn->dn_dbufs_mtx){+.+.}, at: dbuf_destroy+0x3c0/0xdb0 [zfs] [ 122.950921] but task is already holding lock: [ 122.950921] 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs] [ 122.950921] other info that might help us debug this: [ 122.950921] Possible unsafe locking scenario: [ 122.950921] CPU0 [ 122.950921] ---- [ 122.950921] lock(&dn->dn_dbufs_mtx); [ 122.950921] lock(&dn->dn_dbufs_mtx); [ 122.950921] *** DEADLOCK *** [ 122.950921] May be due to missing lock nesting notation [ 122.950921] 1 lock held by dbu_evict/1457: [ 122.950921] #0: 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs] [ 122.950921] stack backtrace: [ 122.950921] CPU: 0 PID: 1457 Comm: dbu_evict Tainted: G O 4.19.29-4.19.0-debug-d69edad5368c1166 #1 [ 122.950921] Hardware name: Supermicro H8SSL-I2/H8SSL-I2, BIOS 080011 03/13/2009 [ 122.950921] Call Trace: [ 122.950921] dump_stack+0x91/0xeb [ 122.950921] __lock_acquire+0x2ca7/0x4f10 [ 122.950921] lock_acquire+0x153/0x330 [ 122.950921] dbuf_destroy+0x3c0/0xdb0 [zfs] [ 122.950921] dbuf_evict_one+0x1cc/0x3d0 [zfs] [ 122.950921] dbuf_rele_and_unlock+0xb84/0xd60 [zfs] [ 122.950921] dnode_evict_dbufs+0x3a6/0x740 [zfs] [ 122.950921] dmu_objset_evict+0x7a/0x500 [zfs] [ 122.950921] dsl_dataset_evict_async+0x70/0x480 [zfs] [ 122.950921] taskq_thread+0x979/0x1480 [spl] [ 122.950921] kthread+0x2e7/0x3e0 [ 122.950921] ret_from_fork+0x27/0x50 Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jeff Dike <[email protected]> Closes openzfs#8984
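A hedged sketch of the mechanism the commit message describes; the exact macro bodies are illustrative. In-kernel, `mutex_enter_nested()` would map onto the lockdep primitive `mutex_lock_nested()`, while the userspace build in `zfs_context.h` simply ignores the subclass:

```c
/* Illustrative macro bodies only; the real definitions live in the SPL/ZFS headers. */
#define NESTED_SINGLE	1

#ifdef _KERNEL
#define mutex_enter_nested(mp, subclass)	\
	mutex_lock_nested(&(mp)->m_mutex, (subclass))
#else
/* Userspace build (zfs_context.h): no lockdep, so the subclass is ignored. */
#define mutex_enter_nested(mp, subclass)	mutex_enter(mp)
#endif

/* dbuf_destroy() can then take the second dnode's list lock without tripping
 * lockdep's single-class recursion check:
 *
 *	mutex_enter_nested(&dn->dn_dbufs_mtx, NESTED_SINGLE);
 */
```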
behlendorf
pushed a commit
that referenced
this pull request
Dec 14, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux(). Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux. Found by AddressSanitizer: ERROR: AddressSanitizer: heap-use-after-free on address ... READ of size 34 at 0x608000a1fcd0 thread T686 #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d) #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447 #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259 #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 #4 0x55ffbc769fba in ztest_execute ztest.c:6714 #5 0x55ffbc779a90 in ztest_thread ztest.c:6761 #6 0x7fe889cbc6da in start_thread openzfs#7 0x7fe8899e588e in __clone 0x608000a1fcd0 is located 48 bytes inside of 88-byte region freed by thread T686 here: #0 0x7fe88b14e7b8 in __interceptor_free #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874 #2 0x7fe88ae543ba in nvpair_free nvpair.c:844 #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978 #4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185 #5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221 #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229 openzfs#7 0x55ffbc769fba in ztest_execute ztest.c:6714 openzfs#8 0x55ffbc779a90 in ztest_thread ztest.c:6761 openzfs#9 0x7fe889cbc6da in start_thread Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes openzfs#9706
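A sketch of the ordering fix described above, assuming the usual ZFS kernel headers for `spa_strdup()`, `spa_strfree()`, `fnvlist_lookup_string()`, and `ZPOOL_CONFIG_PATH`; `log_removed_path()` is a hypothetical stand-in for the later use of the path, not a real function:

```c
/* Sketch only; not the literal vdev_removal.c diff. */
static void
remove_aux_path_sketch(nvlist_t *nv)
{
	/* fnvlist_lookup_string() returns a pointer into 'nv' itself, so copy
	 * it out before the config nvlist is replaced. */
	char *vd_path = spa_strdup(fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH));

	/* ... spa_vdev_remove_aux() runs here; 'nv' and every pointer into it
	 * are invalid afterwards ... */

	log_removed_path(vd_path);	/* hypothetical stand-in for the later use */
	spa_strfree(vd_path);
}
```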
behlendorf
pushed a commit
that referenced
this pull request
Jul 27, 2021
`zpool_do_import()` passes `argv[0]`, (optionally) `argv[1]`, and `pool_specified` to `import_pools()`. If `pool_specified==FALSE`, the `argv[]` arguments are not used. However, these values may be off the end of the `argv[]` array, so loading them could dereference unmapped memory. This error is reported by the asan build: ``` ================================================================= ==6003==ERROR: AddressSanitizer: heap-buffer-overflow READ of size 8 at 0x6030000004a8 thread T0 #0 0x562a078b50eb in zpool_do_import zpool_main.c:3796 #1 0x562a078858c5 in main zpool_main.c:10709 #2 0x7f5115231bf6 in __libc_start_main #3 0x562a07885eb9 in _start 0x6030000004a8 is located 0 bytes to the right of 24-byte region allocated by thread T0 here: #0 0x7f5116ac6b40 in __interceptor_malloc #1 0x562a07885770 in main zpool_main.c:10699 #2 0x7f5115231bf6 in __libc_start_main ``` This commit passes NULL for these arguments if they are off the end of the `argv[]` array. Reviewed-by: George Wilson <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes openzfs#12339
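A small sketch of the bounds check this commit describes; the `_sketch` function names are stand-ins rather than the actual `zpool_main.c` code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for import_pools(); it only ever receives valid pointers or NULL. */
static void import_pools_sketch(const char *name, const char *guid, bool pool_specified)
{
	(void)name; (void)guid; (void)pool_specified;
}

static void do_import_sketch(int argc, char **argv, bool pool_specified)
{
	/* Read argv[0]/argv[1] only when they exist; otherwise pass NULL so the
	 * callee never sees pointers past the end of argv[]. */
	const char *searchname = (argc > 0) ? argv[0] : NULL;
	const char *searchguid = (argc > 1) ? argv[1] : NULL;

	import_pools_sketch(searchname, searchguid, pool_specified);
}
```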
behlendorf
pushed a commit
that referenced
this pull request
Oct 21, 2022
Before this patch, in zfs_domount, if zfs_root or d_make_root fails, we leave zfsvfs != NULL. This will lead to execution of the error handling `if` statement at the `out` label, and hence to a call to dmu_objset_disown and zfsvfs_free. However, zfs_umount, which we call upon failure of zfs_root and d_make_root already does dmu_objset_disown and zfsvfs_free. I suppose this patch rather adds to the brittleness of this part of the code base, but I don't want to invest more time in this right now. To add a regression test, we'd need some kind of fault injection facility for zfs_root or d_make_root, which doesn't exist right now. And even then, I think that regression test would be too closely tied to the implementation. To repro the double-disown / double-free, do the following: 1. patch zfs_root to always return an error 2. mount a ZFS filesystem Here's the stack trace you would see then: VERIFY3(ds->ds_owner == tag) failed (0000000000000000 == ffff9142361e8000) PANIC at dsl_dataset.c:1003:dsl_dataset_disown() Showing stack for process 28332 CPU: 2 PID: 28332 Comm: zpool Tainted: G O 5.10.103-1.nutanix.el7.x86_64 #1 Call Trace: dump_stack+0x74/0x92 spl_dumpstack+0x29/0x2b [spl] spl_panic+0xd4/0xfc [spl] dsl_dataset_disown+0xe9/0x150 [zfs] dmu_objset_disown+0xd6/0x150 [zfs] zfs_domount+0x17b/0x4b0 [zfs] zpl_mount+0x174/0x220 [zfs] legacy_get_tree+0x2b/0x50 vfs_get_tree+0x2a/0xc0 path_mount+0x2fa/0xa70 do_mount+0x7c/0xa0 __x64_sys_mount+0x8b/0xe0 do_syscall_64+0x38/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Co-authored-by: Christian Schwarz <[email protected]> Signed-off-by: Christian Schwarz <[email protected]> Closes openzfs#14025
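The hazard generalizes beyond `zfs_domount()`. Below is a minimal userspace sketch of the usual guard (purely illustrative, not the ZFS code): once the failure path has called a teardown helper that frees the context, null the pointer so the shared cleanup path cannot free it again.

```c
#include <stdlib.h>

struct ctx { int placeholder; };

/* Plays the role of zfs_umount(): tears everything down, including 'c'. */
static void teardown(struct ctx *c) { free(c); }

static int domount_sketch(void)
{
	struct ctx *c = malloc(sizeof (*c));
	int error = (c == NULL) ? -2 : -1;	/* pretend zfs_root()/d_make_root() failed */

	if (c != NULL && error != 0) {
		teardown(c);	/* already disposes of the context ... */
		c = NULL;	/* ... so the shared cleanup below must not free it again */
	}

	if (c != NULL)		/* shared 'out'-style cleanup, now safely skipped */
		free(c);
	return error;
}

int main(void) { (void)domount_sketch(); return 0; }
```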
behlendorf
pushed a commit
that referenced
this pull request
Jan 30, 2023
After doing some tracing on mmap using the O_DIRECT code, I decided to test out the mmap test case just on master to see if there existed a race condition that is outside of O_DIRECT. There does exist a race condition which is shown in the trace below. We are hitting the same ASSERT that the page should be up to date in mappedread(). The only difference in the test case that is run on the O_DIRECT branch is it use `direct=1` for fio. However, I changed this to `direct=0` to make sure we are always going down the buffered path. The reason I did this my O_DIRECT stack traces were showing we only hit this in the first write of O_DIRECT while growing the file (that is always sent to the ARC). [12039.760554] VERIFY(PageUptodate(pp)) failed [12039.764745] PANIC at zfs_vnops_os.c:298:mappedread() [12039.769712] Showing stack for process 1125719 [12039.774071] CPU: 19 PID: 1125719 Comm: fio Kdump: loaded Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12039.785193] Hardware name: GIGABYTE R272-Z32-00/MZ32-AR0-00, BIOS R21 10/08/2020 [12039.792577] Call Trace: [12039.795024] dump_stack+0x41/0x60 [12039.798343] spl_panic+0xd0/0xe8 [spl] [12039.802103] ? _cond_resched+0x15/0x30 [12039.805854] ? mutex_lock+0xe/0x30 [12039.809253] ? arc_cksum_compute+0xcb/0x180 [zfs] [12039.814129] ? __raw_spin_unlock+0x5/0x10 [zfs] [12039.818819] ? dmu_read_uio_dnode+0xf1/0x130 [zfs] [12039.823802] ? kfree+0xd3/0x250 [12039.826948] ? xas_load+0x8/0x80 [12039.830182] ? find_get_entry+0xd6/0x1c0 [12039.834106] ? _cond_resched+0x15/0x30 [12039.837852] spl_assert+0x17/0x20 [zfs] [12039.841885] mappedread+0x136/0x140 [zfs] [12039.846077] zfs_read+0x165/0x2e0 [zfs] [12039.850100] zpl_iter_read+0xa8/0x110 [zfs] [12039.854456] ? __handle_mm_fault+0x44f/0x6c0 [12039.858729] new_sync_read+0x10f/0x150 [12039.862484] vfs_read+0x91/0x140 [12039.865716] ksys_read+0x4f/0xb0 [12039.868946] do_syscall_64+0x5b/0x1a0 [12039.872614] entry_SYSCALL_64_after_hwframe+0x65/0xca [12039.877668] RIP: 0033:0x7fa34855cab4 [12039.881246] Code: c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 7b fc ff ff 4c 89 e2 48 89 ee 89 df 41 89 c0 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 38 44 89 c7 48 89 44 24 08 e8 b7 fc ff ff 48 [12039.899990] RSP: 002b:00007ffc7a17caf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [12039.907550] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa34855cab4 [12039.914680] RDX: 0000000000100000 RSI: 00007fa2ddc99010 RDI: 0000000000000005 [12039.921804] RBP: 00007fa2ddc99010 R08: 0000000000000000 R09: 0000000000000000 [12039.928928] R10: 0000000020f2cfd0 R11: 0000000000000246 R12: 0000000000100000 [12039.936052] R13: 000055f6f3432ec0 R14: 0000000000100000 R15: 000055f6f3432ee8 [12166.129858] INFO: task fio:1125717 blocked for more than 120 seconds. [12166.136307] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12166.144043] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12166.151870] task:fio state:D stack: 0 pid:1125717 ppid:1125444 flags:0x00000080 [12166.160478] Call Trace: [12166.162930] __schedule+0x2d1/0x830 [12166.166433] schedule+0x35/0xa0 [12166.169585] cv_wait_common+0x153/0x240 [spl] [12166.173954] ? finish_wait+0x80/0x80 [12166.177539] zfs_rangelock_enter_writer+0x46/0x1b0 [zfs] [12166.183054] zfs_rangelock_enter_impl+0x110/0x170 [zfs] [12166.188461] zfs_write+0x615/0xd70 [zfs] [12166.192570] zpl_iter_write+0xe0/0x120 [zfs] [12166.197032] ? 
__handle_mm_fault+0x44f/0x6c0 [12166.201303] new_sync_write+0x112/0x160 [12166.205146] vfs_write+0xa5/0x1a0 [12166.208473] ksys_write+0x4f/0xb0 [12166.211793] do_syscall_64+0x5b/0x1a0 [12166.215465] entry_SYSCALL_64_after_hwframe+0x65/0xca [12166.220520] RIP: 0033:0x7fcb60d50a17 [12166.224103] Code: Unable to access opcode bytes at RIP 0x7fcb60d509ed. [12166.230633] RSP: 002b:00007ffce0bbd250 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [12166.238199] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fcb60d50a17 [12166.245331] RDX: 0000000000100000 RSI: 00007fcaf648d010 RDI: 0000000000000005 [12166.252463] RBP: 00007fcaf648d010 R08: 0000000000000000 R09: 0000000000000000 [12166.259598] R10: 000055aeb180bf80 R11: 0000000000000293 R12: 0000000000100000 [12166.266729] R13: 000055aeb1803ec0 R14: 0000000000100000 R15: 000055aeb1803ee8 [12166.273873] INFO: task fio:1125719 blocked for more than 120 seconds. [12166.280317] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12166.288057] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12166.295882] task:fio state:D stack: 0 pid:1125719 ppid:1125447 flags:0x80004080 [12166.304489] Call Trace: [12166.306942] __schedule+0x2d1/0x830 [12166.310435] schedule+0x35/0xa0 [12166.313580] spl_panic+0xe6/0xe8 [spl] [12166.317341] ? _cond_resched+0x15/0x30 [12166.321094] ? mutex_lock+0xe/0x30 [12166.324501] ? arc_cksum_compute+0xcb/0x180 [zfs] [12166.329379] ? __raw_spin_unlock+0x5/0x10 [zfs] [12166.334078] ? dmu_read_uio_dnode+0xf1/0x130 [zfs] [12166.339042] ? kfree+0xd3/0x250 [12166.342187] ? xas_load+0x8/0x80 [12166.345421] ? find_get_entry+0xd6/0x1c0 [12166.349346] ? _cond_resched+0x15/0x30 [12166.353102] spl_assert+0x17/0x20 [zfs] [12166.357130] mappedread+0x136/0x140 [zfs] [12166.361324] zfs_read+0x165/0x2e0 [zfs] [12166.365347] zpl_iter_read+0xa8/0x110 [zfs] [12166.369706] ? __handle_mm_fault+0x44f/0x6c0 [12166.373978] new_sync_read+0x10f/0x150 [12166.377731] vfs_read+0x91/0x140 [12166.380964] ksys_read+0x4f/0xb0 [12166.384196] do_syscall_64+0x5b/0x1a0 [12166.387863] entry_SYSCALL_64_after_hwframe+0x65/0xca [12166.392913] RIP: 0033:0x7fa34855cab4 [12166.396494] Code: Unable to access opcode bytes at RIP 0x7fa34855ca8a. [12166.403018] RSP: 002b:00007ffc7a17caf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [12166.410586] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa34855cab4 [12166.417717] RDX: 0000000000100000 RSI: 00007fa2ddc99010 RDI: 0000000000000005 [12166.424852] RBP: 00007fa2ddc99010 R08: 0000000000000000 R09: 0000000000000000 [12166.431985] R10: 0000000020f2cfd0 R11: 0000000000000246 R12: 0000000000100000 [12166.439117] R13: 000055f6f3432ec0 R14: 0000000000100000 R15: 000055f6f3432ee8 [12289.008759] INFO: task kworker/u128:1:32743 blocked for more than 120 seconds. [12289.015983] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12289.023722] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12289.031550] task:kworker/u128:1 state:D stack: 0 pid:32743 ppid: 2 flags:0x80004080 [12289.039905] Workqueue: writeback wb_workfn (flush-zfs-915) [12289.045397] Call Trace: [12289.047850] __schedule+0x2d1/0x830 [12289.051343] schedule+0x35/0xa0 [12289.054491] cv_wait_common+0x153/0x240 [spl] [12289.058859] ? finish_wait+0x80/0x80 [12289.062435] zfs_rangelock_enter_writer+0x46/0x1b0 [zfs] [12289.067949] zfs_rangelock_enter_impl+0x110/0x170 [zfs] [12289.073357] zfs_putpage+0x13b/0x5b0 [zfs] [12289.077640] ? pmdp_collapse_flush+0x10/0x10 [12289.081909] ? 
rmap_walk_file+0x116/0x290 [12289.085922] ? __mod_memcg_lruvec_state+0x5d/0x160 [12289.090715] zpl_putpage+0x67/0xd0 [zfs] [12289.094817] write_cache_pages+0x197/0x420 [12289.098916] ? zpl_readpage_filler+0x10/0x10 [zfs] [12289.103890] zpl_writepages+0x98/0x130 [zfs] [12289.108335] do_writepages+0xc2/0x1c0 [12289.112000] ? __wb_calc_thresh+0x3a/0x120 [12289.116100] __writeback_single_inode+0x39/0x2f0 [12289.120719] writeback_sb_inodes+0x1e6/0x450 [12289.124993] __writeback_inodes_wb+0x5f/0xc0 [12289.129263] wb_writeback+0x247/0x2e0 [12289.132931] wb_workfn+0x346/0x4d0 [12289.136337] ? __switch_to_asm+0x35/0x70 [12289.140262] ? __switch_to_asm+0x41/0x70 [12289.144189] ? __switch_to_asm+0x35/0x70 [12289.148116] ? __switch_to_asm+0x41/0x70 [12289.152042] ? __switch_to_asm+0x35/0x70 [12289.155964] ? __switch_to_asm+0x41/0x70 [12289.159893] ? __switch_to_asm+0x35/0x70 [12289.163819] ? __switch_to_asm+0x41/0x70 [12289.167748] process_one_work+0x1a7/0x360 [12289.171766] ? create_worker+0x1a0/0x1a0 [12289.175690] worker_thread+0x30/0x390 [12289.179357] ? create_worker+0x1a0/0x1a0 [12289.183284] kthread+0x10a/0x120 [12289.186514] ? set_kthread_struct+0x40/0x40 [12289.190702] ret_from_fork+0x35/0x40 [12289.194323] INFO: task fio:1125717 blocked for more than 120 seconds. [12289.200766] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12289.208504] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12289.216331] task:fio state:D stack: 0 pid:1125717 ppid:1125444 flags:0x00000080 [12289.224935] Call Trace: [12289.227390] __schedule+0x2d1/0x830 [12289.230882] schedule+0x35/0xa0 [12289.234026] cv_wait_common+0x153/0x240 [spl] [12289.238396] ? finish_wait+0x80/0x80 [12289.241975] zfs_rangelock_enter_writer+0x46/0x1b0 [zfs] [12289.247471] zfs_rangelock_enter_impl+0x110/0x170 [zfs] [12289.252874] zfs_write+0x615/0xd70 [zfs] [12289.256979] zpl_iter_write+0xe0/0x120 [zfs] [12289.261429] ? __handle_mm_fault+0x44f/0x6c0 [12289.265708] new_sync_write+0x112/0x160 [12289.269553] vfs_write+0xa5/0x1a0 [12289.272869] ksys_write+0x4f/0xb0 [12289.276191] do_syscall_64+0x5b/0x1a0 [12289.279856] entry_SYSCALL_64_after_hwframe+0x65/0xca [12289.284908] RIP: 0033:0x7fcb60d50a17 [12289.288493] Code: Unable to access opcode bytes at RIP 0x7fcb60d509ed. [12289.295022] RSP: 002b:00007ffce0bbd250 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [12289.302590] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fcb60d50a17 [12289.309723] RDX: 0000000000100000 RSI: 00007fcaf648d010 RDI: 0000000000000005 [12289.316853] RBP: 00007fcaf648d010 R08: 0000000000000000 R09: 0000000000000000 [12289.323987] R10: 000055aeb180bf80 R11: 0000000000000293 R12: 0000000000100000 [12289.331121] R13: 000055aeb1803ec0 R14: 0000000000100000 R15: 000055aeb1803ee8 [12289.338253] INFO: task fio:1125719 blocked for more than 120 seconds. [12289.344693] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12289.352431] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12289.360256] task:fio state:D stack: 0 pid:1125719 ppid:1125447 flags:0x80004080 [12289.368864] Call Trace: [12289.371316] __schedule+0x2d1/0x830 [12289.374808] schedule+0x35/0xa0 [12289.377953] spl_panic+0xe6/0xe8 [spl] [12289.381717] ? _cond_resched+0x15/0x30 [12289.385467] ? mutex_lock+0xe/0x30 [12289.388873] ? arc_cksum_compute+0xcb/0x180 [zfs] [12289.393756] ? __raw_spin_unlock+0x5/0x10 [zfs] [12289.398451] ? dmu_read_uio_dnode+0xf1/0x130 [zfs] [12289.403423] ? kfree+0xd3/0x250 [12289.406571] ? xas_load+0x8/0x80 [12289.409806] ? 
find_get_entry+0xd6/0x1c0 [12289.413732] ? _cond_resched+0x15/0x30 [12289.417484] spl_assert+0x17/0x20 [zfs] [12289.421511] mappedread+0x136/0x140 [zfs] [12289.425709] zfs_read+0x165/0x2e0 [zfs] [12289.429731] zpl_iter_read+0xa8/0x110 [zfs] [12289.434097] ? __handle_mm_fault+0x44f/0x6c0 [12289.438369] new_sync_read+0x10f/0x150 [12289.442121] vfs_read+0x91/0x140 [12289.445353] ksys_read+0x4f/0xb0 [12289.448588] do_syscall_64+0x5b/0x1a0 [12289.452254] entry_SYSCALL_64_after_hwframe+0x65/0xca [12289.457307] RIP: 0033:0x7fa34855cab4 [12289.460885] Code: Unable to access opcode bytes at RIP 0x7fa34855ca8a. [12289.467411] RSP: 002b:00007ffc7a17caf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [12289.474977] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa34855cab4 [12289.482110] RDX: 0000000000100000 RSI: 00007fa2ddc99010 RDI: 0000000000000005 [12289.489241] RBP: 00007fa2ddc99010 R08: 0000000000000000 R09: 0000000000000000 [12289.496376] R10: 0000000020f2cfd0 R11: 0000000000000246 R12: 0000000000100000 [12289.503506] R13: 000055f6f3432ec0 R14: 0000000000100000 R15: 000055f6f3432ee8 [12411.887657] INFO: task kworker/u128:1:32743 blocked for more than 120 seconds. [12411.894885] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12411.902627] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12411.910450] task:kworker/u128:1 state:D stack: 0 pid:32743 ppid: 2 flags:0x80004080 [12411.918798] Workqueue: writeback wb_workfn (flush-zfs-915) [12411.924292] Call Trace: [12411.926747] __schedule+0x2d1/0x830 [12411.930247] schedule+0x35/0xa0 [12411.933394] cv_wait_common+0x153/0x240 [spl] [12411.937758] ? finish_wait+0x80/0x80 [12411.941339] zfs_rangelock_enter_writer+0x46/0x1b0 [zfs] [12411.946852] zfs_rangelock_enter_impl+0x110/0x170 [zfs] [12411.952260] zfs_putpage+0x13b/0x5b0 [zfs] [12411.956541] ? pmdp_collapse_flush+0x10/0x10 [12411.960821] ? rmap_walk_file+0x116/0x290 [12411.964833] ? __mod_memcg_lruvec_state+0x5d/0x160 [12411.969627] zpl_putpage+0x67/0xd0 [zfs] [12411.973734] write_cache_pages+0x197/0x420 [12411.977835] ? zpl_readpage_filler+0x10/0x10 [zfs] [12411.982802] zpl_writepages+0x98/0x130 [zfs] [12411.987246] do_writepages+0xc2/0x1c0 [12411.990910] ? __wb_calc_thresh+0x3a/0x120 [12411.995013] __writeback_single_inode+0x39/0x2f0 [12411.999632] writeback_sb_inodes+0x1e6/0x450 [12412.003904] __writeback_inodes_wb+0x5f/0xc0 [12412.008176] wb_writeback+0x247/0x2e0 [12412.011841] wb_workfn+0x346/0x4d0 [12412.015248] ? __switch_to_asm+0x35/0x70 [12412.019175] ? __switch_to_asm+0x41/0x70 [12412.023098] ? __switch_to_asm+0x35/0x70 [12412.027027] ? __switch_to_asm+0x41/0x70 [12412.030952] ? __switch_to_asm+0x35/0x70 [12412.034879] ? __switch_to_asm+0x41/0x70 [12412.038805] ? __switch_to_asm+0x35/0x70 [12412.042731] ? __switch_to_asm+0x41/0x70 [12412.046656] process_one_work+0x1a7/0x360 [12412.050671] ? create_worker+0x1a0/0x1a0 [12412.054594] worker_thread+0x30/0x390 [12412.058260] ? create_worker+0x1a0/0x1a0 [12412.062186] kthread+0x10a/0x120 [12412.065421] ? set_kthread_struct+0x40/0x40 [12412.069607] ret_from_fork+0x35/0x40 [12412.073227] INFO: task fio:1125717 blocked for more than 120 seconds. [12412.079667] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12412.087406] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[12412.095233] task:fio state:D stack: 0 pid:1125717 ppid:1125444 flags:0x00000080 [12412.103838] Call Trace: [12412.106291] __schedule+0x2d1/0x830 [12412.109785] schedule+0x35/0xa0 [12412.112930] cv_wait_common+0x153/0x240 [spl] [12412.117299] ? finish_wait+0x80/0x80 [12412.120877] zfs_rangelock_enter_writer+0x46/0x1b0 [zfs] [12412.126373] zfs_rangelock_enter_impl+0x110/0x170 [zfs] [12412.131778] zfs_write+0x615/0xd70 [zfs] [12412.135881] zpl_iter_write+0xe0/0x120 [zfs] [12412.140336] ? __handle_mm_fault+0x44f/0x6c0 [12412.144607] new_sync_write+0x112/0x160 [12412.148446] vfs_write+0xa5/0x1a0 [12412.151765] ksys_write+0x4f/0xb0 [12412.155085] do_syscall_64+0x5b/0x1a0 [12412.158753] entry_SYSCALL_64_after_hwframe+0x65/0xca [12412.163811] RIP: 0033:0x7fcb60d50a17 [12412.167395] Code: Unable to access opcode bytes at RIP 0x7fcb60d509ed. [12412.173916] RSP: 002b:00007ffce0bbd250 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [12412.181481] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fcb60d50a17 [12412.188614] RDX: 0000000000100000 RSI: 00007fcaf648d010 RDI: 0000000000000005 [12412.195749] RBP: 00007fcaf648d010 R08: 0000000000000000 R09: 0000000000000000 [12412.202882] R10: 000055aeb180bf80 R11: 0000000000000293 R12: 0000000000100000 [12412.210014] R13: 000055aeb1803ec0 R14: 0000000000100000 R15: 000055aeb1803ee8 [12412.217146] INFO: task fio:1125719 blocked for more than 120 seconds. [12412.223587] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12412.231325] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12412.239150] task:fio state:D stack: 0 pid:1125719 ppid:1125447 flags:0x80004080 [12412.247755] Call Trace: [12412.250213] __schedule+0x2d1/0x830 [12412.253713] schedule+0x35/0xa0 [12412.256867] spl_panic+0xe6/0xe8 [spl] [12412.260639] ? _cond_resched+0x15/0x30 [12412.264398] ? mutex_lock+0xe/0x30 [12412.267803] ? arc_cksum_compute+0xcb/0x180 [zfs] [12412.272704] ? __raw_spin_unlock+0x5/0x10 [zfs] [12412.277417] ? dmu_read_uio_dnode+0xf1/0x130 [zfs] [12412.282396] ? kfree+0xd3/0x250 [12412.285543] ? xas_load+0x8/0x80 [12412.288785] ? find_get_entry+0xd6/0x1c0 [12412.292711] ? _cond_resched+0x15/0x30 [12412.296465] spl_assert+0x17/0x20 [zfs] [12412.300494] mappedread+0x136/0x140 [zfs] [12412.304690] zfs_read+0x165/0x2e0 [zfs] [12412.308728] zpl_iter_read+0xa8/0x110 [zfs] [12412.313105] ? __handle_mm_fault+0x44f/0x6c0 [12412.317379] new_sync_read+0x10f/0x150 [12412.321138] vfs_read+0x91/0x140 [12412.324372] ksys_read+0x4f/0xb0 [12412.327613] do_syscall_64+0x5b/0x1a0 [12412.331279] entry_SYSCALL_64_after_hwframe+0x65/0xca [12412.336334] RIP: 0033:0x7fa34855cab4 [12412.339912] Code: Unable to access opcode bytes at RIP 0x7fa34855ca8a. [12412.346437] RSP: 002b:00007ffc7a17caf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [12412.354002] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa34855cab4 [12412.361135] RDX: 0000000000100000 RSI: 00007fa2ddc99010 RDI: 0000000000000005 [12412.368268] RBP: 00007fa2ddc99010 R08: 0000000000000000 R09: 0000000000000000 [12412.375402] R10: 0000000020f2cfd0 R11: 0000000000000246 R12: 0000000000100000 [12412.382534] R13: 000055f6f3432ec0 R14: 0000000000100000 R15: 000055f6f3432ee8 [12534.766588] INFO: task kworker/u128:1:32743 blocked for more than 120 seconds. [12534.773814] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12534.781556] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[12534.789382] task:kworker/u128:1 state:D stack: 0 pid:32743 ppid: 2 flags:0x80004080 [12534.797737] Workqueue: writeback wb_workfn (flush-zfs-915) [12534.803228] Call Trace: [12534.805684] __schedule+0x2d1/0x830 [12534.809183] schedule+0x35/0xa0 [12534.812330] cv_wait_common+0x153/0x240 [spl] [12534.816698] ? finish_wait+0x80/0x80 [12534.820287] zfs_rangelock_enter_writer+0x46/0x1b0 [zfs] [12534.825806] zfs_rangelock_enter_impl+0x110/0x170 [zfs] [12534.831216] zfs_putpage+0x13b/0x5b0 [zfs] [12534.835498] ? pmdp_collapse_flush+0x10/0x10 [12534.839777] ? rmap_walk_file+0x116/0x290 [12534.843790] ? __mod_memcg_lruvec_state+0x5d/0x160 [12534.848581] zpl_putpage+0x67/0xd0 [zfs] [12534.852692] write_cache_pages+0x197/0x420 [12534.856798] ? zpl_readpage_filler+0x10/0x10 [zfs] [12534.861764] zpl_writepages+0x98/0x130 [zfs] [12534.866210] do_writepages+0xc2/0x1c0 [12534.869878] ? __wb_calc_thresh+0x3a/0x120 [12534.873975] __writeback_single_inode+0x39/0x2f0 [12534.878594] writeback_sb_inodes+0x1e6/0x450 [12534.882866] __writeback_inodes_wb+0x5f/0xc0 [12534.887137] wb_writeback+0x247/0x2e0 [12534.890806] wb_workfn+0x346/0x4d0 [12534.894213] ? __switch_to_asm+0x35/0x70 [12534.898136] ? __switch_to_asm+0x41/0x70 [12534.902064] ? __switch_to_asm+0x35/0x70 [12534.905990] ? __switch_to_asm+0x41/0x70 [12534.909915] ? __switch_to_asm+0x35/0x70 [12534.913840] ? __switch_to_asm+0x41/0x70 [12534.917768] ? __switch_to_asm+0x35/0x70 [12534.921696] ? __switch_to_asm+0x41/0x70 [12534.925620] process_one_work+0x1a7/0x360 [12534.929635] ? create_worker+0x1a0/0x1a0 [12534.933559] worker_thread+0x30/0x390 [12534.937226] ? create_worker+0x1a0/0x1a0 [12534.941152] kthread+0x10a/0x120 [12534.944392] ? set_kthread_struct+0x40/0x40 [12534.948576] ret_from_fork+0x35/0x40 [12534.952215] INFO: task fio:1125717 blocked for more than 120 seconds. [12534.958656] Tainted: P OE --------- - - 4.18.0-408.el8.x86_64 #1 [12534.966395] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [12534.974219] task:fio state:D stack: 0 pid:1125717 ppid:1125444 flags:0x00000080 [12534.982828] Call Trace: [12534.985282] __schedule+0x2d1/0x830 [12534.988770] schedule+0x35/0xa0 [12534.991920] cv_wait_common+0x153/0x240 [spl] [12534.996287] ? finish_wait+0x80/0x80 [12534.999867] zfs_rangelock_enter_writer+0x46/0x1b0 [zfs] [12535.005359] zfs_rangelock_enter_impl+0x110/0x170 [zfs] [12535.010757] zfs_write+0x615/0xd70 [zfs] [12535.014861] zpl_iter_write+0xe0/0x120 [zfs] [12535.019307] ? __handle_mm_fault+0x44f/0x6c0 [12535.023576] new_sync_write+0x112/0x160 [12535.027418] vfs_write+0xa5/0x1a0 [12535.030735] ksys_write+0x4f/0xb0 [12535.034054] do_syscall_64+0x5b/0x1a0 [12535.037722] entry_SYSCALL_64_after_hwframe+0x65/0xca [12535.042773] RIP: 0033:0x7fcb60d50a17 [12535.046359] Code: Unable to access opcode bytes at RIP 0x7fcb60d509ed. [12535.052889] RSP: 002b:00007ffce0bbd250 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [12535.060454] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fcb60d50a17 [12535.067585] RDX: 0000000000100000 RSI: 00007fcaf648d010 RDI: 0000000000000005 [12535.074718] RBP: 00007fcaf648d010 R08: 0000000000000000 R09: 0000000000000000 [12535.081851] R10: 000055aeb180bf80 R11: 0000000000000293 R12: 0000000000100000 [12535.088987] R13: 000055aeb1803ec0 R14: 0000000000100000 R15: 000055aeb1803ee8 Signed-off-by: Brian Atkinson <[email protected]>
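For orientation, here is a heavily simplified, non-buildable C sketch of the buffered read path the traces walk on Linux (zfs_read() calling mappedread()). It is a paraphrase of the logic implied by the trace, not the actual OpenZFS source; the point is only to show where the `PageUptodate` assertion sits relative to the page-cache lookup, and hence why a page that exists in the cache but has not yet been marked up to date by the concurrent writer/writeback trips it:

```c
/*
 * Simplified paraphrase of a mappedread()-style loop, for orientation only.
 * The real code lives in module/os/linux/zfs/zfs_vnops_os.c and differs in
 * detail; this sketch is not meant to build against the tree as-is.
 */
static int
mappedread_sketch(struct inode *ip, loff_t pos, size_t nbytes, zfs_uio_t *uio)
{
	struct address_space *mp = ip->i_mapping;
	int error = 0;

	while (nbytes > 0 && error == 0) {
		size_t off = pos & (PAGE_SIZE - 1);
		size_t chunk = MIN(nbytes, PAGE_SIZE - off);
		struct page *pp = find_lock_page(mp, pos >> PAGE_SHIFT);

		if (pp != NULL) {
			/*
			 * The panic in the traces fires here: the page is
			 * present in the page cache (dirtied through the mmap
			 * mapping), but it has not yet been marked up to date.
			 */
			ASSERT(PageUptodate(pp));
			unlock_page(pp);
			/* ... copy 'chunk' bytes out of the cached page ... */
			put_page(pp);
		} else {
			/* Page not cached: read the range from the DMU/ARC. */
			/* error = dmu_read_uio_dbuf(..., uio, chunk); */
		}
		pos += chunk;
		nbytes -= chunk;
	}
	return (error);
}
```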
behlendorf
pushed a commit
that referenced
this pull request
Feb 22, 2023
Under certain loads, the following panic is hit:

panic: page fault
KDB: stack backtrace:
#0 0xffffffff805db025 at kdb_backtrace+0x65
#1 0xffffffff8058e86f at vpanic+0x17f
#2 0xffffffff8058e6e3 at panic+0x43
#3 0xffffffff808adc15 at trap_fatal+0x385
#4 0xffffffff808adc6f at trap_pfault+0x4f
#5 0xffffffff80886da8 at calltrap+0x8
#6 0xffffffff80669186 at vgonel+0x186
#7 0xffffffff80669841 at vgone+0x31
#8 0xffffffff8065806d at vfs_hash_insert+0x26d
#9 0xffffffff81a39069 at sfs_vgetx+0x149
#10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#11 0xffffffff8065a28c at lookup+0x45c
#12 0xffffffff806594b9 at namei+0x259
#13 0xffffffff80676a33 at kern_statat+0xf3
#14 0xffffffff8067712f at sys_fstatat+0x2f
#15 0xffffffff808ae50c at amd64_syscall+0x10c
#16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following panic:

panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
cpuid = 17
KDB: stack backtrace:
#0 0xffffffff805e29c5 at kdb_backtrace+0x65
#1 0xffffffff8059620f at vpanic+0x17f
#2 0xffffffff81a27f4a at spl_panic+0x3a
#3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
#4 0xffffffff8066fdee at vinactivef+0xde
#5 0xffffffff80670b8a at vgonel+0x1ea
#6 0xffffffff806711e1 at vgone+0x31
#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
#8 0xffffffff81a39069 at sfs_vgetx+0x149
#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#10 0xffffffff80661c2c at lookup+0x45c
#11 0xffffffff80660e59 at namei+0x259
#12 0xffffffff8067e3d3 at kern_statat+0xf3
#13 0xffffffff8067eacf at sys_fstatat+0x2f
#14 0xffffffff808b5ecc at amd64_syscall+0x10c
#15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <[email protected]>
Reviewed-by: Mateusz Guzik <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Rob Wing <[email protected]>
Co-authored-by: Rob Wing <[email protected]>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes openzfs#14501
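A rough sketch of the shape of the change described above, in FreeBSD's zfs_ctldir.c. The handler names here are assumptions for illustration only (the common ctldir open/close handlers), not necessarily the exact entries the patch adds; see the referenced commit for the real diff:

```c
/* Sketch only: exact handler names and entries may differ from the actual patch. */
static struct vop_vector zfsctl_ops_snapshot = {
	.vop_default =	&default_vnodeops,
	/*
	 * vgonel() can call VOP_CLOSE() on an active vnode, so the snapshot
	 * control directory now provides close (and open, for consistency)
	 * instead of faulting on a missing operation.
	 */
	.vop_open =	zfsctl_common_open,
	.vop_close =	zfsctl_common_close,
	.vop_inactive =	zfsctl_snapshot_inactive,
	.vop_reclaim =	zfsctl_snapshot_reclaim,
	/* ... remaining entries unchanged ... */
};

/*
 * In zfsctl_snapshot_inactive(), the VERIFY3(vrecycle(vp) == 1) becomes a
 * plain (void) vrecycle(vp): a vnode that lost the vfs_hash_insert() race
 * legitimately still has a nonzero usecount, so failing to recycle it here
 * is not a bug.
 */
```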
Hi,
no clue if I did it right (first time on GitHub, and it's been 10 years since I last wrote C), but it seems to work on my system.
Gregor