Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zpool clear hang #3082

Closed
behlendorf opened this issue Feb 6, 2015 · 3 comments
Closed

zpool clear hang #3082

behlendorf opened this issue Feb 6, 2015 · 3 comments
Milestone

Comments

@behlendorf
Copy link
Contributor

The zpool clear command may hang when attempting to resume a pool. This is a regression which was introduced on all the platforms when feature flags were added. What's happening here is that the spa_get_stats() function always attempts to fetch the feature stats. When these buffers aren't cached in the ARC it results in an IO to a suspended pool which blocks. These IO needs to be eliminated either by always caching the feature stats or perhaps just by skipping them when the pool is known to be suspended.

[root@localhost ~]# ps aux | grep zpool
root      4347  0.0  0.0 133512  1356 pts/2    D+   19:12   0:00 zpool list
root      4376  0.0  0.0 133512  1356 pts/3    D+   19:12   0:00 zpool clear tank
root      4410  0.0  0.0 112644   968 pts/4    S+   19:12   0:00 grep --color=auto zpool
[root@localhost ~]# cat /proc/4376/stack
[<ffffffffa0574df5>] cv_wait_common+0x115/0x1d0 [spl]
[<ffffffffa0574ec5>] __cv_wait+0x15/0x20 [spl]
[<ffffffffa06a0bfb>] dbuf_read+0x39b/0x8b0 [zfs]
[<ffffffffa06a8799>] dmu_buf_hold+0x109/0x1c0 [zfs]
[<ffffffffa07070d8>] zap_lockdir+0x58/0x730 [zfs]
[<ffffffffa07079c4>] zap_cursor_retrieve+0x214/0x310 [zfs]
[<ffffffffa06de159>] spa_add_feature_stats+0x119/0x290 [zfs]
[<ffffffffa06e6062>] spa_get_stats+0x112/0x2d0 [zfs]
[<ffffffffa0716a99>] zfs_ioc_pool_stats+0x39/0x90 [zfs]
[<ffffffffa0719536>] zfsdev_ioctl+0x486/0x500 [zfs]
[<ffffffff811c2f25>] do_vfs_ioctl+0x2e5/0x4c0
[<ffffffff811c31a1>] SyS_ioctl+0xa1/0xc0
[<ffffffff815f2119>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
@behlendorf
Copy link
Contributor Author

This is related to #2878.

@nedbass
Copy link
Contributor

nedbass commented Feb 13, 2015

Turns out zpool clear on a suspended pool can get stuck down in spa_get_errlog_size() as well. This was after I forced nearly all metadata out of the cache with echo 1 > /sys/module/zfs/parameters/zfs_arc_meta_limit ; echo 3 > /proc/sys/vm/drop_caches.

$  bass6@3082_vm ~ /dev/pts/2 Thu Feb 12 17:27:15  >
ps -ef | grep clear
root     13045  8803  0 17:26 pts/0    00:00:00 sudo ./cmd/zpool/zpool clear tank
root     13046 13045  0 17:26 pts/0    00:00:00 /home/bass6/zfs/cmd/zpool/.libs/lt-zpool clear tank
$  bass6@3082_vm ~ /dev/pts/2 Thu Feb 12 17:27:19  >
sudo cat /proc/13046/stack
[<ffffffffa050cf2a>] spa_get_errlog_size+0x4a/0x1e0 [zfs]
[<ffffffffa0507f9e>] spa_get_stats+0xbe/0x2f0 [zfs]
[<ffffffffa0556e61>] zfs_ioc_pool_stats+0x31/0x70 [zfs]
[<ffffffffa055930b>] zfsdev_ioctl+0x51b/0x5f0 [zfs]
[<ffffffff8119f362>] vfs_ioctl+0x22/0xa0
[<ffffffff8119f984>] do_vfs_ioctl+0x84/0x5e0
[<ffffffff8119ff61>] sys_ioctl+0x81/0xa0
[<ffffffff8100b0b2>] system_call_fastpath+0x16/
$  bass6@3082_vm ~/zfs /dev/pts/2 Thu Feb 12 17:29:09 (b_3082) >
gdb ./module/zfs/zfs.ko
l(GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/bass6/zfs/module/zfs/zfs.ko...done.
(gdb) l *(spa_get_errlog_size+0x4a)
0x88f5a is in spa_get_errlog_size (/home/bass6/zfs/module/zfs/../../module/zfs/spa_errlog.c:143).
138     spa_get_errlog_size(spa_t *spa)
139     {
140             uint64_t total = 0, count;
141
142             mutex_enter(&spa->spa_errlog_lock);
143             if (spa->spa_errlog_scrub != 0 &&
144                 zap_count(spa->spa_meta_objset, spa->spa_errlog_scrub,
145                 &count) == 0)
146                     total += count;
147

@behlendorf
Copy link
Contributor Author

It looks like it must be down zap_count()->zap_lockdir()->dmu_buf_hold()->dbuf_read() similar to the features case.

nedbass added a commit to nedbass/zfs that referenced this issue Feb 26, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
The needed information is already cached in memory. Trying to read the
ZAPs from disk means that zpool clear would hang if the pool is
suspended and recovery would require a reboot.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
nedbass added a commit to nedbass/zfs that referenced this issue Feb 26, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
The needed information is already cached in memory. Trying to read the
ZAPs from disk means that zpool clear would hang if the pool is
suspended and recovery would require a reboot.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
nedbass added a commit to nedbass/zfs that referenced this issue Feb 27, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
The needed information is already cached in memory. Trying to read the
ZAPs from disk means that zpool clear would hang if the pool is
suspended and recovery would require a reboot.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
nedbass added a commit to nedbass/zfs that referenced this issue Mar 2, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. We need to
ensure the list of features enabled on the pool stays resident in
memory.

There already exists an nvlist for this purpose called
ZPOOL_CONFIG_ENABLED_FEAT in spa->spa_load_info. The features_for_write
features aren't added to the list, however, in the final import phase.
This patch updates spa_load_impl() so the features_for_read and
features_for_write features are both added to ZPOOL_CONFIG_ENABLED_FEAT
in each import phase. The full list of pool features and their initial
reference counts are now available for use in spa_add_feature_stats().

In spa_add_feature_stats() we iterate over the spa_feature_table[]
rather than the ZAP objects read from disk. We operate on
ZPOOL_CONFIG_ENABLED_FEAT in-place and refresh the reference counts for
the supported features from the values cached in the spa_t. This is
sufficient for 'zpool get all' to display correct values for the
unsupported@ and feature@ properties. For the unsupported features, if
the reference count is 0 is shows 'inactive' otherwise it shows
'read-only'. In either case, the unsupported reference counts could not
have changed after import, so it is safe to rely on the initial values.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
nedbass added a commit to nedbass/zfs that referenced this issue Mar 3, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. We need to
ensure the list of features enabled on the pool stays resident in
memory.

There already exists an nvlist for this purpose called
ZPOOL_CONFIG_ENABLED_FEAT in spa->spa_load_info. The features_for_write
features aren't added to the list, however, in the final import phase.
This patch updates spa_load_impl() so the features_for_read and
features_for_write features are both added to ZPOOL_CONFIG_ENABLED_FEAT
in each import phase. The full list of pool features and their initial
reference counts are now available for use in spa_add_feature_stats().

In spa_add_feature_stats() we iterate over the spa_feature_table[]
rather than the ZAP objects read from disk. We operate on
ZPOOL_CONFIG_ENABLED_FEAT in-place and refresh the reference counts for
the supported features from the values cached in the spa_t. This is
sufficient for 'zpool get all' to display correct values for the
unsupported@ and feature@ properties. For the unsupported features, if
the reference count is 0 is shows 'inactive' otherwise it shows
'read-only'. In either case, the unsupported reference counts could not
have changed after import, so it is safe to rely on the initial values.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Mar 4, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. We need to
ensure the list of features enabled on the pool stays resident in
memory.

There already exists an nvlist for this purpose called
ZPOOL_CONFIG_ENABLED_FEAT in spa->spa_load_info. The features_for_write
features aren't added to the list, however, in the final import phase.
This patch updates spa_load_impl() so the features_for_read and
features_for_write features are both added to ZPOOL_CONFIG_ENABLED_FEAT
in each import phase. The full list of pool features and their initial
reference counts are now available for use in spa_add_feature_stats().

In spa_add_feature_stats() we iterate over the spa_feature_table[]
rather than the ZAP objects read from disk. We operate on
ZPOOL_CONFIG_ENABLED_FEAT in-place and refresh the reference counts for
the supported features from the values cached in the spa_t. This is
sufficient for 'zpool get all' to display correct values for the
unsupported@ and feature@ properties. For the unsupported features, if
the reference count is 0 is shows 'inactive' otherwise it shows
'read-only'. In either case, the unsupported reference counts could not
have changed after import, so it is safe to rely on the initial values.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
nedbass added a commit to nedbass/zfs that referenced this issue Mar 5, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. To keep the
feature stats resident in memory, we hang a cached nvlist off of the
spa.  It is built up from disk the first time spa_add_feature_stats() is
called, and refreshed thereafter using the cached feature reference
counts. spa_add_feature_stats() gets called at pool import time so we
can be sure the cached nvlist will be available if the pool is later
suspended.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
nedbass added a commit to nedbass/zfs that referenced this issue Mar 5, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. To keep the
feature stats resident in memory, we hang a cached nvlist off of the
spa.  It is built up from disk the first time spa_add_feature_stats() is
called, and refreshed thereafter using the cached feature reference
counts. spa_add_feature_stats() gets called at pool import time so we
can be sure the cached nvlist will be available if the pool is later
suspended.

Fixes openzfs#3082

Signed-off-by: Ned Bass <[email protected]>
DeHackEd pushed a commit to DeHackEd/zfs that referenced this issue Apr 4, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. To keep the
feature stats resident in memory, we hang a cached nvlist off of the
spa.  It is built up from disk the first time spa_add_feature_stats() is
called, and refreshed thereafter using the cached feature reference
counts. spa_add_feature_stats() gets called at pool import time so we
can be sure the cached nvlist will be available if the pool is later
suspended.

Signed-off-by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#3082
DeHackEd pushed a commit to DeHackEd/zfs that referenced this issue Apr 5, 2015
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. To keep the
feature stats resident in memory, we hang a cached nvlist off of the
spa.  It is built up from disk the first time spa_add_feature_stats() is
called, and refreshed thereafter using the cached feature reference
counts. spa_add_feature_stats() gets called at pool import time so we
can be sure the cached nvlist will be available if the pool is later
suspended.

Signed-off-by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#3082
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants