Panic when running 'zpool split' #5565

Closed
smaeul opened this issue Jan 6, 2017 · 7 comments
Labels
Type: Defect (incorrect behavior, e.g. crash, hang)

Comments


smaeul commented Jan 6, 2017

System information

Type Version/Name
Distribution Name Gentoo
Distribution Version stable
Linux Kernel 4.7.10-hardened
Architecture amd64
ZFS Version 0.6.5.8-r0-gentoo
SPL Version 0.6.5.8-r0-gentoo

Describe the problem you're observing

I received this panic when trying to run zpool split system blah:

zed[4221]: eid=7 class=config.sync pool=system
zed[4223]: eid=8 class=statechange
zed[4240]: eid=9 class=config.sync pool=blah
kernel: VERIFY(size != 0) failed
kernel: PANIC at range_tree.c:172:range_tree_add()
kernel: Showing stack for process 1299
kernel: CPU: 0 PID: 1299 Comm: txg_sync Tainted: P           O    4.7.10-hardened #1
kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97E-ITX/ac, BIOS P2.00 10/12/2015
kernel:  0000000000000000 ffffffff814b6a3f 0000000000000007 ffffffffa058c3d1
kernel:  ffffc900097b3ba8 ffffffffa02bb8e6 00ff880417cc1700 ffffffff00000028
kernel:  ffffc900097b3bb8 ffffc900097b3b58 7328594649524556 30203d2120657a69
kernel: Call Trace:
kernel:  [<ffffffff814b6a3f>] ? dump_stack+0x47/0x68
kernel:  [<ffffffffa058c3d1>] ? _fini+0x152ec/0x4661a [zfs]
kernel:  [<ffffffffa02bb8e6>] ? spl_panic+0xb6/0xe0 [spl]
kernel:  [<ffffffffa04c7f7b>] ? arc_buf_thaw+0x7b/0xb0 [zfs]
kernel:  [<ffffffffa04d39df>] ? dbuf_dirty+0x48f/0x880 [zfs]
kernel:  [<ffffffff819693d9>] ? mutex_lock+0x9/0x30
kernel:  [<ffffffffa0577e68>] ? _fini+0xd83/0x4661a [zfs]
kernel:  [<ffffffffa04d95a0>] ? dmu_buf_rele_array.part.4+0x30/0x50 [zfs]
kernel:  [<ffffffff811ec899>] ? kfree+0x29/0x170
kernel:  [<ffffffff811ebc15>] ? __slab_free+0x95/0x260
kernel:  [<ffffffffa04d2b31>] ? dbuf_read+0x5c1/0x790 [zfs]
kernel:  [<ffffffff819693d9>] ? mutex_lock+0x9/0x30
kernel:  [<ffffffffa04d39df>] ? dbuf_dirty+0x48f/0x880 [zfs]
kernel:  [<ffffffffa059b4ae>] ? _fini+0x243c9/0x4661a [zfs]
kernel:  [<ffffffffa050b989>] ? range_tree_add+0x269/0x290 [zfs]
kernel:  [<ffffffffa02b6ccf>] ? spl_kmem_zalloc+0x8f/0x160 [spl]
kernel:  [<ffffffffa02b6ccf>] ? spl_kmem_zalloc+0x8f/0x160 [spl]
kernel:  [<ffffffffa050b720>] ? range_tree_destroy+0x60/0x60 [zfs]
kernel:  [<ffffffffa050be2d>] ? range_tree_walk+0x2d/0x50 [zfs]
kernel:  [<ffffffffa05287e9>] ? vdev_dtl_sync+0xd9/0x3a0 [zfs]
kernel:  [<ffffffffa0528b3d>] ? vdev_sync+0x8d/0x110 [zfs]
kernel:  [<ffffffffa051488f>] ? spa_sync+0x3bf/0xae0 [zfs]
kernel:  [<ffffffff81118a62>] ? autoremove_wake_function+0x22/0x40
kernel:  [<ffffffffa0524d09>] ? txg_sync_thread+0x3a9/0x5e0 [zfs]
kernel:  [<ffffffffa0524960>] ? txg_quiesce_thread+0x380/0x380 [zfs]
kernel:  [<ffffffffa02b8dc7>] ? thread_generic_wrapper+0x67/0x80 [spl]
kernel:  [<ffffffffa02b8d60>] ? __thread_exit+0x10/0x10 [spl]
kernel:  [<ffffffff810fa1b8>] ? kthread+0xb8/0xd0
kernel:  [<ffffffff8196b72e>] ? ret_from_fork+0x1e/0x50
kernel:  [<ffffffff810fa100>] ? kthread_worker_fn+0x180/0x180

The zpool split command, any further zfs/zpool commands, and sync all hung in uninterruptible sleep (D state).

Describe how to reproduce the problem

This is the pool I attempted to split. It is a mirror of two dm-crypt volumes and is less than a year old. It was originally created with a single disk; the second disk was attached several hours later, and no further changes were made to the layout until now.

NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
system  3.62T  1.17T  2.46T         -    17%    32%  1.00x  ONLINE  -
  mirror  3.62T  1.17T  2.46T         -    17%    32%
    cryptb      -      -      -         -      -      -
    crypta      -      -      -         -      -      -

Include any warning/errors/backtraces from the system logs

See above.


smaeul commented Jan 6, 2017

After forcibly rebooting, zpool status shows the following:

  pool: system
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 2h44m with 0 errors on Thu Jan  5 18:46:17 2017
config:

        NAME        STATE     READ WRITE CKSUM
        system      DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            cryptb  ONLINE       0     0     0
            crypta  OFFLINE      0     0     0

errors: No known data errors

And after zpool online:

  pool: system
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub in progress since Fri Jan  6 06:51:56 2017
    28.3M scanned out of 1.17T at 2.02M/s, 168h0m to go
    0 repaired, 0.00% done
config:

        NAME        STATE     READ WRITE CKSUM
        system      DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            cryptb  ONLINE       0     0     0
            crypta  SPLIT        0     0     0  split into new pool

errors: No known data errors

behlendorf (Contributor) commented

@smaeul thanks for reporting this issue. It appears the failure was caused by a zero-length entry in the dirty time log (DTL) for the mirror when the split was requested. The DTL is used to track the set of transaction groups for which the vdev has less than perfect replication. You should be able to recover full redundancy in the pool by performing a zpool replace as described. Then I'd suggest scrubbing the pool for good measure before retrying the zpool split.
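
For illustration only, here is a minimal standalone C sketch of the kind of check that fired. It is not the actual range_tree.c code; segment_t and sketch_range_add() are hypothetical names invented for this example. The idea is that vdev_dtl_sync() walks the DTL and copies its segments into another tree through an add routine that rejects zero-length segments, so a single zero-length entry is enough to trip VERIFY(size != 0) in the txg_sync thread.

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A DTL segment: a run starting at 'start' and spanning 'size'. */
typedef struct segment {
    uint64_t start;
    uint64_t size;
} segment_t;

/* Stand-in for range_tree_add(): a zero-length segment is never valid. */
static void
sketch_range_add(segment_t *dst, size_t *ndst, uint64_t start, uint64_t size)
{
    assert(size != 0);  /* analogue of the VERIFY that panicked above */
    dst[(*ndst)++] = (segment_t){ start, size };
}

int
main(void)
{
    /* The mirror child's DTL, with a bogus zero-length entry. */
    segment_t dtl[] = { { 100, 50 }, { 200, 0 }, { 400, 25 } };
    segment_t copy[3];
    size_t ncopy = 0;

    /*
     * Stand-in for range_tree_walk() during vdev_dtl_sync(): this aborts
     * on the second segment, mirroring the kernel panic.
     */
    for (size_t i = 0; i < 3; i++)
        sketch_range_add(copy, &ncopy, dtl[i].start, dtl[i].size);

    printf("copied %zu segments\n", ncopy);
    return (0);
}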

rstrlcpy (Contributor) commented

A possible fix. We tested it internally, but it would be nice to get confirmation from a ZFS Jedi.

From an available crash dump I can see that the freed VDEV is accessed because it is still on a txg's DTL list of the VDEV being synced; it was put there by vdev_top_transfer(), which was called by vdev_remove_parent(), which in turn was called by vdev_split().

"detach" does this cleanup, but it seems the cleanup was never implemented for "split".

diff --git a/usr/src/uts/common/fs/zfs/spa.c b/usr/src/uts/common/fs/zfs/spa.c
index 2466662ac3..cd5d681d80 100644
--- a/usr/src/uts/common/fs/zfs/spa.c
+++ b/usr/src/uts/common/fs/zfs/spa.c
@@ -5736,6 +5736,18 @@ spa_vdev_split_mirror(spa_t *spa, char *newname, nvlist_t *config,
                dmu_tx_abort(tx);
        for (c = 0; c < children; c++) {
                if (vml[c] != NULL) {
+                       vdev_t *tvd = vml[c]->vdev_top;
+
+                       /*
+                        * Need to be sure the detachable VDEV is not
+                        * on any *other* txg's DTL list to prevent it
+                        * from being accessed after it's freed.
+                        */
+                       for (int t = 0; t < TXG_SIZE; t++) {
+                               (void) txg_list_remove_this(
+                                   &tvd->vdev_dtl_list, vml[c], t);
+                       }
+
                        vdev_split(vml[c]);
                        if (error == 0)
                                spa_history_log_internal(spa, "detach", tx,
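
To make the TXG_SIZE loop in the patch easier to follow, here is a simplified, self-contained C model. It is not the real ZFS txg_list implementation; fake_txg_list_t, fake_vdev_t, and fake_txg_list_remove_this() are names invented for this sketch. The point is that each of the TXG_SIZE in-flight transaction groups keeps its own list, so a vdev that is about to be split off and freed must be unlinked from every slot, which is what the added loop does with txg_list_remove_this().

#include <stddef.h>
#include <stdio.h>

#define TXG_SIZE 4  /* number of concurrently tracked txgs */

/* A toy vdev with per-slot list linkage. */
typedef struct fake_vdev {
    const char *name;
    struct fake_vdev *next[TXG_SIZE];
} fake_vdev_t;

/* One singly linked list of vdevs per in-flight txg slot. */
typedef struct fake_txg_list {
    fake_vdev_t *head[TXG_SIZE];
} fake_txg_list_t;

/* Unlink vd from the list for slot (txg & (TXG_SIZE - 1)), if present. */
static void
fake_txg_list_remove_this(fake_txg_list_t *tl, fake_vdev_t *vd, int txg)
{
    int slot = txg & (TXG_SIZE - 1);
    fake_vdev_t **pp = &tl->head[slot];

    while (*pp != NULL) {
        if (*pp == vd) {
            *pp = vd->next[slot];  /* unlink */
            vd->next[slot] = NULL;
            return;
        }
        pp = &(*pp)->next[slot];
    }
}

int
main(void)
{
    fake_txg_list_t dtl_list = { { NULL } };
    fake_vdev_t leaf = { "crypta", { NULL } };

    /* Pretend the vdev was queued for DTL syncing in slot 2. */
    dtl_list.head[2] = &leaf;

    /*
     * Mirrors the loop in the patch: remove the vdev from every slot
     * before it is freed, so txg_sync never walks into freed memory.
     */
    for (int t = 0; t < TXG_SIZE; t++)
        fake_txg_list_remove_this(&dtl_list, &leaf, t);

    printf("slot 2 head is now %p\n", (void *)dtl_list.head[2]);
    return (0);
}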

behlendorf (Contributor) commented

@ramzec thanks for digging into this. Your analysis and fix look correct to me; would you mind opening a PR with the proposed change?

rstrlcpy (Contributor) commented

ok. will do.

richardelling (Contributor) commented

Roman, I think I hit this years ago, but don’t have any remaining evidence. Nor do I recall filing a bug at Nexenta at the time. Do you believe it is reproducible in ZTS?


rstrlcpy commented Sep 1, 2018

We caught this error a month ago during a split of a system pool. We tried many times to reproduce it on different hosts, with no luck. We have only one host where it reproduces 100% of the time, so I'm not sure it can easily be reproduced in ZTS.
