Mistake adding log device as single-drive vdev seems unrecoverable #6907
Well, since Solaris ZFS ensures that the …
If you're saying that is the documented behaviour of ZFS on Linux, then this is actually a bug, since the exact commands I showed above demonstrate a situation in which that check is not working.
@tesujimath So, presuming for the purpose of discussion that at some point you did run zpool add without -f and it added the device, could you please share what ZoL version was on the machine at the time? (The contents of "zpool history | grep zpool" would probably also be useful, but also might have been absorbed in the morass of other commands often run on a pool in the intervening period.)
@kpande please believe me. I'm not making this up. Here's my recent history:
My goal was to rename the log device from its raw device name to its vdev-id alias. @rincebrain The information you request is in the original issue comment. Or did I misunderstand you?
@tesujimath One of the main things that is unusual here is that, for non-{log,cache} vdevs, ZFS orders them by when they were added to the pool, so the status output you're seeing suggests you added 5 of the raidz vdevs after the fateful zpool add command. That's why I was asking for confirmation that you are running the same ZoL version now that you were when you added the device. The fact that it's showing up this way... is fascinating.
@rincebrain Yep, upgraded earlier today from ZFS 0.6.5.9 to 0.7.3, and 0.7.3 is definitely what was running when I erroneously added H35 to my pool. I rebooted twice after the upgrade before doing this, so there's no chance of an old zfs kernel module hanging around. I'm using zfs-kmod, BTW. Fascination is not my dominant emotion just now ...
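(For anyone following along who wants to verify the same thing on their own system, here is a hedged sketch of how one might confirm which zfs module is actually loaded; these are generic Linux/ZoL commands, not ones taken from this report.)

```sh
cat /sys/module/zfs/version      # version string of the currently loaded zfs module
modinfo zfs | grep -iw version   # version of the on-disk module that modprobe would load
dmesg | grep -i "zfs:"           # the banner logged when the module was loaded
```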
@kpande Is it relevant that my zpool version hasn't been upgraded since 0.6.5.9, so the new zpool features in 0.7.3 have not been activated in the zpool?
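(As a side note, a quick hedged sketch of how to inspect a pool's feature-flag state; "tank" is a placeholder pool name, not the pool from this report.)

```sh
zpool get all tank | grep feature@   # shows each feature as disabled, enabled, or active
zpool upgrade                        # with no arguments, lists pools whose features are not all enabled
```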
@tesujimath Occupational hazard, my day job is primarily "huh, why did it do that", even when it's a five-alarm fire. So, a few remarks, before I go on:
The fact that you made the pool on 0.6.5.9 but did the remove and add on the pool after upgrading to 0.7.3 is useful for reproduction, but we won't know if it's relevant to why this happened until after we get it reproducing somewhere else.
@rincebrain Thanks for those ideas. I am mulling it over. Actually, the zpool was created several years ago, on ZFS 0.6.0-rc14 I think, and then zpool upgraded over the years. Current feature flags are these:
@kpande I managed to reproduce the problem on a test server, with a brand-new zpool. Here is the sequence that exhibits the nasty behaviour: essentially, we create a zpool with a log device, extend the pool, then remove the log device and add it as a standalone vdev (without -f).
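A minimal sketch of that sequence, using file-backed vdevs and placeholder paths rather than the original hardware (the exact commands and device names from the report were not captured in this extract, so this is a reconstruction of the described steps, not a copy of them):

```sh
# Scratch files standing in for disks (assumption: file vdevs trigger the same check as real disks).
truncate -s 1G /var/tmp/d1 /var/tmp/d2 /var/tmp/d3 /var/tmp/d4 /var/tmp/d5 /var/tmp/d6 /var/tmp/slog

# 1. Create a pool with one raidz1 vdev and a separate log device.
zpool create testpool raidz1 /var/tmp/d1 /var/tmp/d2 /var/tmp/d3 log /var/tmp/slog

# 2. Extend the pool with a second raidz1 vdev (per the later comments, this step is needed for the bug to appear).
zpool add testpool raidz1 /var/tmp/d4 /var/tmp/d5 /var/tmp/d6

# 3. Remove the log device, then re-add the same file WITHOUT the "log" keyword and without -f.
zpool remove testpool /var/tmp/slog
zpool add testpool /var/tmp/slog   # expected: refusal due to mismatched replication; observed on 0.7.3: it succeeds

zpool status testpool              # the file now shows up as a single-disk top-level data vdev
```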
And just to be clear, here's my version info:
For reference, you can run the test locally from the ZoL source tree.
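One way to run an individual test out of a built source tree is via the test-suite wrapper script (a hedged sketch; the exact flags and test paths can vary between releases):

```sh
# From the top of the ZoL checkout, after ./configure && make:
./scripts/zfs-tests.sh -t tests/functional/cli_root/zpool_add/zpool_add_010_pos
```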
Here's an excerpt from the log.
I'm reopening this issue until this is understood.
I can confirm that doing the following reproduces this on 0.7.3, though I was confused to discover very strange failures when trying to run even the existing zfs-tests/.../zpool_add/ tests on my vanilla CentOS 7 VM.
The additional vdev after the log appears necessary, as doing this without it complains as expected on 0.7.3.
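For contrast, a sketch of the control case under the same file-vdev assumptions as the earlier sketch: if the second raidz vdev is never added, the add is refused as expected, which fits the eventual fix's explanation that a leftover hole in the pool configuration is what defeats the check.

```sh
truncate -s 1G /var/tmp/e1 /var/tmp/e2 /var/tmp/e3 /var/tmp/elog
zpool create testpool2 raidz1 /var/tmp/e1 /var/tmp/e2 /var/tmp/e3 log /var/tmp/elog

zpool remove testpool2 /var/tmp/elog
zpool add testpool2 /var/tmp/elog   # refused without -f: mismatched replication level (raidz pool, plain disk vdev)
```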
Who knew, mixing git master's zfs-tests with 0.7.3 doesn't work well. Shocking, I know. I haven't opened a PR because I haven't done the linting and cleanup yet, but you can find a test in https://github.com/rincebrain/zfs/tree/6907_test
When the pool configuration contains a hole due to a previous device removal ignore this top level vdev. Failure to do so will result in the current configuration being assessed to have a non-uniform replication level and the expected warning will be disabled. The zpool_add_010_pos test case was extended to cover this scenario.
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#6907
Proposed fix in #6911 with a shamelessly stolen version of @rincebrain's test case.
When the pool configuration contains a hole due to a previous device removal ignore this top level vdev. Failure to do so will result in the current configuration being assessed to have a non-uniform replication level and the expected warning will be disabled. The zpool_add_010_pos test case was extended to cover this scenario.
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6907
Closes #6911
I mistyped the zpool command to add a log device, so it got added as a single-disk vdev alongside all my raidz1 vdevs. Each of those is around 5 TB, and the wannabe log device is an 8 GB ZeusRAM drive (i.e. tiny).
I now can't remove the device, and my zpool is in a really vulnerable state. And it has no log device.
This is a busy production fileserver, and since there is 56TB of data in this zpool, copying the data to another fileserver so I can destroy and re-create the zpool is very unattractive. I see #3371 would be a solution. Alternatively, perhaps I could install another OpenZFS implementation (such as Delphix), and use that to recover my zpool, then revert back to ZFS on Linux. Or use some development branch of ZFS on Linux for the recovery, if such a thing exists.
Any suggestions here on what would be possible will be gratefully received.
System information
Describe the problem you're observing
I mistakenly added a single drive to a zpool as a vdev rather than a log device. Now I can't remove it.
Describe how to reproduce the problem
[Edit: See later comment for reproducible sequence]
I mistyped the command to add a log device to my pool. Instead of this:
I typed this:
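The actual command lines were not preserved in this extract; as a hedged illustration (the pool name "tank" is a placeholder, and H35 is the vdev-id alias mentioned above), the two forms would look roughly like this:

```sh
# Intended: add the device as a separate intent log (SLOG).
zpool add tank log H35

# Mistyped: the "log" keyword was omitted, so the device became a new
# single-disk top-level data vdev alongside the raidz1 vdevs.
zpool add tank H35
```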
Trying to remove it gives me this error:
Here's what my zpool looks like now: