
offline / online during attach fails to produce a safe mirror #784

Open
pgdh opened this issue Jan 22, 2021 · 0 comments


pgdh commented Jan 22, 2021

Try this ...

# dd if=/dev/urandom bs=1024k count=10240 of=$SOMEPATH/d1
# zpool create play $SOMEPATH/d1
# dd if=/dev/urandom bs=1024k count=8192 of=/Volumes/play/f1
# dd if=/dev/urandom bs=1024k count=10240 of=$SOMEPATH/d2
# zpool attach play $SOMEPATH/d1 $SOMEPATH/d2
(wait a few seconds)
# zpool offline play $SOMEPATH/d2
(wait a few seconds, confirm that resilver is still running with zpool status play)
# zpool online play $SOMEPATH/d2
(wait until resilver is finished, checking with zpool status play)
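The steps above can be consolidated into a script. This is only a sketch, assuming $SOMEPATH points at a scratch directory with roughly 20 GB free and that the pool mounts at /Volumes/play, as in the transcript; `wait_resilver_done` and `repro` are hypothetical helper names:

```shell
#!/bin/sh
# Sketch of the reproduction above. Assumes $SOMEPATH is a scratch
# directory with ~20 GB free and that the pool mounts at /Volumes/play.

# Poll until zpool status no longer reports a resilver in progress.
wait_resilver_done() {
    while zpool status "$1" | grep -q 'resilver in progress'; do
        sleep 2
    done
}

# Call repro to run the whole sequence (needs root; destroys any
# existing pool named "play").
repro() {
    dd if=/dev/urandom bs=1024k count=10240 of="$SOMEPATH/d1"
    zpool create play "$SOMEPATH/d1"
    dd if=/dev/urandom bs=1024k count=8192 of=/Volumes/play/f1
    dd if=/dev/urandom bs=1024k count=10240 of="$SOMEPATH/d2"
    zpool attach play "$SOMEPATH/d1" "$SOMEPATH/d2"
    sleep 5                       # let the resilver get under way
    zpool offline play "$SOMEPATH/d2"
    sleep 5                       # resilver should still be running
    zpool online play "$SOMEPATH/d2"
    wait_resilver_done play
    zpool status play
}
```

The sleeps stand in for the "wait a few seconds" steps; on fast disks the window may need tuning so the offline happens while the resilver is genuinely mid-flight.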

Here's one I made earlier ...

# zpool status play
  pool: play
 state: ONLINE
  scan: resilvered 1.90G in 0 days 00:00:20 with 0 errors on Fri Jan 22 17:40:44 2021
config:

	NAME                        STATE     READ WRITE CKSUM
	play                        ONLINE       0     0     0
	  mirror-0                  ONLINE       0     0     0
	    /Volumes/touch/tmp/d1   ONLINE       0     0     0
	    /Volumes/touch/tmp/d2   ONLINE       0     0     0

errors: No known data errors
#

All fine and dandy, right?

But then ...

# zpool scrub play
(wait for scrub to finish, again checking with zpool status play)
# zpool status play
  pool: play
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 6.10G in 0 days 00:00:34 with 0 errors on Fri Jan 22 17:42:30 2021
config:

	NAME                        STATE     READ WRITE CKSUM
	play                        ONLINE       0     0     0
	  mirror-0                  ONLINE       0     0     0
	    /Volumes/touch/tmp/d1   ONLINE       0     0     0
	    /Volumes/touch/tmp/d2   ONLINE       0     0 48.8K

errors: No known data errors
# zpool -V
zfs-1.9.4-0
zfs-kmod-1.9.4-0
# uname -a
Darwin Holistix-MBP.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
#

i.e. Catalina 10.15.7
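The scrub-and-inspect step can also be scripted as a regression check. Again a sketch: `check_pool` is a hypothetical helper, and the awk field positions are assumed from the zpool status layout shown above (device path in column 1, CKSUM count in column 5):

```shell
#!/bin/sh
# Scrub a pool and fail if any file vdev shows checksum errors afterwards.
# check_pool is a hypothetical helper; the awk fields assume the column
# layout of the zpool status output shown in the transcripts.
check_pool() {
    pool="$1"
    zpool scrub "$pool"
    # Wait for the scrub to finish.
    while zpool status "$pool" | grep -q 'scrub in progress'; do
        sleep 2
    done
    # Count device lines (paths in column 1) with a nonzero CKSUM column.
    bad=$(zpool status "$pool" \
        | awk '$1 ~ /^\// && $5 != "0" { n++ } END { print n+0 }')
    if [ "$bad" -ne 0 ]; then
        echo "FAIL: $bad device(s) with checksum errors after scrub" >&2
        return 1
    fi
    echo "OK: no checksum errors"
}
```

Against the macOS transcript above this would flag d2's 48.8K checksum errors; on the SmartOS and Linux boxes it would come back clean.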

This is not reproducible on SmartOS ...

# uname -a
SunOS ingleby 5.11 joyent_20201217T173522Z i86pc i386 i86pc
# 

or Linux (Proxmox) ...

# zpool -V
zfs-0.8.5-pve1
zfs-kmod-0.8.5-pve1
# uname -a
Linux annie 5.4.73-1-pve #1 SMP PVE 5.4.73-1 (Mon, 16 Nov 2020 10:52:16 +0100) x86_64 GNU/Linux
#

The above is a contrived case, but I started investigating when it happened for real as I was slurping data between a couple of Samsung T7 Touch drives (as part of a process of turning off the T7's native encryption).

Both Linux and SmartOS stop the resilver as soon as one drive is taken offline, and restart it from the beginning when the drive is brought back online. This is what OpenZFSonOSX needs to do.

Sometime soon, I will dip my toes into the OpenZFS 2.0 on macOS port. It may well be that this bug disappears there, in which case it's another win for the unified code.
