
scrub vacuously completing instantly on pool from OmniOS #5898

Closed
rincebrain opened this issue Mar 17, 2017 · 4 comments

Comments

@rincebrain
Contributor

Describe the problem you're observing

If you create a pool on OmniOS r151020, then import it on ZoL 0.6.5.9, the pool will appear to function fine and be read/write, but any attempts to scrub will do almost no IO before returning success.

Describe how to reproduce the problem

  • create a 4-disk raidz2 pool on OmniOS r151020
  • write some data to it
  • import on ZoL 0.6.5.9
  • attempt a scrub
  • watch as the scrub completes almost instantly, with barely any drive activity (a command-level sketch of these steps is below)
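
A minimal command-level sketch of these steps, assuming a four-disk raidz2 pool named testpool (device names, pool name, and data size here are placeholders, not the exact setup from the report):

# On OmniOS r151020: create the pool, write some data, then export it
# (illumos-style device names below are placeholders)
zpool create testpool raidz2 c2t1d0 c2t2d0 c2t3d0 c2t4d0
dd if=/dev/urandom of=/testpool/testfile bs=1M count=4096
zpool export testpool

# Move the disks to the ZoL 0.6.5.9 (Debian Jessie) machine, then:
zpool import testpool
zpool scrub testpool
zpool status testpool    # the scrub reports completion almost immediately
zpool iostat testpool 1  # and shows barely any read activity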

Conveniently, I have a set of VDIs from testing this that are suitable.
These are a raidz2 generated on OmniOS r151020:
https://www.dropbox.com/s/77vkxo1q0y7teeu/omnios%20pool%20issue.zip?dl=1
These are a raidz2 generated on Debian Jessie with ZoL 0.6.5.9:
https://www.dropbox.com/s/mo2jv20gnlcqkv0/omnios%20pool%20issue%20jessie.zip?dl=1

(This was originally reported by someone coming into IRC who had made a raidz2 pool on OmniOS, then had to move to a new machine that OmniOS wouldn't run on, so he moved to Linux and hit this issue. I was surprised to find it reproducible so readily.)

My reproduction is based on writing GBs of data to the pool, restarting to be sure none of the pool's pages can still be in cache, then running a scrub and watching the IO, or lack thereof. I included the mostly empty disk images just for convenience.

@loli10K
Contributor

loli10K commented Mar 17, 2017

any attempts to scrub will do almost no IO before returning success.

based on writing GBs of data to the pool

@rincebrain just so we are on the same page, can you reproduce this same issue on a pool filled with GBs of random data? Most of the data contained in the pool you uploaded is just zero-filled files; it's possible the hypervisor is being smart and feeding you zeros at much higher rates than you'd normally be able to get.
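
For reference, one quick way to fill the pool with incompressible data would be something like this (the path and size are just placeholders):

# /dev/urandom output can't be compressed away or zero-detected by the hypervisor
dd if=/dev/urandom of=/testpool/random.bin bs=1M count=2048
sync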

My limited testing shows that by capping the IOPS on the 4 VDIs I'm able to produce more predictable results.

With --total_iops_sec 3 on every virtual disk:

root@debian-8-zfs:~# zpool status testpool
  pool: testpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub in progress since Fri Mar 17 16:35:06 2017
	305M scanned out of 2.67G at 623K/s, 1h6m to go
	102K repaired, 11.16% done
config:

	NAME        STATE     READ WRITE CKSUM
	testpool    ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    sdb     ONLINE       0     0     0
	    sdc     ONLINE       0     0     0
	    sdd     ONLINE       0     0     0
	    sde     ONLINE       0     0 2.08K  (repairing)

errors: No known data errors
root@debian-8-zfs:~# zpool iostat testpool 1
              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
testpool    2.67G  13.2G     66      3  5.49M  70.4K
testpool    2.67G  13.2G     11      0  1.50M      0
testpool    2.67G  13.2G     11      0  1.50M      0
testpool    2.67G  13.2G     11      0  1.50M      0
testpool    2.67G  13.2G     11      0  1.37M      0
testpool    2.67G  13.2G     11      0   288K      0
^C
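
(For reference, one way to apply this kind of per-disk cap to a running libvirt/KVM guest is virsh blkdeviotune; the domain and device names below are placeholders, and this is only a sketch of the throttling approach, not necessarily the exact mechanism used above.)

# Cap each virtual disk of the guest at 3 total IOPS while it is running
for dev in vdb vdc vdd vde; do
    virsh blkdeviotune debian-8-zfs "$dev" --total-iops-sec 3 --live
done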

@tannerdsilva

@loli10K I'm experiencing the same issue with my BSD11-created pool. If this is at all helpful, my data is not mostly zeros. (My ticket is referenced above, #6038)

@rincebrain
Contributor Author

Drat, I thought I replied to this saying that even if my reproduction was broken, this was a legitimate problem someone was having that I was trying to reproduce.

Of course, here we are with someone on non-vacuous data having this issue. :)

@loli10K
Contributor

loli10K commented Apr 21, 2017

@rincebrain since this reproduction is broken, can we close this and keep the discussion in a single issue (#6038)?
