Skip to content

Commit

Permalink
Implement workaround for pools affected by openzfs#2094 erratum
Browse files Browse the repository at this point in the history
ZoL commit 1421c89 unintentionally
changed the disk format in a forward-compatible, but not backward
compatible way. This was accomplished by adding an entry to zbookmark_t,
which is included in a couple of on-disk structures. That lead to the
creation of pools with incorrect dsl_scan_phys_t objects that could only
be imported by versions of ZFSOnLinux containing that commit. Such pools
cannot be imported by other versions of ZFS. That includes past and
future versions of ZoL. The additional field has been removed by a prior
patch. This patch permits affected pools to be imported and repaired by
implementing a workaround for the erratum. When this workaround is
triggered, a kernel alert describing this will be printed to dmesg
asking the system administrator to run a scrub. A subsequent scrub will
replace the damaged dsl_scan_phys_t object, repairing the pool.

Pools affected by the damaged dsl_scan_phys_t can be detected prior to
an upgrade by running the following command as root:

zdb -dddd poolname 1 | grep -P '^\t\tscan = ' | sed -e 's;scan = ;;' | wc -w

Note that `poolname` must be replaced with the name of the pool you wish
to check. A value of 25 indicates the dsl_scan_phys_t has been damaged.
A value of 24 indicates that the dsl_scan_phys_t is normal. A value of 0
indicates that there has never been a scrub run on the pool.

The regression caused by the change to zbookmark_t never made it into a
tagged release or any Gentoo backports. Only those using HEAD were
affected. However, this patch has a few limitations. There is no way to
detect a damaged dsl_scan_phys_t object when it has occurred on 32-bit
systems due to integer rounding that wrote incorrect information, but
avoided the overflow on them. Correcting such issues requires triggering
a scrub.

In addition, bptree_entry_phys_t objects are also affected. These
objects only exist during an asynchronous destroy and automating repair
of damaged bptree_entry_phys_t objects is non-trivial. Any pools that
have been imported by an affected version of ZoL must have all
asynchronous destroy operations finish before export and subsequent
import by a version containing this commit.  Failure to do that will
prevent pool import. The presence of any background destroys on any
imported pools can be checked by running `zpool get freeing` as root.
This will display a non-zero value for any pool with an active
asynchronous destroy.

Lastly, it is expected that no user data has been lost as a result of
this erratum.

Closes openzfs#2094

Original-patch-by: Tim Chase <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
  • Loading branch information
ryao committed Feb 5, 2014
1 parent 93634e8 commit 1801c48
Showing 1 changed file with 32 additions and 1 deletion.
33 changes: 32 additions & 1 deletion module/zfs/dsl_scan.c
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,38 @@ dsl_scan_init(dsl_pool_t *dp, uint64_t txg)
err = zap_lookup(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT,
DMU_POOL_SCAN, sizeof (uint64_t), SCAN_PHYS_NUMINTS,
&scn->scn_phys);
if (err == ENOENT)
/*
* Workaround for zfsonlinux/zfs#2094 erratum
*/
if (err == EOVERFLOW) {
uint64_t zaptmp[SCAN_PHYS_NUMINTS + 1];
VERIFY(SCAN_PHYS_NUMINTS == 24);
VERIFY(offsetof(dsl_scan_phys_t, scn_flags)
== (23 * sizeof(uint64_t)));

err = zap_lookup(dp->dp_meta_objset,
DMU_POOL_DIRECTORY_OBJECT,
DMU_POOL_SCAN, sizeof (uint64_t),
SCAN_PHYS_NUMINTS + 1,
&zaptmp);
if (err == 0) {
uint64_t overflow = zaptmp[SCAN_PHYS_NUMINTS];

if (overflow >> 1)
return (EOVERFLOW);

bcopy(zaptmp, &scn->scn_phys,
SCAN_PHYS_NUMINTS * sizeof(uint64_t));
scn->scn_phys.scn_flags = overflow;

printk(KERN_ALERT

This comment has been minimized.

Copy link
@behlendorf

behlendorf Feb 6, 2014

This came up in IRC but avoid the printk. It's not defined in userspace so this completely breaks the build. Use cmn_err() or zfs_dbgmsg().

"ZFS: Detected ZoL issue #2094 "
"on pool \"%s\". "
"Please run a scrub.",
spa->spa_name);

This comment has been minimized.

Copy link
@behlendorf

behlendorf Feb 6, 2014

I suspect most people will never notice this warning in their logs and take action on it. It would probably be better if we just automatically took care of this by scheduling a scrub and logging a message why.

}

This comment has been minimized.

Copy link
@dweeezil

dweeezil Feb 5, 2014

I'd feel better if there was a "return (err);" in case the second lookup fails with some other weird error because if it does, we're going to go right into the "if (scn->...)" stuff below.

This comment has been minimized.

Copy link
@behlendorf

behlendorf Feb 6, 2014

Me too, it shouldn't happen. But if it does we should handle the error by returning.

}
else if (err == ENOENT)
return (0);
else if (err)
return (err);
Expand Down

0 comments on commit 1801c48

Please sign in to comment.