OpenZFS 8166 - zpool scrub thinks it repaired offline device
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed-by: loli10K <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Matthew Ahrens <[email protected]>

If we do a scrub while a leaf device is offline (via "zpool offline"),
we will inadvertently clear the DTL (dirty time log) of the offline
device, even though it is still damaged.  When the device comes back
online, we will incompletely resilver it, thinking that the scrub
repaired blocks written before the scrub was started.  The incomplete
resilver can lead to data loss if there is a subsequent failure of a
different leaf device.

The fix is to never clear the DTL of offline devices.  Note that if a
device is onlined while a scrub is in progress, the scrub will be
restarted.

The problem can be worked around by running "zpool scrub" after
"zpool online".

OpenZFS-issue: https://www.illumos.org/issues/8166
OpenZFS-commit: openzfs/openzfs#372
Closes #5806 
Closes #6103
ahrens authored and behlendorf committed May 10, 2017
1 parent f486f58 commit 335b251
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions module/zfs/vdev.c
@@ -1868,6 +1868,9 @@ vdev_dtl_should_excise(vdev_t *vd)
 	ASSERT0(scn->scn_phys.scn_errors);
 	ASSERT0(vd->vdev_children);
 
+	if (vd->vdev_state < VDEV_STATE_DEGRADED)
+		return (B_FALSE);
+
 	if (vd->vdev_resilver_txg == 0 ||
 	    range_tree_space(vd->vdev_dtl[DTL_MISSING]) == 0)
 		return (B_TRUE);
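
The effect of the new guard can be modeled outside the kernel. The following is a minimal userland sketch, not the ZFS implementation: `sketch_state_t`, `sketch_leaf_t`, and `sketch_dtl_should_excise()` are hypothetical names, the struct stands in for only the `vdev_t` fields that the hunk above reads, and the enum is assumed to follow the same ordering as `vdev_state_t`, in which OFFLINE sorts below DEGRADED. The checks that follow the hunk in the real function are not shown in this diff, so the sketch simply returns false in their place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed to mirror the ordering of vdev_state_t: every state that
 * sorts below DEGRADED (including OFFLINE) fails the new check.
 */
typedef enum {
	SKETCH_STATE_UNKNOWN = 0,
	SKETCH_STATE_CLOSED,
	SKETCH_STATE_OFFLINE,
	SKETCH_STATE_REMOVED,
	SKETCH_STATE_CANT_OPEN,
	SKETCH_STATE_FAULTED,
	SKETCH_STATE_DEGRADED,
	SKETCH_STATE_HEALTHY
} sketch_state_t;

/* Hypothetical stand-in for the vdev_t fields read by the hunk. */
typedef struct {
	sketch_state_t state;
	uint64_t resilver_txg;		/* 0 means no resilver recorded */
	uint64_t dtl_missing_space;	/* models range_tree_space(DTL_MISSING) */
} sketch_leaf_t;

/* Returns true when the leaf's DTL may be cleared after a scan. */
static bool
sketch_dtl_should_excise(const sketch_leaf_t *leaf)
{
	assert(leaf != NULL);

	/* The fix: a leaf in a state below DEGRADED keeps its DTL. */
	if (leaf->state < SKETCH_STATE_DEGRADED)
		return (false);

	/* Pre-existing condition shown as context in the diff. */
	if (leaf->resilver_txg == 0 || leaf->dtl_missing_space == 0)
		return (true);

	/* Later checks in the real function are elided from the diff. */
	return (false);
}
```

Under these assumptions, a leaf in `SKETCH_STATE_OFFLINE` with `resilver_txg == 0` keeps its DTL, whereas without the first check the same leaf would have taken the `true` branch, which is the behavior the commit message describes.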