From 64b23dc48696de33adbdbe245ef569f449796a3a Mon Sep 17 00:00:00 2001
From: Matthew Ahrens
Date: Fri, 28 Apr 2017 09:34:57 -0700
Subject: [PATCH] OpenZFS 8166 - zpool scrub thinks it repaired offline device

Reviewed by: George Wilson <george.wilson@delphix.com>

If we do a scrub while a leaf device is offline (via "zpool offline"),
we will inadvertently clear the DTL (dirty time log) of the offline
device, even though it is still damaged.  When the device comes back
online, we will incompletely resilver it, thinking that the scrub
repaired blocks written before the scrub was started.  The incomplete
resilver can lead to data loss if there is a subsequent failure of a
different leaf device.

The fix is to never clear the DTL of offline devices.

Note that if a device is onlined while a scrub is in progress, the scrub
will be restarted.

The problem can be worked around by running "zpool scrub" after
"zpool online".

Closes #5806
---
 module/zfs/vdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/module/zfs/vdev.c b/module/zfs/vdev.c
index caf92899d0c8..fdbe02351b86 100644
--- a/module/zfs/vdev.c
+++ b/module/zfs/vdev.c
@@ -1858,6 +1858,9 @@ vdev_dtl_should_excise(vdev_t *vd)
 	ASSERT0(scn->scn_phys.scn_errors);
 	ASSERT0(vd->vdev_children);
 
+	if (vd->vdev_state < VDEV_STATE_DEGRADED)
+		return (B_FALSE);
+
 	if (vd->vdev_resilver_txg == 0 ||
 	    range_tree_space(vd->vdev_dtl[DTL_MISSING]) == 0)
 		return (B_TRUE);
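
Reviewer note: the effect of the new guard can be sketched outside the
kernel with a small stand-alone model. This is not the code in vdev.c.
The struct model_vdev_t and the function model_dtl_should_excise() are
hypothetical names introduced only for illustration, the enum is assumed
to follow the ordering of vdev_state_t in the ZFS headers, and the part
of the real function that lies below the hunk is not visible in this
patch, so the model simply answers "do not excise" there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed to mirror the ordering of vdev_state_t: a lower value means a
 * less healthy device, so "state < VDEV_STATE_DEGRADED" covers offline,
 * removed, unopenable and faulted leaves.
 */
typedef enum {
	VDEV_STATE_UNKNOWN = 0,
	VDEV_STATE_CLOSED,
	VDEV_STATE_OFFLINE,
	VDEV_STATE_REMOVED,
	VDEV_STATE_CANT_OPEN,
	VDEV_STATE_FAULTED,
	VDEV_STATE_DEGRADED,
	VDEV_STATE_HEALTHY
} vdev_state_t;

/* Hypothetical, simplified stand-in for a leaf vdev_t. */
typedef struct {
	vdev_state_t	state;			/* models vd->vdev_state */
	uint64_t	resilver_txg;		/* models vd->vdev_resilver_txg */
	uint64_t	dtl_missing_space;	/* models range_tree_space(DTL_MISSING) */
} model_vdev_t;

/*
 * Model of the decision shown in the hunk: may a finished scan clear
 * (excise) this leaf's DTL?
 */
static bool
model_dtl_should_excise(const model_vdev_t *vd)
{
	assert(vd != NULL);

	/* The guard added by this patch: never excise an unusable leaf. */
	if (vd->state < VDEV_STATE_DEGRADED)
		return (false);

	/* Pre-existing check visible in the hunk's context lines. */
	if (vd->resilver_txg == 0 || vd->dtl_missing_space == 0)
		return (true);

	/*
	 * The real function continues past the end of the hunk. That logic
	 * is not shown in this patch, so the model conservatively refuses.
	 */
	return (false);
}
```

With this model, a leaf in VDEV_STATE_OFFLINE that still has missing
DTL space is never excised, while a healthy leaf with resilver_txg == 0
still is. Without the guard, the offline leaf would take the
resilver_txg == 0 branch and be excised, which is the behavior the
commit message describes as the bug.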