DAOS-XXXX cksum: retry once on checksum mismatch on update
Unlike fetch, we return DER_CSUM on update (turned into EIO by dfs) without
any retry. We should retry at least once in case it is a transient error.

The patch also prints more information about the actual checksum mismatch.

Signed-off-by: Johann Lombardi <[email protected]>
johannlombardi committed Aug 27, 2024
1 parent e3869b7 commit 4f48964
Showing 2 changed files with 15 additions and 3 deletions.
4 changes: 4 additions & 0 deletions src/common/checksum.c
@@ -280,6 +280,10 @@ daos_csummer_compare_csum_info(struct daos_csummer *obj,
 					  a->cs_len);
 	}

+	if (unlikely(!match))
+		D_ERROR("Checksum mismatch at index %d/%d "DF_CI" != "DF_CI"\n", i, a->cs_nr,
+			DP_CI(ci_idx2csum(a, i)), DP_CI(ci_idx2csum(b, i)));
+
 	return match;
 }
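
As a rough illustration of the extra diagnostics this hunk adds, the following is a minimal standalone sketch, not the DAOS implementation. `struct csum_array` and `csum_first_mismatch` are hypothetical names invented for this example, and the flat byte-buffer layout is an assumption; the real code works on DAOS checksum structures and logs through `D_ERROR`.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified checksum container: nr checksums of len bytes
 * each, stored back to back in buf. Not the real DAOS structure. */
struct csum_array {
	const uint8_t *buf;
	uint32_t       nr;
	uint32_t       len;
};

/* Return the index of the first mismatching checksum, -1 if all match,
 * or -2 if the two arrays do not have the same shape. */
static int
csum_first_mismatch(const struct csum_array *a, const struct csum_array *b)
{
	uint32_t i;

	if (a->nr != b->nr || a->len != b->len)
		return -2;

	for (i = 0; i < a->nr; i++) {
		const uint8_t *ca = a->buf + (size_t)i * a->len;
		const uint8_t *cb = b->buf + (size_t)i * b->len;

		if (memcmp(ca, cb, a->len) != 0) {
			/* Report which checksum differs, out of how many. */
			fprintf(stderr, "Checksum mismatch at index %u/%u\n",
				(unsigned)i, (unsigned)a->nr);
			return (int)i;
		}
	}
	return -1;
}
```

Returning the index rather than a boolean is a choice made only for this sketch, so that a caller can decide what to log.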

14 changes: 11 additions & 3 deletions src/object/cli_obj.c
@@ -4760,9 +4760,17 @@ obj_comp_cb(tse_task_t *task, void *data)
 				obj_auxi->tx_uncertain = 1;
 			else
 				obj_auxi->nvme_io_err = 1;
-		} else if (task->dt_result != -DER_NVME_IO) {
-			/* Don't retry update for CSUM & UNCERTAIN errors */
-			obj_auxi->io_retry = 0;
+		} else {
+			if (task->dt_result == -DER_CSUM) {
+				/** Retry once on checksum error on update */
+				if (!obj_auxi->csum_retry)
+					obj_auxi->csum_retry = 1;
+				else
+					obj_auxi->io_retry = 0;
+			} else if (task->dt_result != -DER_NVME_IO) {
+				/* Don't retry update for UNCERTAIN errors */
+				obj_auxi->io_retry = 0;
+			}
 		}
 	} else {
 		obj_auxi->io_retry = 0;
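
The retry-once decision in this hunk can be read in isolation as a small state update. The sketch below is a simplified model under stated assumptions, not the actual `obj_comp_cb` logic: `struct retry_state`, `on_update_error`, and the `ERR_*` constants are hypothetical stand-ins for `obj_auxi` and the `DER_*` codes, and the caller is assumed to have set `io_retry` to true before the function runs.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the DAOS error codes named in the diff. */
enum { ERR_CSUM = 1, ERR_NVME_IO = 2, ERR_UNCERTAIN = 3 };

/* Hypothetical per-operation state, loosely modelled on obj_auxi. */
struct retry_state {
	bool io_retry;   /* assumed to be set to true by the caller beforehand */
	bool csum_retry; /* true once the single checksum retry has been used */
};

/* Decide whether a failed update may still be retried. */
static void
on_update_error(struct retry_state *st, int err)
{
	if (err == ERR_CSUM) {
		if (!st->csum_retry)
			st->csum_retry = true;  /* first mismatch: allow one retry */
		else
			st->io_retry = false;   /* second mismatch: give up */
	} else if (err != ERR_NVME_IO) {
		st->io_retry = false;           /* e.g. uncertain errors: never retry */
	}
	/* ERR_NVME_IO leaves io_retry untouched in this branch. */
}
```

Calling `on_update_error` twice with `ERR_CSUM` on the same state leaves `io_retry` true after the first call and false after the second, which is the "retry once" behavior the commit message describes.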