DAOS-14976 object: properly select collective punch leader for resend (#13602)

Before resending the collective punch RPC, check whether the original
leader shard is still valid. The object layout may have shrunk after
rebuild, so the old leader shard index can fall outside the current
layout. In that case, select a new shard as the collective punch leader.

Signed-off-by: Fan Yong <[email protected]>
Nasf-Fan authored Jan 23, 2024
1 parent e1d0cd7 commit 67723f1
Showing 1 changed file with 11 additions and 5 deletions.
16 changes: 11 additions & 5 deletions src/object/cli_obj.c
@@ -7060,7 +7060,12 @@ dc_obj_coll_punch(tse_task_t *task, struct dc_object *obj, struct dtx_epoch *epo
 	if (rc != 0)
 		goto out;

+	leader = coa->coa_dct_nr;
+
 	if (auxi->io_retry) {
+		if (unlikely(spa->pa_auxi.shard >= obj->cob_shards_nr))
+			goto new_leader;
+
 		/* Try to reuse the same leader. */
 		rc = obj_shard_open(obj, spa->pa_auxi.shard, map_ver, &shard);
 		if (rc == 0) {
@@ -7078,10 +7083,13 @@ dc_obj_coll_punch(tse_task_t *task, struct dc_object *obj, struct dtx_epoch *epo
 		/* Then change to new leader for retry. */
 	}

-	/* Randomly select a rank as the leader. */
-	leader = d_rand() % coa->coa_dct_nr;
-
 new_leader:
+	if (leader == coa->coa_dct_nr)
+		/* Randomly select a rank as the leader. */
+		leader = d_rand() % coa->coa_dct_nr;
+	else
+		leader = (leader + 1) % coa->coa_dct_nr;
+
 	dct = &coa->coa_dcts[leader];
 	len = dct->dct_bitmap_sz << 3;

@@ -7098,8 +7106,6 @@ dc_obj_coll_punch(tse_task_t *task, struct dc_object *obj, struct dtx_epoch *epo
 		}
 	}

-	/* Try another for leader. */
-	leader = (leader + 1) % coa->coa_dct_nr;
 	goto new_leader;

 gen_mbs:
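
The selection logic in this change can be sketched in isolation. The block below is a simplified illustration, not the DAOS implementation: `struct candidate`, `next_leader`, `prev_leader_in_layout`, and `select_leader` are hypothetical names, `rand()` stands in for `d_rand()`, and the `usable` flag stands in for the bitmap and shard checks the real function performs. It shows the sentinel idea from the patch: `leader` starts at the candidate count (an out-of-range value meaning "nothing tried yet"), so the first pass picks a random index and every later pass advances round-robin. The bounded loop that gives up after one full cycle is an addition of this sketch and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical: one entry per candidate target. `usable` stands in for the
 * bitmap/shard checks done by the real function. */
struct candidate {
	bool usable;
};

/* Advance the leader index. The sentinel value `nr` means "no leader tried
 * yet", so start from a random index; otherwise step to the next index. */
uint32_t
next_leader(uint32_t leader, uint32_t nr)
{
	assert(nr > 0 && leader <= nr);
	if (leader == nr)
		return (uint32_t)rand() % nr;
	return (leader + 1) % nr;
}

/* Mirrors the new bounds check on the retry path: the previous leader shard
 * is only worth reopening if its index still fits the current layout. */
bool
prev_leader_in_layout(uint32_t prev_shard, uint32_t shards_nr)
{
	return prev_shard < shards_nr;
}

/* Return the index of a usable candidate, or -1 after trying each index
 * once. The bounded loop is an assumption of this sketch. */
int
select_leader(const struct candidate *c, uint32_t nr)
{
	uint32_t leader = nr; /* sentinel: no candidate tried yet */
	uint32_t tries;

	for (tries = 0; tries < nr; tries++) {
		leader = next_leader(leader, nr);
		if (c[leader].usable)
			return (int)leader;
	}
	return -1;
}
```

Keeping the random pick and the round-robin step behind a single entry point is what lets the retry path jump straight to leader selection when the old shard index is out of range, without first needing a valid starting index.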