DAOS-16175 container: fix a case for cont_iv_hdl_fetch
During reintegration we hit a case where the IC_CONT_CAPA IV cache is valid
locally but the container open handle is invalid (not in dt_cont_hdl_hash).
In this case, invalidate the local IV cache first and retry, so that in-flight
UPDATEs do not fail when obj_ioc_init() -> ds_cont_find_hdl() ->
cont_iv_hdl_fetch() fails:
DBUG src/engine/server_iv.c:409 ivc_on_fetch() FETCH: Key [1:7] entry 0x7fb31063b550 valid yes
DBUG src/engine/server_iv.c:1042 iv_op_internal() class_id 7 opc 1 rc 0
ERR  src/object/srv_obj.c:2174 obj_ioc_begin_lite()
Failed to initialize object I/O context.: DER_NO_HDL(-1002): 'Invalid handle'

Signed-off-by: Xuezhao Liu <[email protected]>
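
The failure mode described in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not DAOS code: `toy_node`, `toy_iv_update`, `toy_iv_fetch` and `toy_hdl_fetch` are hypothetical names, and the two booleans stand in for the local IC_CONT_CAPA cache entry and for membership in dt_cont_hdl_hash. It shows why a fetch against a locally valid cache never runs the update callback, and how a single invalidate-and-retry pass lets the handle be recreated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-node state; not a DAOS structure. */
struct toy_node {
	bool iv_cache_valid; /* models the local IC_CONT_CAPA IV entry */
	bool hdl_in_hash;    /* models presence in dt_cont_hdl_hash */
	bool leader_has_hdl; /* whether the leader still knows the handle */
	int  update_calls;   /* how many times the update callback ran */
};

/* Models the update callback: refreshes the cache and, if the leader
 * knows the handle, (re)creates the local open handle. */
static void toy_iv_update(struct toy_node *n)
{
	n->iv_cache_valid = true;
	if (n->leader_has_hdl)
		n->hdl_in_hash = true;
	n->update_calls++;
}

/* Models an IV fetch: a locally valid entry is served from the cache and
 * the update callback is skipped, unless the caller invalidates first. */
static void toy_iv_fetch(struct toy_node *n, bool invalidate_current)
{
	if (invalidate_current)
		n->iv_cache_valid = false;
	if (!n->iv_cache_valid)
		toy_iv_update(n);
}

/* Returns 0 if the handle is found, -1 if it is still missing after one
 * invalidate-and-retry pass. */
static int toy_hdl_fetch(struct toy_node *n)
{
	bool invalidate_current = false;

	assert(n != NULL);
	if (n->hdl_in_hash)
		return 0;
retry:
	toy_iv_fetch(n, invalidate_current);
	if (!n->hdl_in_hash) {
		if (!invalidate_current) {
			/* Cache was valid but the handle is gone: retry once. */
			invalidate_current = true;
			goto retry;
		}
		return -1;
	}
	return 0;
}
```

In this model, a node that starts with `iv_cache_valid = true` and `hdl_in_hash = false` succeeds on the second pass as long as the leader still has the handle. The retry is bounded to one attempt, so a handle that is genuinely gone is still reported as an error.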
liuxuezhao committed Oct 25, 2024
1 parent ab744a4 commit ee2e8b5
Showing 2 changed files with 19 additions and 2 deletions.
14 changes: 14 additions & 0 deletions src/container/container_iv.c
@@ -1018,6 +1018,7 @@ cont_iv_hdl_fetch(uuid_t cont_hdl_uuid, uuid_t pool_uuid,
D_DEBUG(DB_TRACE, "Can not find "DF_UUID" hdl\n",
DP_UUID(cont_hdl_uuid));

+invalidate_retry:
/* Fetch the capability from the leader. To avoid extra locks,
* all metadatas are maintained by xstream 0, so let's create
* an ULT on xstream 0 to let xstream 0 to handle capa fetch
@@ -1046,6 +1047,19 @@ cont_iv_hdl_fetch(uuid_t cont_hdl_uuid, uuid_t pool_uuid,
if (*cont_hdl == NULL) {
D_DEBUG(DB_TRACE, "Can not find "DF_UUID" hdl\n",
DP_UUID(cont_hdl_uuid));
+/* During reintegration, the IC_CONT_CAPA cache can be valid locally while
+ * the cont open handle is invalid (not in dt_cont_hdl_hash). In this case,
+ * invalidate the local IV cache first and retry, to avoid failing in-flight
+ * UPDATEs. (While the IV is locally valid, the IV fetch will not trigger
+ * the cont_iv_ent_update() callback.)
+ */
+if (!invalidate_current) {
+invalidate_current = true;
+ABT_eventual_free(&eventual);
+D_DEBUG(DB_TRACE, DF_UUID" invalidate_current and retry\n",
+DP_UUID(cont_hdl_uuid));
+goto invalidate_retry;
+}
D_GOTO(out_eventual, rc = -DER_NONEXIST);
}

7 changes: 5 additions & 2 deletions src/object/cli_obj.c
@@ -6310,7 +6310,9 @@ obj_ec_get_parity_or_alldata_shard(struct obj_auxi_args *obj_auxi, unsigned int
shard_idx = grp_start + i;
if (obj_shard_is_invalid(obj, shard_idx, DAOS_OBJ_RPC_ENUMERATE)) {
if (++fail_cnt > obj_ec_parity_tgt_nr(oca)) {
-D_ERROR(DF_OID" reach max failure "DF_RC"\n",
+D_ERROR(DF_CONT", obj "DF_OID" reach max failure "DF_RC"\n",
+DP_CONT(obj->cob_pool->dp_pool,
+obj->cob_co->dc_uuid),
DP_OID(obj->cob_md.omd_id), DP_RC(-DER_DATA_LOSS));
D_GOTO(out, shard = -DER_DATA_LOSS);
}
@@ -6457,7 +6459,8 @@ obj_list_shards_get(struct obj_auxi_args *obj_auxi, unsigned int map_ver,
}

if (rc < 0) {
-D_ERROR(DF_OID" Can not find shard grp %d: "DF_RC"\n",
+D_ERROR(DF_CONT", obj "DF_OID" Can not find shard grp %d: "DF_RC"\n",
+DP_CONT(obj->cob_pool->dp_pool, obj->cob_co->dc_uuid),
DP_OID(obj->cob_md.omd_id), grp_idx, DP_RC(rc));
D_GOTO(out, rc);
}