DAOS-9595 chk: consolidate pool membership #9865

Nasf-Fan · 2022-08-01T03:06:27Z

When DAOS check starts, all involved check engines report information
about their known pools, including the pool service replicas, pool
label and related storage allocation, to the check leader via reply.

After the pool list consolidation in pass_1, for each pool, the
check leader sends the related pool information to its pool service
leader via a new RPC, CHK_POOL_MBS.

On the check engine side, the pool service leader compares the pool
map with the information pushed from the check leader and handles
the following cases:

1. A target has some allocated storage but does not appear in the
   pool map. In this case, the associated space is deleted
   from the engine by default.
2. A target has some allocated storage and is marked as "DOWN" or
   "DOWNOUT" in the pool map. In this case, the administrator can
   decide to either remove the storage or leave it there.
3. A target is referenced in the pool map ("NEW", "UP", "UPIN" or
   "DRAIN"), but no storage is actually allocated on this engine.
   In this case, the entry for the target in the pool map is
   marked as "DOWN" (for an "UP", "UPIN" or "DRAIN" entry) or
   "DOWNOUT" (for a "NEW" entry).

Temporarily skip the code format check against src/chk/chk_internal.h
and src/mgmt/rpc.h to avoid spurious warning messages.

Signed-off-by: Fan Yong [email protected]
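
For readers skimming the description, the three cases can be sketched as a small decision function. This is only an illustration under assumed names: `tgt_state`, `mbs_action` and `mbs_classify` are hypothetical and are not the DAOS definitions or the code in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not the DAOS definitions. */
enum tgt_state {
	TGT_ABSENT, /* target does not appear in the pool map */
	TGT_NEW,
	TGT_UP,
	TGT_UPIN,
	TGT_DRAIN,
	TGT_DOWN,
	TGT_DOWNOUT
};

enum mbs_action {
	ACT_NONE,          /* pool map and storage agree */
	ACT_DISCARD_SPACE, /* case 1: orphan storage, deleted by default */
	ACT_ASK_ADMIN,     /* case 2: admin decides to remove or keep */
	ACT_MARK_DOWN,     /* case 3: UP/UPIN/DRAIN entry without storage */
	ACT_MARK_DOWNOUT   /* case 3: NEW entry without storage */
};

/* Map (pool map state, storage reported by the engine) to an action. */
enum mbs_action
mbs_classify(enum tgt_state state, bool has_storage)
{
	assert(state >= TGT_ABSENT && state <= TGT_DOWNOUT);

	if (has_storage) {
		switch (state) {
		case TGT_ABSENT:
			return ACT_DISCARD_SPACE;
		case TGT_DOWN:
		case TGT_DOWNOUT:
			return ACT_ASK_ADMIN;
		default:
			return ACT_NONE;
		}
	}

	switch (state) {
	case TGT_NEW:
		return ACT_MARK_DOWNOUT;
	case TGT_UP:
	case TGT_UPIN:
	case TGT_DRAIN:
		return ACT_MARK_DOWN;
	default:
		return ACT_NONE;
	}
}
```

Keeping the classification separate from the repair itself would let a dry run report the action without applying it.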

github-actions · 2022-08-01T03:06:47Z

Bug-tracker data:
Ticket title is 'pass2: scan & report allocated storage'
Status is 'In Review'
Labels: '531nth,triaged'
https://daosio.atlassian.net/browse/DAOS-9595

daosbuild1

LGTM. No errors found by checkpatch.

liw · 2022-08-05T02:04:53Z

src/pool/srv_pool.c

+}
+
+int
+ds_pool_svc_flush_map(struct ds_pool_svc *ds_svc, struct pool_map *map, uint32_t version)


[Question] Is it intentional that we do not schedule rebuild jobs when updating the pool map?

The pool map update is driven by the DAOS check instead of regular rebuild. The logic is roughly as follows:
For each target that is reported as part of the pool, compare it with the pool map and fix any related inconsistency; then handle the pool map entries that were not visited in that comparison. This process may yield because of RPCs or interaction with the admin. All related pool map fixes are made in DRAM in this step. After everything is done, it calls ds_pool_svc_flush_map() to persistently change the pool map and broadcast the changes to other pool shards.

liw · 2022-08-05T02:07:51Z

src/pool/srv_pool.c

+		/*
+		 * Have to resign to avoid handling future requests with stale pool map cache.
+		 * Continue to distribute the new pool map to other pool shards since the RDB
+		 * has already been updated.
+		 */
+		rdb_resign(svc->ps_rsvc.s_db, svc->ps_rsvc.s_term);


Failing to update the local map implies that the local secondary group for this pool may not have the latest membership. In this state, it's simpler to just give up, instead of continuing to take some chance. When a new leader steps up, it will distribute the new pool map, schedule rebuild jobs, (and in the future also do what replace_failed_replicas does).

Under DAOS check mode, if the PS leader itself or some other engine goes down while the pool membership is being checked, the DAOS check for this pool will be marked as aborted.
On the other hand, if the PS leader switches to another engine because of a short network split without any engine going down, that will not cause the DAOS check to fail, even if both the old PS leader and the new PS leader check the pool membership in parallel. It may cause some redundant checking, but not an error.
So here, the current PS leader must have the latest membership. It has already updated the pool map in RDB, but failed to refresh the in-DRAM pool map (attached to the pool instance). If we give up, the DAOS check for this pool will fail; but if we try to distribute the update to other engines, we may have a chance to continue the DAOS check.

The rest of the PS code assumes that the local secondary group always reflects the latest pool map. At least, I'd suggest skipping the following map_dist and reconfigure calls in this case to avoid adding burden to non-chk code. Also, do you really intend to return the nonzero rc to chk like this patch does?

OK. If there is nothing else to change, I would prefer to adjust it in the subsequent patch #9867, which will be rebased after this one lands.

As for the return value, I think it is better to return it to the caller. It is the caller's duty to determine the next step. For the CHK case, if "failout" is not specified, the check will go ahead.

What do you think?

OK, fine with me.

liw · 2022-08-05T02:21:43Z

src/pool/srv_pool_check.c

+		if (file == NULL) {
+			D_ERROR(DF_UUIDF": failed to allocate file name for shards status %d\n",
+				DP_UUID(uuid), i);
+			D_GOTO(out_path, rc = -DER_NOMEM);


Do we leak clue->pc_tgt_status on the error paths?

Right, I will refresh the patch to fix it.
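
The leak raised here is the usual hazard of a buffer allocated before a loop that can fail partway through. A minimal sketch of one way to release it on every error path follows; `struct clue`, `clue_load` and the `fail_at` parameter are hypothetical stand-ins and do not reproduce the actual fix in src/pool/srv_pool_check.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a structure that owns a status array. */
struct clue {
	int  nr;
	int *tgt_status; /* owned by the clue once loading succeeds */
};

/* fail_at simulates a failure while handling shard i; pass -1 for none. */
int
clue_load(struct clue *clue, int nr, int fail_at)
{
	int rc = 0;
	int i;

	assert(clue != NULL && nr > 0);

	clue->nr = 0;
	clue->tgt_status = calloc(nr, sizeof(*clue->tgt_status));
	if (clue->tgt_status == NULL)
		return -ENOMEM;

	for (i = 0; i < nr; i++) {
		if (i == fail_at) {
			rc = -ENOMEM;
			goto out;
		}
		clue->tgt_status[i] = 1;
	}
	clue->nr = nr;
out:
	if (rc != 0) {
		/* Single exit point: release the array on any failure. */
		free(clue->tgt_status);
		clue->tgt_status = NULL;
	}
	return rc;
}
```

Resetting the pointer to NULL after freeing also keeps a later cleanup of the same structure from freeing it twice.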

liw · 2022-08-05T02:31:48Z

src/mgmt/rpc.h

+	X(MGMT_TGT_SHARD_DESTROY,					\
+		0, &CQF_mgmt_tgt_shard_destroy,				\
+		ds_mgmt_hdlr_tgt_shard_destroy, NULL)


By the way, do we need to bump DAOS_MGMT_VERSION above for this change?

hmm, do we need to add support for proto query like we did with object and pool RPCs?

Currently, it seems unnecessary because we do not support interoperation among servers.

liw · 2022-08-05T02:39:04Z

src/chk/chk_leader.c

+
+	D_ALLOC_ARRAY(cpr->cpr_mbs, cpr->cpr_shard_nr);
+	if (cpr->cpr_mbs == NULL)
+		D_GOTO(out, rc = -DER_NOMEM);


On this error path we'll finalize an rsvc_client that has not been initialized. Could we avoid doing this even if it might work at the moment, please?

OK, I will refresh the patch.
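
One common way to avoid finalizing something that was never initialized is to order the cleanup labels so that each failure only unwinds what has already been set up. The sketch below illustrates that ordering with hypothetical types (`struct client`, `struct pool_rec`) and a counter that exists only for demonstration; it is not the rsvc_client API or the code in src/chk/chk_leader.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are DAOS types or functions. */
struct client {
	bool ready;
};

struct pool_rec {
	int          *mbs;
	struct client cli;
};

int client_fini_calls; /* counts finalize calls, for demonstration only */

void
client_init(struct client *c)
{
	c->ready = true;
}

void
client_fini(struct client *c)
{
	assert(c->ready); /* finalizing an uninitialized client is a bug */
	c->ready = false;
	client_fini_calls++;
}

int
pool_rec_setup(struct pool_rec *rec, size_t nr, bool fail_alloc, bool fail_later)
{
	int rc;

	assert(rec != NULL && nr > 0);

	rec->cli.ready = false;
	rec->mbs = fail_alloc ? NULL : calloc(nr, sizeof(*rec->mbs));
	if (rec->mbs == NULL)
		return -ENOMEM; /* client not initialized yet: nothing to undo */

	client_init(&rec->cli);
	if (fail_later) {
		rc = -EIO;
		goto out_client;
	}
	return 0;

out_client:
	/* Only reached after client_init(), so finalizing is safe here. */
	client_fini(&rec->cli);
	free(rec->mbs);
	rec->mbs = NULL;
	return rc;
}
```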

liw · 2022-08-05T02:47:35Z

src/chk/chk_engine.c

+		 * Flush the pool map to persistent storage (if not under dryrun mode)
+		 * and distribute the pool map to other pool shards.
+		 */
+		rc1 = ds_pool_svc_flush_map(svc, map, version);


[Question] The map object already includes the version; the version parameter is unnecessary, isn't it? On the other hand, do you think we should pass a version for ds_pool_svc_flush_map to check before writing the new map? For instance,

read the map: version x
change the map
if the old map is not version x
    try reading and changing again
write the new map

The map object already includes the version; the version parameter is unnecessary, isn't it?

Right, I will drop the redundant parameter.

On the other hand, do you think we should pass a version for ds_pool_svc_flush_map to check before writing the new map? For instance,

Under check mode, we disable node eviction and do not allow reintegration. That means the pool map is not refreshed unless the DAOS check logic itself updates it, so it is unnecessary to re-check the pool map version before ds_pool_svc_flush_map(). On the other hand, if ds_pool_svc_flush_map() is called under non-check mode, then the caller needs to do as you described.

daosbuild1 · 2022-08-05T16:04:04Z

src/common/pool_map.c

+pool_map_bump_version(struct pool_map *map)
+{
+	map->po_version++;
+	D_DEBUG(DB_TRACE, "Bumb pool map to version %u\n", map->po_version);


Suggested change

D_DEBUG(DB_TRACE, "Bumb pool map to version %u\n", map->po_version);

D_DEBUG(DB_TRACE, "Bump pool map to version %u\n", map->po_version);

Updated patch

daosbuild1 · 2022-08-08T06:55:15Z

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9865/14/execution/node/1046/log

jolivier23

question about proto query. If we have a new version for mgmt RPC, do we need to add query?

daosbuild1 · 2022-08-09T04:12:40Z

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9865/15/execution/node/320/log

daosbuild1 · 2022-08-09T04:14:05Z

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9865/15/execution/node/364/log

daosbuild1 · 2022-08-09T04:14:21Z

Test stage Build RPM on Leap 15 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9865/15/execution/node/323/log

daosbuild1 · 2022-08-09T04:18:45Z

Test stage Build on Leap 15 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9865/15/execution/node/342/log

daosbuild1 · 2022-08-09T04:43:22Z

Test stage Build on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9865/15/execution/node/328/log

Nasf-Fan · 2022-08-10T14:35:44Z

@liw @liuxuezhao @jolivier23, would you please help review the patch? Thanks!

jolivier23

Do we need a proto query for the MGMT RPC change?

Nasf-Fan · 2022-08-12T07:30:17Z

@mjmac, would you please help to land this one? Then I can rebase the subsequent patch. Thanks!

Nasf-Fan · 2022-08-12T07:31:02Z

Do we need a proto query for the MGMT RPC change?

Currently, it seems unnecessary because we do not support interoperation among servers.

Nasf-Fan requested a review from a team as a code owner August 1, 2022 03:06

daosbuild1 reviewed Aug 1, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from 009ab53 to 7f0aa42 August 1, 2022 03:33

Nasf-Fan requested a review from a team as a code owner August 1, 2022 03:33

daosbuild1 reviewed Aug 1, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from 7f0aa42 to 40b3869 August 1, 2022 03:47

daosbuild1 reviewed Aug 1, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from 40b3869 to f9b9b36 August 1, 2022 04:01

daosbuild1 reviewed Aug 1, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from f9b9b36 to d5fca4d August 1, 2022 04:11

daosbuild1 reviewed Aug 1, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from d5fca4d to 3594b5d August 1, 2022 04:23

daosbuild1 reviewed Aug 1, 2022

Nasf-Fan removed request for a team August 1, 2022 04:28

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from 3594b5d to b00ef00 August 1, 2022 04:36

daosbuild1 reviewed Aug 1, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from b00ef00 to 93b2031 August 2, 2022 02:20

Nasf-Fan requested review from liw and liuxuezhao August 2, 2022 02:21

Nasf-Fan mentioned this pull request Aug 2, 2022 in "DAOS-9595 chk: consolidate pool membership #9611" (Closed)

daosbuild1 reviewed Aug 2, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from 93b2031 to c906da6 August 3, 2022 15:52

daosbuild1 reviewed Aug 3, 2022

daosbuild1 reviewed Aug 4, 2022

liw reviewed Aug 5, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from c906da6 to cc381a0 August 5, 2022 16:02

daosbuild1 previously requested changes Aug 5, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from cc381a0 to 3d9c88a August 5, 2022 16:07

daosbuild1 reviewed Aug 5, 2022

Nasf-Fan requested review from liw and jolivier23 August 7, 2022 13:09

daosbuild1 reviewed Aug 7, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from 3d9c88a to 929dccd August 8, 2022 04:08

daosbuild1 reviewed Aug 8, 2022

jolivier23 reviewed Aug 8, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from 929dccd to b668567 August 9, 2022 04:03

daosbuild1 reviewed Aug 9, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from b668567 to dd5b38f August 9, 2022 10:08

daosbuild1 reviewed Aug 9, 2022

Nasf-Fan force-pushed the Nasf-Fan/DAOS-9595_1 branch from dd5b38f to d096386 August 10, 2022 01:25

daosbuild1 reviewed Aug 10, 2022

Nasf-Fan requested a review from jolivier23 August 10, 2022 14:35

jolivier23 approved these changes Aug 11, 2022

liw approved these changes Aug 12, 2022

Nasf-Fan requested a review from mjmac August 12, 2022 07:29

mjmac merged commit 2b6d5fd into feature/cat_recovery Aug 12, 2022

mjmac deleted the Nasf-Fan/DAOS-9595_1 branch August 12, 2022 15:46
