DAOS-11736 test: CR Pass 2 - Dangling pool map test #12517
1. Create a pool.
2. Stop servers.
3. Manually remove /mnt/daos0/<pool_uuid>/vos-0 from rank 0 node.
4. Enable and start the checker.
5. Query the checker and verify that the issue was fixed, i.e., the current status is COMPLETED.
6. Disable the checker.
7. Verify that the pool has one fewer target.

Skip-unit-tests: true
Skip-fault-injection-test: true
Test-tag: test_dangling_pool_map
Required-githooks: true
Signed-off-by: Makito Kano <[email protected]>
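Step 5 of the description amounts to polling the checker until it reports COMPLETED. A minimal sketch of that wait loop, assuming the status can be obtained through some callable; `wait_for_checker_status`, `query_status`, and the "RUNNING" value used in the usage note are hypothetical names for illustration, not part of the DAOS test framework:

```python
import time
from typing import Callable


def wait_for_checker_status(query_status: Callable[[], str],
                            expected: str = "COMPLETED",
                            attempts: int = 8,
                            interval: float = 5.0,
                            sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll query_status() until it returns `expected` or attempts run out.

    query_status is a hypothetical callable that returns the checker's
    current status string; in a real test it would wrap the checker query.
    """
    for attempt in range(attempts):
        if query_status() == expected:
            return True
        # Only wait if another attempt will follow.
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

In a unit-style check, the query can be replaced with a fake, for example `states = iter(["RUNNING", "COMPLETED"])` and `wait_for_checker_status(lambda: next(states), sleep=lambda _: None)`, which avoids real delays.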
Bug-tracker data: |
LGTM. No errors found by checkpatch.
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: test_dangling_pool_map Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: test_dangling_pool_map Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-12517/4/testReport/
Checker isn’t detecting/fixing the issue.
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-12517/5/testReport/
… shard Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
# 3. Manually remove /mnt/daos0/<pool_uuid>/vos-0 from rank 0 node.
rank_0_host = NodeSet(self.server_managers[0].get_host(0))
rm_cmd = f"sudo rm /mnt/daos0/{pool.uuid.lower()}/vos-0"
Instead of hardcoding /mnt/daos0, we should do something like:
self.server_managers[0].get_config_value("scm_mount")
Done. Thanks.
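For reference, the suggestion above could look roughly like the following. This is only a sketch: `build_vos_rm_command` is a hypothetical helper, and it assumes the mount point is passed in after being read with `self.server_managers[0].get_config_value("scm_mount")` as proposed in the review comment.

```python
import posixpath


def build_vos_rm_command(scm_mount: str, pool_uuid: str, target: int = 0) -> str:
    """Build the command that removes one VOS file under the SCM mount.

    scm_mount is expected to come from the server configuration rather
    than being hardcoded; target selects the vos-<N> file to remove.
    """
    # posixpath keeps forward slashes regardless of where the test runs.
    vos_path = posixpath.join(scm_mount, pool_uuid.lower(), f"vos-{target}")
    return f"sudo rm {vos_path}"
```

With this shape the test would call something like `build_vos_rm_command(scm_mount, pool.uuid)`, so a non-default mount point in the server config does not break the fault injection step.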
total_targets = query_out["response"]["total_targets"]
active_targets = query_out["response"]["active_targets"]
diff = total_targets - active_targets
if diff != 1:
    expected_targets = total_targets - 1
    msg = (f"Unexpected number of active targets! Expected = {expected_targets}; "
           f"Actual = {active_targets}")
    errors.append(msg)
minor - recommend making this check more clear
Suggested change:
total_targets = query_out["response"]["total_targets"]
active_targets = query_out["response"]["active_targets"]
expected_targets = total_targets - 1
if active_targets != expected_targets:
    msg = (f"Unexpected number of active targets! Expected = {expected_targets}; "
           f"Actual = {active_targets}")
    errors.append(msg)
Done
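The agreed form of the check can be exercised on its own with a plain dictionary. A minimal sketch, assuming the pool query output has the `response` / `total_targets` / `active_targets` layout used in the diff; `check_active_targets` and its `lost_targets` parameter are hypothetical and only wrap the logic for illustration:

```python
from typing import Any, Dict, List


def check_active_targets(query_out: Dict[str, Any], lost_targets: int = 1) -> List[str]:
    """Return error messages if the active target count is not as expected.

    The expected count is the total number of targets minus the number of
    targets removed by the fault injection (one in this test).
    """
    errors: List[str] = []
    total_targets = query_out["response"]["total_targets"]
    active_targets = query_out["response"]["active_targets"]
    expected_targets = total_targets - lost_targets
    if active_targets != expected_targets:
        msg = (f"Unexpected number of active targets! Expected = {expected_targets}; "
               f"Actual = {active_targets}")
        errors.append(msg)
    return errors
```

Comparing `active_targets` directly against `expected_targets` keeps the condition and the error message in terms of the same two values, which is the readability point raised in the review.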
Add checker restart logic in case checker doesn't detect fault. Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Test-repeat: 5 Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
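The restart logic mentioned in this commit is not shown in the conversation. One possible shape, purely as a sketch: `run_checker_with_restart` and its three callables are hypothetical stand-ins for whatever the test uses to start the checker, inspect the result, and stop it again.

```python
from typing import Callable


def run_checker_with_restart(start_checker: Callable[[], None],
                             fault_detected: Callable[[], bool],
                             stop_checker: Callable[[], None],
                             max_starts: int = 3) -> int:
    """Start the checker, restarting it if the fault was not detected.

    Returns the number of starts that were needed. Raises RuntimeError
    if the fault is still undetected after max_starts attempts.
    """
    for start_count in range(1, max_starts + 1):
        start_checker()
        if fault_detected():
            return start_count
        # Fault not seen on this run: stop so the next loop can restart.
        stop_checker()
    raise RuntimeError(f"Checker did not detect the fault after {max_starts} starts")
```

Bounding the number of restarts keeps a genuinely broken checker from hanging the test, and the returned count can be logged to see how often a restart was actually needed across repeated runs.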
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-12517/7/execution/node/772/log
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Test-repeat: 5 Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Test-repeat: 5 Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
LGTM. No errors found by checkpatch.
LGTM. No errors found by checkpatch.
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Test-repeat: 6 Required-githooks: true
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-12517/12/testReport/
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Test-repeat: 7 Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
Skip-unit-tests: true Skip-fault-injection-test: true Test-tag: pool_membership Test-repeat: 7 Required-githooks: true Signed-off-by: Makito Kano <[email protected]>
LGTM. No errors found by checkpatch.
Skip-unit-tests: true
Skip-fault-injection-test: true
Test-tag: test_dangling_pool_map
Required-githooks: true
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: