DAOS-16329 chk: maintenance mode after checking pool with dryrun #14984
Ticket title is 'Maintenance mode after CR checking the pool with dryrun option'
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14984/1/testReport/
Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14984/1/testReport/
030f098 to 0ede219
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14984/2/testReport/
Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14984/2/testReport/
0ede219 to c07ad29
@@ -209,7 +209,7 @@ def subtract(self, val):
 "DER_NOTLEADER(-2008): 'Not service leader'")

 # Functions that are never reported as errors.
-IGNORED_FUNCTIONS = ('sched_watchdog_post', 'rdb_timerd')
+IGNORED_FUNCTIONS = ('sched_watchdog_post', 'rdb_timerd', 'cont_open')
Why are we ignoring cont_open here instead of fixing the errors? And what are the errors we're ignoring?
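To make the concern concrete, here is a minimal sketch of how an ignore list of this kind can hide errors. It is not the actual checker code: the record shape (function name, message), the helper `reported_errors`, and the sample function `pool_connect` and messages are all hypothetical; only the `IGNORED_FUNCTIONS` tuple comes from the diff.

```python
# Hypothetical sketch, not the real log checker from this repository.
# Only the IGNORED_FUNCTIONS value is taken from the diff above.
IGNORED_FUNCTIONS = ('sched_watchdog_post', 'rdb_timerd', 'cont_open')


def reported_errors(records, ignored=IGNORED_FUNCTIONS):
    """Return the error records whose originating function is not ignored.

    Each record is assumed to be a (function_name, message) tuple.
    """
    return [(func, msg) for func, msg in records if func not in ignored]


# Made-up sample records for illustration only.
records = [
    ('cont_open', 'hypothetical error while opening a container'),
    ('pool_connect', 'hypothetical error while connecting to a pool'),
    ('rdb_timerd', 'hypothetical timer message'),
]

visible = reported_errors(records)
hidden = [record for record in records if record not in visible]
```

With `cont_open` in the tuple, every error logged from that function lands in `hidden`, whether or not it is related to this change, which is why the question asks which specific errors are being suppressed.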
c00f9b9 to 679a20d
679a20d to ad50e21
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14984/8/execution/node/1508/log
ad50e21 to e1b57a1
After an unexpected system shutdown, users may want to check their critical data under a maintenance mode in which no user data can be modified, moved, or aggregated. This guarantees that no further damage caused by DAOS logic can happen during the check.

To support this, the current DAOS CR logic is enhanced so that, after a check with the --dryrun option, the pool can be opened as immutable, with the mechanisms that may modify or move data (such as rebuild and aggregation) disabled.

In this mode, a client that wants to connect to the pool must specify the read-only option. The same applies to opening a container in such a pool.

Test-tag: pr cat_recov
Allow-unstable-test: true
Signed-off-by: Fan Yong <[email protected]>
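The access rule described in this commit message can be modelled roughly as follows. This is a toy sketch under stated assumptions, not the DAOS client API or the code in this PR: the `Pool` class, the `immutable` flag, the `read_only` parameter, and the `AccessDenied` exception are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


class AccessDenied(Exception):
    """Raised when a read-write open is attempted on an immutable pool."""


@dataclass
class Pool:
    # Hypothetical flag: True once the pool was checked with the dry-run option.
    immutable: bool = False

    def _open(self, what: str, read_only: bool) -> str:
        # In maintenance mode only read-only opens are accepted.
        if self.immutable and not read_only:
            raise AccessDenied(f"{what}: pool is immutable, open it read-only")
        return "ro" if read_only else "rw"

    def connect(self, read_only: bool) -> str:
        return self._open("pool connect", read_only)

    def open_container(self, read_only: bool) -> str:
        # The same rule applies to containers inside the pool.
        return self._open("container open", read_only)

    def background_changes_allowed(self) -> bool:
        # Rebuild and aggregation are modelled as disabled while immutable.
        return not self.immutable


pool = Pool(immutable=True)
mode = pool.connect(read_only=True)
try:
    pool.open_container(read_only=False)
    rejected = False
except AccessDenied:
    rejected = True
```

The single `_open` helper reflects the statement that pool connect and container open follow the same read-only requirement in this mode.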
e1b57a1 to 7011933
I will reserve approval for those familiar with the core code. I think we will need to force-land this because of the existing NLT failure.
Ping @daos-stack/daos-gatekeeper, thanks!
I'm not familiar enough to merge myself without reviews from relevant @daos-stack/metadata-owners and @daos-stack/client-api-owners. Will leave this to another gatekeeper or let those owners approve first.
Ping @gnailzenh @NiuYawei @daos-stack/daos-gatekeeper, thanks!
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: