-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-9599 chk: handle inconsistent pool label #9524
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
The pool label is mainly used by MS to lookup pool UUID by label. The PS recorded pool label is some kind of the backup. So if the MS known pool label does not match the pool label recorded by PS, then trust MS as long as MS known pool label is not NULL. Related repair will not happen on check leader, instead, it will be done on the PS leader during CHK__CHECK_SCAN_PHASE__CSP_POOL_CLEANUP. If the admin wants to recover pool label on MS with the pool info on PS, then it will use DRPC_METHOD_CHK_REG_POOL drpc upcall. The patch also contains some other fixes: 1. Only handle dangling pool within the given pools check list. 2. If some engine failed to report some pool information, then related pool cannot be handled as no quorum case. 3. Some code cleanup for "uint32_t" and "int32_t" usage in chk. Signed-off-by: Fan Yong <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine to me, but it would be best to get another review from someone who is more familiar with the engine side of things. I am a better reviewer for control plane and protobuf updates.
The pool label is mainly used by MS to lookup pool UUID by label.
The PS recorded pool label is some kind of the backup. So if the
MS known pool label does not match the pool label recorded by PS,
then trust MS as long as MS known pool label is not NULL. Related
repair will not happen on check leader, instead, it will be done
on the PS leader during CHK__CHECK_SCAN_PHASE__CSP_POOL_CLEANUP.
If the admin wants to recover pool label on MS with the pool info
on PS, then it will use DRPC_METHOD_CHK_REG_POOL drpc upcall.
The patch also contains some other fixes:
Only handle dangling pool within the given pools check list.
Some code cleanup for "uint32_t" and "int32_t" usage in chk.
Signed-off-by: Fan Yong [email protected]
Required-githooks: true