DAOS-14021 pool: enable md dup op detection (#13536)
With this change, pool and container metadata duplicate operation detection logic is enabled in daos_engine handler code. Previous patches established the handler structure but avoided saving RPC metadata operation history in the "svc_ops" KVS in a pool's rdb. This patch enables it mainly by arranging for the svc_ops KVS entries to be ordered by client-provided hybrid logical clock (HLC) time. This is accomplished by adding a new RDB KVS class, RDB_OID_CLASS_LEXICAL, used by the svc_ops KVS, that maps to vos object type DAOS_OT_MULTI_LEXICAL, and by having the engine handlers encode the received client keys such that the btree memcmp-based direct key comparison results in time-ordered entries.

Also with this change, the svc_ops KVS within rdb enforces a maximum capacity, calculated from a default "maximum age" goal (5 minutes, expressed in seconds) and an assumed maximum reasonable metadata RPC rate of 4K ops/sec. When processing every metadata RPC request, the number of entries in svc_ops is checked, and the oldest entry is checked against the maximum age goal. If the KVS already holds the maximum number of entries, or the oldest entry exceeds the maximum age, then the oldest entry is removed before potentially inserting another entry due to the execution of the current handler. Future changes may remove several of the oldest entries in the KVS at once if they all exceed the maximum age goal (even if the total entry capacity has not been reached).

In the suite/daos_test pool_op_retry() and co_op_retry() tests, and in the corresponding engine handling code, two new fault injections are added:

  DAOS_MD_OP_PASS_NOREPLY_NEWLDR
  DAOS_MD_OP_FAIL_NOREPLY_NEWLDR

These faults cause the engine handler to either fully execute (successfully) or simulate a (non-retryable) handling error (-DER_MISC), then force the client to retry the RPC by sending a (retryable, -DER_TIMEDOUT) error, and also step down as the pool service leader. All of this is done so that when the client RPC retry is sent, it is handled by a new pool service leader engine, and so that the correct duplicate operation detection behavior is still seen even in the presence of this leadership change.

Additionally in this PR, logic and metadata ops (rdb) KVS creation/interaction are refactored and centralized in the pool engine code, with internal-server API calls made from container code.

Signed-off-by: Kenneth Cain <[email protected]>
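
Illustration (not part of the patch): the two mechanisms described above, time-ordered keys under memcmp and the capacity/age eviction check, can be sketched in a few lines of C. Everything named sketch_* or SKETCH_* is hypothetical. The key layout (8-byte big-endian HLC followed by a 16-byte client identifier), the reading of "4K" as 4096, and the use of whole seconds for timestamps are assumptions made for this example only; the real encoding and limits live in the engine code changed by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout for illustration: HLC first, then a client id. */
#define SKETCH_CLI_ID_LEN 16
#define SKETCH_KEY_LEN    (8 + SKETCH_CLI_ID_LEN)

/* Figures from the commit message; "4K" is assumed to mean 4096 here. */
#define SKETCH_MAX_AGE_SEC     (5 * 60)
#define SKETCH_MAX_OPS_PER_SEC 4096
#define SKETCH_MAX_ENTRIES \
	((uint64_t)SKETCH_MAX_AGE_SEC * SKETCH_MAX_OPS_PER_SEC)

/*
 * Store the HLC most-significant byte first so that a plain memcmp() of two
 * encoded keys orders them by time. A little-endian copy of the integer
 * would not have this property.
 */
static void
sketch_encode_key(uint64_t hlc, const uint8_t cli_id[SKETCH_CLI_ID_LEN],
		  uint8_t out[SKETCH_KEY_LEN])
{
	int i;

	assert(cli_id != NULL && out != NULL);
	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(hlc >> (56 - 8 * i));
	memcpy(out + 8, cli_id, SKETCH_CLI_ID_LEN);
}

/*
 * Decide whether the oldest entry should be removed before a new one may be
 * inserted: either the KVS is at capacity, or the oldest entry is older than
 * the maximum age goal. Timestamps are in seconds in this sketch.
 */
static bool
sketch_should_evict_oldest(uint64_t nr_entries, uint64_t oldest_sec,
			   uint64_t now_sec)
{
	if (nr_entries == 0)
		return false;
	if (nr_entries >= SKETCH_MAX_ENTRIES)
		return true;
	return now_sec > oldest_sec &&
	       (now_sec - oldest_sec) > SKETCH_MAX_AGE_SEC;
}
```

Under these assumptions the capacity works out to 300 s * 4096 ops/s = 1,228,800 entries. Because the oldest entry sorts first in a time-ordered KVS, a check like sketch_should_evict_oldest() only needs to look at the first key, which is why the ordering and the capacity limit are introduced together.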
Showing 25 changed files with 845 additions and 383 deletions.