-
Notifications
You must be signed in to change notification settings - Fork 392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved location claim algorithm #1008
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add claim function and target_n_val configuration into cuttlefish. Move modules around to try and make it more obvious where functions used in membership reside.
Stops fake wants being required to prompt claim on a location change. Allow for a claim module to implement a sort_members_for_choose(Ring, Members, Owners) -> SortedMembers function, to pre-sort the members being passed into claim_rebalance. Add further specs.
Location claim improved so that it will try to balance the spread of vnodes, if it reaches the end and is still unbalanced. Also uses a stronger meets_target_n to fallback to sequential_claim more reliably on incorrect spacing (of vnodes across nodes, but not yet across locations).
Extended potential for test by determining what nodes are safe to both add and remove from loops, rather than simply relying on sequential order.
Resolves some test issues. Also try harder to do safe removals when looking back in the ring (as opposed to the removal other additions)
* Support two transition changes Where the second transition is triggered by a change of location. Need to ensure that the location_changed status update is recognised in the ring * Unrelated fix to remove reference to gen_fsm_compat * unrelated fix to get rid of deprecation warning * Testing claim * The new claim algorithm as purely functional algorithm * add new entry for version 5 claiming * Refactor v5 into v4 * move impossible config test to place where we actually may enter recursion * Documentation The algorithm should be described in more detail in a markup document * Allow configurations with zero nodes in location for better placement update This works better when a location is emptied on nodes. Less transfers. * Keep order of nodes to avoid back translate issue --------- Co-authored-by: Martin Sumner <[email protected]>
Otherwise, if all vnodes have become excluded there is no escape from this condition (unless other traffic can trigger the creation of vnodes). This is helpful in situations where transfers are performed on standby clusters with no other traffic. This commit also logs a timing of the claim each time it is called.
To allow for the riak_core_ring_manager and riak_core_claimant to remember v4 solutions, they are shared via the state of the claimant
* Add an extra test to show owners may stay the same if only location changes This is not always the case, but holds when there is a solution in the first place * Fix type error that dialyzer could not find * Introduce necessary conditions to fallback to version 2 * update tests * Check whether it is worth to use brute force * make historic values the norm * Introduce nvals map type * Take nr nodes into account when checking for brute force cond. * Property to evaluate skipping brute force strategy * QuickCheck property starts with choosing ring size. * Remove fallback for necessary conditions * Filter tests to get away with flakyness * In order to re-run tests suite, remove strict precondition * Check in test suite * Replace claim_suite.suite by larger claim.suite * Sometimes it is worth to brute_force to a zero node violation * better documentation binring algorithm * Run property with a sufficient condition
This reverts commit 6527033.
…ak_core into mas-i1001-claimrefactor
The cache of v4 solutions is required by the ring_manager and the claimant - so specifically update both of these processes each time. Otherwise cache will be missed when the ring_manager calls to riak_core_claimant:ring_changed/2. There is a fix to the last gasp check before writing the ring file. prune_write_notify_ring function does not care if the write of a ring errors - so error for this function rather than crashing the ring manager. This otherwise causes instability in location tests.
* Remove pre-computed test suite * cleanup * Make claim_eqc tests not fail on weird configs by supplying a diverse list of options
The leave call on a failure of simple_transfer will call sequential_claim - which is part of the v2 claim family. Now we have v4, if this is configured it should call v4 as it does handle leaves.
…ak_core into mas-i1001-claimrefactor
Property temporarily changed to consider only failures with locations
* Use application env to read target_n_val * Re-introduce v2 in riak_core_claim_eqc * Move precondition to postcondition to also test less perfect cases * Fixed error in transfer_node usage * cleanup not using remove_from_cluster/5.
In this case - full_rebalance should be enabled
Add recommendation to use full_rebalance_on_leave for locations
@systream - as you did the original location awareness, I suspect you may have an interest in testing this. |
ThomasArts
approved these changes
Jun 12, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
* Change doc Change introduction to refer to vnodes and nodes. Removes the recommendation not to vary location_n_val and node_n_val. * Update comments on having different target n_vals * Further doc updates * Update docs/claim-version4.md * Update docs/claim-version4.md * Update docs/claim-version4.md Co-authored-by: Thomas Arts <[email protected]> * Update docs/claim-version4.md Co-authored-by: Thomas Arts <[email protected]> --------- Co-authored-by: Thomas Arts <[email protected]>
martinsumner
added a commit
that referenced
this pull request
Jun 19, 2023
* Configuration and module alignment Add claim function and target_n_val configuration into cuttlefish. Move modules around to try and make it more obvious where functions used in membership reside. * Remove deprecated v1 claim/wants * Update framework for Claim Stops fake wants being required to prompt claim on a location change. Allow for a claim module to implement a sort_members_for_choose(Ring, Members, Owners) -> SortedMembers function, to pre-sort the members being passed into claim_rebalance. Add further specs. * Add choose_claim_v4 * Location claim improvements Location claim improved so that it will try to balance the spread of vnodes, if it reaches the end and is still unbalanced. Also uses a stronger meets_target_n to fallback to sequential_claim more reliably on incorrect spacing (of vnodes across nodes, but not yet across locations). * Refinements to claim_v4 Extended potential for test by determining what nodes are safe to both add and remove from loops, rather than simply relying on sequential order. * Correction following removal of log * Count remove list not Excess to determine loops * Better order of initial striping Resolves some test issues. Also try harder to do safe removals when looking back in the ring (as opposed to the removal other additions) * A new claim algorithm (#1003) * Support two transition changes Where the second transition is triggered by a change of location. Need to ensure that the location_changed status update is recognised in the ring * Unrelated fix to remove reference to gen_fsm_compat * unrelated fix to get rid of deprecation warning * Testing claim * The new claim algorithm as purely functional algorithm * add new entry for version 5 claiming * Refactor v5 into v4 * move impossible config test to place where we actually may enter recursion * Documentation The algorithm should be described in more detail in a markup document * Allow configurations with zero nodes in location for better placement update This works better when a location is emptied on nodes. Less transfers. * Keep order of nodes to avoid back translate issue --------- Co-authored-by: Martin Sumner <[email protected]> * Claim API requires export of 2-arity choose function as well as 3-arity * Always return indices Otherwise, if all vnodes have become excluded there is no escape from this condition (unless other traffic can trigger the creation of vnodes). This is helpful in situations where transfers are performed on standby clusters with no other traffic. This commit also logs a timing of the claim each time it is called. * Calculate swaps only once * Remember v4 solutions via claimant To allow for the riak_core_ring_manager and riak_core_claimant to remember v4 solutions, they are shared via the state of the claimant * Long-running tests * Adding an extra test (#1004) * Add an extra test to show owners may stay the same if only location changes This is not always the case, but holds when there is a solution in the first place * Fix type error that dialyzer could not find * Introduce necessary conditions to fallback to version 2 * update tests * Check whether it is worth to use brute force * make historic values the norm * Introduce nvals map type * Take nr nodes into account when checking for brute force cond. * Property to evaluate skipping brute force strategy * QuickCheck property starts with choosing ring size. * Remove fallback for necessary conditions * Filter tests to get away with flakyness * In order to re-run tests suite, remove strict precondition * Check in test suite * Replace claim_suite.suite by larger claim.suite * Sometimes it is worth to brute_force to a zero node violation * better documentation binring algorithm * Run property with a sufficient condition * Revert "Long-running tests" This reverts commit 6527033. * Test adjustments * Test adjustments * Add support for configured target_location_n_val * Memoise fixes The cache of v4 solutions is required by the ring_manager and the claimant - so specifically update both of these processes each time. Otherwise cache will be missed when the ring_manager calls to riak_core_claimant:ring_changed/2. There is a fix to the last gasp check before writing the ring file. prune_write_notify_ring function does not care if the write of a ring errors - so error for this function rather than crashing the ring manager. This otherwise causes instability in location tests. * Example configurations saves in source format (#1005) * Remove pre-computed test suite * cleanup * Make claim_eqc tests not fail on weird configs by supplying a diverse list of options * Add full-rebalance for v4 The leave call on a failure of simple_transfer will call sequential_claim - which is part of the v2 claim family. Now we have v4, if this is configured it should call v4 as it does handle leaves. * Support leave in prop_claim * Update - to use correct claim_fun on leave Property temporarily changed to consider only failures with locations * Use application env to read target_n_val (#1007) * Use application env to read target_n_val * Re-introduce v2 in riak_core_claim_eqc * Move precondition to postcondition to also test less perfect cases * Fixed error in transfer_node usage * cleanup not using remove_from_cluster/5. * Add warning if simple_transfer produces unbalanced result In this case - full_rebalance should be enabled * only_swap/swap_only confusion Add recommendation to use full_rebalance_on_leave for locations * Update riak_core_claim_eqc.erl * Mas i1001 docupdate (#1009) * Change doc Change introduction to refer to vnodes and nodes. Removes the recommendation not to vary location_n_val and node_n_val. * Update comments on having different target n_vals * Further doc updates * Update docs/claim-version4.md * Update docs/claim-version4.md * Update docs/claim-version4.md Co-authored-by: Thomas Arts <[email protected]> * Update docs/claim-version4.md Co-authored-by: Thomas Arts <[email protected]> --------- Co-authored-by: Thomas Arts <[email protected]> --------- Co-authored-by: Thomas Arts <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds a new v4 claim algorithm, which significantly improves the efficiency of claim with location awareness.
The algorithm has a brute force approach, whereby it will in the plan stage potentially do an exhaustive search of the possible swaps and moves, in order to provide a ring which meets the target n_val for both nodes and locations. Firstly by trying to minimise the number of changes, and if that is unsuccessful it will attempt to find a valid solution without concern for the plan complexity.
To enable the feature, set via riak.conf:
choose_claim_fun
->choose_claim_v4
full_rebalance_onleave
->on
target_n_val
->4
(default)target_location_n_val ->
3` (default)riak_test testa are added and updated in basho/riak_test#1375.
There is a write-up in docs/claim-version4.md.
In some cases cluster plans may take much longer than at present, potentially > 60s.
The new claim algorithm was developed by Ulf Norell and Thomas Arts.