Fix replica unable to trigger migration when it received CLUSTER SETSLOT in advance #981
Conversation
Logs (see the full logs at https://github.com/valkey-io/valkey/actions/runs/10658482333/job/29539601550):
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:
@@             Coverage Diff              @@
##           unstable     #981      +/-   ##
============================================
- Coverage     70.57%   70.46%     -0.12%
============================================
  Files           114      114
  Lines         61660    61664         +4
============================================
- Hits          43515    43449        -66
- Misses        18145    18215        +70
@enjoy-binbin, your proposal looks intuitive and reasonable, but I don't know if it matches the original intent of this flag. @madolson, you mentioned in today's meeting that, in a k8s deployment that lacks a control plane, the engine is expected to make node placement decisions. This makes sense at a high level, but I still don't think I understand the purpose of this flag. This flag is effective only in a scale-down scenario. So with or without this flag, we will end up with surplus k8s nodes in the end, which is likely the ultimate purpose of this scale-down operation. The difference is that without this flag, these surplus pods will be hosting empty shards, while with this flag, we don't have empty shards but we will have more replicas. So if the goal is to scale down the k8s cluster, I would have thought that it is easier to identify empty shards/pods than to try to remove extra replicas? I must've missed some important context. BTW, @enjoy-binbin, I believe some of the recent tests we worked on actually took a dependency on this split migration policy (so we could control the failover behavior/timing). I wonder if they will be broken by this PR. In any case, this will be a breaking change.
Yes, indeed, I'm a bit fuzzy about this too.
The tests seem to be OK; they passed in CI. I think under normal circumstances the configurations are consistent, and in our tests we made them inconsistent, so here we are. There is a case: the primary allows migration and the replica does not. What will happen if the primary loses its last slot? In our code (both unstable and this branch), it will become a replica either way, so in that case, what does disallowing replica-migration mean on the replica side? I think when redis/redis#5285 was introduced, maybe a better way would have been to add a
If we worry too much, we can skip it. It's also good that the failed test exposed the problem so we can think about it later. (I think it was there before.)
I was talking about the replica barrier in conjunction with replica migration. The idea is that, say, you have 7 total pods, and then 3 primaries, each with a replica. You set the replica barrier to 1. If a node fails, that shard will be down to one node, so the extra replica will migrate over and fill in the missing node while the failed pod may eventually come back online. FWIW, I know very few people that do this. However, some folks like to have strict shards, and they don't want replicas to ever migrate between nodes. This flag was originally introduced to handle the edge case where someone wanted to migrate data away from a shard and then delete its pods; they didn't want the replicas to follow. So the tl;dr is that replica migration has larger implications, but this flag was introduced to solve a specific problem.
My honest impression of this configuration is that I don't care if they're different. Like, you can do a lot of things that are inconsistent. I would rather warn users about inconsistent configs than try to handle it perfectly.
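The barrier behavior described above can be sketched as a small simulation. This is illustrative only, not Valkey source: `Shard` and `pick_donor` are hypothetical names, and the barrier value loosely models the `cluster-migration-barrier` setting (a donor shard only gives up a replica if it would still keep at least `barrier` replicas afterward).

```python
# Hypothetical model of replica migration with a migration barrier.
# Not Valkey internals; names here are made up for illustration.
from dataclasses import dataclass


@dataclass
class Shard:
    name: str
    replicas: int  # healthy replicas currently attached to this primary


def pick_donor(shards, barrier):
    """Pick a shard that can donate a replica to an orphaned primary.

    The shard with the most replicas is the candidate, but it may only
    donate if it would still keep at least `barrier` replicas.
    Returns None when no shard can spare one.
    """
    donor = max(shards, key=lambda s: s.replicas)
    return donor if donor.replicas - 1 >= barrier else None


# 7 pods: 3 primaries, each with a replica, plus one spare replica on "a".
shards = [Shard("a", 2), Shard("b", 1), Shard("c", 1)]
donor = pick_donor(shards, barrier=1)
print(donor.name)  # "a": it can spare its extra replica

# With a strict 1:1 layout, nobody can donate without dropping below the barrier.
print(pick_donor([Shard("a", 1), Shard("b", 1), Shard("c", 1)], barrier=1))  # None
```

With `barrier=1`, only a shard holding a surplus replica ever donates, which matches the "extra replica fills in for the failed node" scenario above.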
Discussed this with @PingXie offline; we will remove it from 8.0 and will consider it in the future. We will ignore this case in the failing test so that we can get a green CI.
… in advance

When the primary and replica `cluster-allow-replica-migration` configuration items are different, the replica migration does not meet expectations. Originally: primary does not allow, replica allows. In this case, the code will migrate the replica away, and the primary will remain as an empty primary. However, in valkey-io#970, we found a timing issue: if the replica receives CLUSTER SETSLOT first and then receives the gossip change, it will not trigger the replica migration. We now perform replica migration explicitly in CLUSTER SETSLOT in this case.

Signed-off-by: Binbin <[email protected]>
Force-pushed from 87db9dd to 3aef09e
I backed up the old code to a different branch: https://github.com/enjoy-binbin/valkey/tree/better_replica_migration_bak. @PingXie I guess we can get this change into 8.0? I tested it locally and it works.
Signed-off-by: Binbin <[email protected]>
…migration Signed-off-by: Binbin <[email protected]>
LGTM.
The hostname test seems to have failed quite consistently?
Signed-off-by: Binbin <[email protected]>
The hostname test got improved in #1016; I have rebased onto it and hope that solves it.
Co-authored-by: Ping Xie <[email protected]> Signed-off-by: Binbin <[email protected]>
Signed-off-by: Binbin <[email protected]>
Signed-off-by: Binbin <[email protected]>
Signed-off-by: Binbin <[email protected]>
Signed-off-by: Binbin <[email protected]>
@madolson @zuiderkwast do you guys wanna take a look at this one too, in case I messed up something?
Co-authored-by: Ping Xie <[email protected]> Signed-off-by: Binbin <[email protected]>
Signed-off-by: Binbin <[email protected]>
Sorry, my empty shard tests are still failing. Let me disable them again to unblock others.
… in advance (valkey-io#981)

Fix timing issue in evaluating `cluster-allow-replica-migration` for replicas.

There is a timing bug where the primary and replica have different `cluster-allow-replica-migration` settings. In issue valkey-io#970, we found that if the replica receives `CLUSTER SETSLOT` before the gossip update, it remains in the original shard. This happens because we only process the `cluster-allow-replica-migration` flag for primaries during `CLUSTER SETSLOT`. This commit fixes the issue by also evaluating this flag for replicas in the `CLUSTER SETSLOT` path, ensuring correct replica migration behavior.

Closes valkey-io#970

Signed-off-by: Binbin <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
We have seen more and more failures of #970 in both unstable and 8.0, so I went ahead and merged this PR. We can continue the discussion asynchronously.
When the primary and replica cluster-allow-replica-migration configuration items are
different, the replica migration does not meet expectations.
Originally: primary does not allow, replica allows. In this case, the code will migrate the
replica away, and the primary will remain as an empty primary. However, in #970, we found
a timing issue. For example, if the replica receives CLUSTER SETSLOT first and then receives
the gossip change, it will not trigger the replica-migration.
We perform replica migration explicitly in CLUSTER SETSLOT in this case.
Closes #970.
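The behavior change can be sketched as a toy model. This is not the Valkey C source: `handle_setslot_last_slot` and its arguments are hypothetical names used only to illustrate why a replica that saw `CLUSTER SETSLOT` before the gossip update previously never re-evaluated its own `cluster-allow-replica-migration` flag.

```python
# Illustrative model (not Valkey internals) of the fix: evaluating the
# cluster-allow-replica-migration flag for replicas in the SETSLOT path.

def handle_setslot_last_slot(node, evaluate_replica_flag):
    """Decide what a node does when its shard's last slot is moved away.

    node: dict with "role" ("primary"/"replica") and the node's own
    "allow_replica_migration" setting.
    evaluate_replica_flag: False models the old code (the flag was only
    checked on the primary path); True models the fixed code.
    """
    if node["role"] == "primary":
        # The primary path always honored the flag.
        return "migrate" if node["allow_replica_migration"] else "stay-empty-primary"
    # Replica path: before the fix, the flag was never consulted here,
    # so a replica that got SETSLOT before gossip simply stayed put.
    if evaluate_replica_flag:
        return "migrate" if node["allow_replica_migration"] else "stay"
    return "stay"


replica = {"role": "replica", "allow_replica_migration": True}
print(handle_setslot_last_slot(replica, evaluate_replica_flag=False))  # stay (the bug)
print(handle_setslot_last_slot(replica, evaluate_replica_flag=True))   # migrate (the fix)
```

The point of the sketch is only the ordering: once the replica's own flag is evaluated on the `CLUSTER SETSLOT` path as well, the outcome no longer depends on whether the gossip update or the SETSLOT command arrives first.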