How does learner become follower? #457

Closed
sargarass opened this issue Oct 11, 2021 · 6 comments
Labels
Question A question to be answered.

Comments

@sargarass

sargarass commented Oct 11, 2021

From my tests there is no automatic transition from the learner state to follower in v0.6. How should it be done correctly then? I suppose it could be done before ticking the leader's raft.

  1. Does raft-rs track learner status, or should I track it myself using ConfState after apply_conf_change? etcd's ProgressTracker has the field IsLearner bool.

  2. Is it enough for the leader to check the ProgressTracker of the learners to ensure they match its committed_index (learner's progress.committed_index == leader's raft.raft_log.committed && learner's ProgressState != Snapshot)?
    Or maybe a learner's state == ProgressState::Replicate is sufficient?

  3. Should the leader propose a new configuration with exactly one node transitioned from learner to follower at a time, or will one ConfChangeV2 with all the learner ids and transition = Implicit do the job?

  4. Is it possible that in the future this transition will be made by raft-rs?

@BusyJay
Member

BusyJay commented Oct 12, 2021

  1. Does Raft-rs track the learners status or should I do it by using ConfState after apply_conf_change?

raft-rs tracks the status in its configuration. Whether a peer is a learner or not gets complicated once joint state is considered, so I suggest the application also keep track of the ConfState.
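
To illustrate why joint state complicates the question "is this peer a learner?", here is a minimal sketch of application-side role tracking. `ConfStateView`, `PeerRole` and `role_of` are hypothetical names, not raft-rs API; the struct only mirrors the `voters`, `learners`, `voters_outgoing` and `learners_next` fields of raft-rs's `ConfState` message (which stores them as lists rather than sets).

```rust
use std::collections::HashSet;

/// Hypothetical local stand-in for the peer lists carried by `ConfState`.
#[derive(Default, Clone, Debug)]
pub struct ConfStateView {
    pub voters: HashSet<u64>,
    pub learners: HashSet<u64>,
    pub voters_outgoing: HashSet<u64>,
    pub learners_next: HashSet<u64>,
}

/// Hypothetical role classification kept by the application.
#[derive(Debug, PartialEq, Eq)]
pub enum PeerRole {
    /// Votes in every active configuration.
    Voter,
    /// Receives logs but never votes.
    Learner,
    /// Joint state only: votes in the incoming configuration, not the outgoing one.
    IncomingVoter,
    /// Joint state only: still votes in the outgoing configuration,
    /// becomes a learner once the joint state is left.
    DemotingVoter,
    /// Joint state only: still votes in the outgoing configuration,
    /// is gone once the joint state is left.
    OutgoingVoter,
}

/// Classify `id` under `cs`; returns `None` if the peer is not in the group.
pub fn role_of(cs: &ConfStateView, id: u64) -> Option<PeerRole> {
    let joint = !cs.voters_outgoing.is_empty();
    let incoming = cs.voters.contains(&id);
    let outgoing = cs.voters_outgoing.contains(&id);
    if incoming && (outgoing || !joint) {
        Some(PeerRole::Voter)
    } else if incoming {
        Some(PeerRole::IncomingVoter)
    } else if outgoing && cs.learners_next.contains(&id) {
        Some(PeerRole::DemotingVoter)
    } else if outgoing {
        Some(PeerRole::OutgoingVoter)
    } else if cs.learners.contains(&id) {
        Some(PeerRole::Learner)
    } else {
        None
    }
}
```

Outside joint state only `Voter` and `Learner` can occur; the other three roles exist only while `voters_outgoing` is non-empty.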

  2. Is it enough for the leader to check ProgressTracker of the learners to ensure they match his committed_index

It depends on what you want. If you just want to know whether it's safe to promote a learner to voter, you can check the implementation in TiKV; I think it's the most suitable way that doesn't depend on many raft-rs internals.

  3. Should the leader propose new configuration with exactly one node transmitted from learner to follower at a time or one ConfChangeV2 with all the learner ids and transition = Implicit will do the thing?

Either way is OK. It depends on how much control you want over the process. TiKV chooses to use ConfChangeV2 to promote a learner and demote a voter at the same time, with transition set to explicit. You may want to check out https://github.com/tikv/rfcs/blob/master/text/0054-joint-consensus.md to see how TiKV adopts joint consensus.
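
As a rough model of what "promote a learner and demote a voter at the same time" means for the membership sets, here is a self-contained sketch. It does not use raft-rs; `Change` and `Config` are hypothetical stand-ins whose variants and fields are named after raft-rs's `ConfChangeType` and `ConfState`, and the rules are a simplification of how a joint configuration is entered and left.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for one entry of a ConfChangeV2.
#[derive(Clone, Copy, Debug)]
pub enum Change {
    AddNode(u64),        // add a voter, or promote a learner to voter
    AddLearnerNode(u64), // add a learner, or demote a voter to learner
    RemoveNode(u64),
}

/// Hypothetical membership model with the same four sets as ConfState.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Config {
    pub voters: BTreeSet<u64>,
    pub voters_outgoing: BTreeSet<u64>,
    pub learners: BTreeSet<u64>,
    pub learners_next: BTreeSet<u64>,
}

impl Config {
    pub fn is_joint(&self) -> bool {
        !self.voters_outgoing.is_empty()
    }

    /// Apply all changes at once and enter the joint state.
    pub fn enter_joint(&self, changes: &[Change]) -> Result<Config, String> {
        if self.is_joint() {
            return Err("already in a joint configuration".into());
        }
        let mut next = self.clone();
        // The old voters keep voting until the joint state is left.
        next.voters_outgoing = self.voters.clone();
        for &change in changes {
            match change {
                Change::AddNode(id) => {
                    next.learners.remove(&id);
                    next.learners_next.remove(&id);
                    next.voters.insert(id);
                }
                Change::AddLearnerNode(id) => {
                    next.voters.remove(&id);
                    if next.voters_outgoing.contains(&id) {
                        // Still an outgoing voter: becomes a learner later.
                        next.learners_next.insert(id);
                    } else {
                        next.learners.insert(id);
                    }
                }
                Change::RemoveNode(id) => {
                    next.voters.remove(&id);
                    next.learners.remove(&id);
                    next.learners_next.remove(&id);
                }
            }
        }
        if next.voters.is_empty() {
            return Err("the change would remove all voters".into());
        }
        Ok(next)
    }

    /// Leave the joint state: drop the outgoing voters, finish demotions.
    pub fn leave_joint(&self) -> Result<Config, String> {
        if !self.is_joint() {
            return Err("not in a joint configuration".into());
        }
        let mut next = self.clone();
        let demoted = std::mem::take(&mut next.learners_next);
        next.learners.extend(demoted);
        next.voters_outgoing.clear();
        Ok(next)
    }
}
```

With an explicit transition, the application decides when the second step (`leave_joint` in this model) is proposed, which is where the extra control comes from.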

  4. Is it possible that in the future this transition will be made by raft-rs?

I'm afraid not. Learner is a standalone role that can function without ever becoming a voter, so raft-rs will not promote it automatically. For example, TiDB uses voters and learners at the same time to achieve isolated HTAP with strong consistency.

@BusyJay BusyJay added the Question A question to be answered. label Oct 12, 2021
@sargarass
Author

sargarass commented Oct 12, 2021

@BusyJay, thanks for the reply!

If you just want to see if it's safe to promote a learner to voter, you can check the implementation in TiKV

1. let promoted_commit_index = after_progress.maximal_committed_index().0;
2. if current_progress.is_singleton() // It's always safe if there is only one node in the cluster.
3.    || promoted_commit_index >= self.get_store().truncated_index()
4. {
5.    return Ok(());
6. }

So for safety:

  0. Does this check happen before applying the new configuration or before proposing it?
  1. Does line 3 check whether the new quorum would have maximal_committed_index >= the current leader's commit_index?
  2. Why isn't promoted_commit_index >= current_progress.maximal_committed_index().0 used instead of line 3?
  3. What is the maximal_committed_index.1 bool used for? Is it safe to ignore it?
  4. Why is the maximum used, not just the quorum's committed_index?
  5. Line 2 is not obvious. Let's assume there are several learners way behind our 1-node cluster. Wouldn't it cause data loss or other problems if they are promoted to voters and then the leader immediately fails?

@BusyJay
Member

BusyJay commented Oct 12, 2021

If the commit index becomes smaller than the leader's truncated index after applying the configuration change, the leader will have to send a snapshot to at least one node so that a quorum catches up on enough logs. Snapshots are slow, and this will pause the whole group.

Checking against the leader's commit index is a stricter constraint, which may not be satisfiable in all conditions. For example, a fast voter could never be removed under such a requirement.
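
The condition described above can be sketched as follows. This is not TiKV's code: `quorum_index` and `promotion_avoids_snapshot` are hypothetical helpers, and the sketch assumes the application knows the matched (replicated) log index of every peer that would be a voter after the change.

```rust
/// Largest log index that a majority of the given voters has replicated.
/// `matched` holds one matched index per voter; returns `None` if empty.
pub fn quorum_index(matched: &[u64]) -> Option<u64> {
    if matched.is_empty() {
        return None;
    }
    let mut sorted = matched.to_vec();
    // Sort in descending order so that position `quorum - 1` is the
    // highest index reached by at least `quorum` voters.
    sorted.sort_unstable_by(|a, b| b.cmp(a));
    let quorum = matched.len() / 2 + 1;
    Some(sorted[quorum - 1])
}

/// True if, after the change, a quorum can still be reached using logs the
/// leader has not compacted away, so no snapshot is needed to commit.
pub fn promotion_avoids_snapshot(voters_after_matched: &[u64], truncated_index: u64) -> bool {
    match quorum_index(voters_after_matched) {
        Some(index) => index >= truncated_index,
        None => false,
    }
}
```

For example, with post-change voters at matched indexes 100, 100, 10 and 5, the quorum index is 10; if the leader has truncated its log up to index 50, the change would force a snapshot. In a joint configuration both majorities have to agree, so one would presumably take the minimum of the two quorum indexes.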

What is maximal_committed_index.1 bool used for? Is it safe to ignore it?

It's for group commit, which is an extension; you can generally ignore it safely.

Let's assume that there are several learners way behind our 1 node-cluster. Would not it cause data-loss/other problems if they are promoted to voters and then the leader immediately fails?

What if the leader does nothing and fails? The problem is not that multiple nodes are being promoted but that there is only one healthy node in the group. We consider a singleton dangerous, so we want to add multiple replicas as soon as possible. The comment is not accurate though.

@sargarass
Author

sargarass commented Oct 12, 2021

I think I have almost everything needed.
What is the leader's truncated index though, and how do I get it?

Edit: Is it the index of the last entry in the latest snapshot? (Therefore, we do not need to send a snapshot once this condition is met.)

@BusyJay
Member

BusyJay commented Oct 13, 2021

It's the minimal index of available logs minus one.
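
Put as code, assuming the application can ask its log storage for the index of the oldest entry it still holds (raft-rs's `Storage` trait exposes this as `first_index`). `truncated_index` and `needs_snapshot` are hypothetical helper names used only for this sketch.

```rust
/// Index of the last entry that has been compacted away:
/// the minimal index of the available logs minus one.
pub fn truncated_index(first_index: u64) -> u64 {
    first_index.saturating_sub(1)
}

/// A peer whose matched index is `matched` needs entry `matched + 1` next.
/// If that entry is older than the first available one, the leader can
/// only catch the peer up with a snapshot.
pub fn needs_snapshot(matched: u64, first_index: u64) -> bool {
    matched + 1 < first_index
}
```

So a peer whose matched index equals the truncated index can still be served from the log, which is why the check quoted earlier uses `>=`.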

@sargarass
Author

@BusyJay, I appreciate the help.
