Deal better with network partitions in leaders #2749

Merged: 8 commits merged into master from mrjn/zero-partition on Nov 13, 2018

Conversation

manishrjain (Contributor) commented on Nov 13, 2018

Currently, if the Zero leader ends up in a network partition, the Alpha nodes get stuck indefinitely waiting to hear updates from it, making the Zero leader a single point of failure. Even after the network heals, it takes a while for the partitioned nodes to recover.

This PR fixes these issues:

  • The Zero leader now sends membership updates every second.
  • If an Alpha does not receive a membership update for over 10s, it disconnects from the leader and tries to connect to any other Zero server. This way, every Alpha correctly picks up the new membership state and, hence, the new Zero leader (see the watchdog sketch after this list).
  • Oracle Delta Stream: If the Zero leader changes or the Zero connection becomes unhealthy, the Alpha leader disconnects from the current Zero leader and reconnects to the new one, so it keeps receiving oracle delta updates correctly.
  • Connection Pool: It used to poll every 10s, with no timeout. It now polls every 1s, with a timeout of 1s, so connection health issues surface much sooner. This creates more network traffic (one Echo per second per connection pair, i.e. O(N^2) echoes, where N is the number of servers in the Dgraph cluster), but if and when that becomes a problem, we'll fix it (see the health-check sketch below).
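To make the second point concrete, here is a minimal Go sketch of the Alpha-side watchdog, using a plain channel in place of Dgraph's actual gRPC membership stream; membershipState and watchMembership are hypothetical names for illustration, not Dgraph's API:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// membershipState stands in for the membership proto that Zero streams out.
type membershipState struct{}

// watchMembership consumes updates from one Zero connection. If no update
// arrives within timeout (10s in this PR), it returns an error so the caller
// can drop this connection and dial any other Zero server.
func watchMembership(updates <-chan membershipState, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	for {
		select {
		case _, ok := <-updates:
			if !ok {
				return errors.New("membership stream closed")
			}
			// Update received in time: reset the watchdog.
			if !timer.Stop() {
				<-timer.C
			}
			timer.Reset(timeout)
		case <-timer.C:
			return fmt.Errorf("no membership update for %v; reconnecting to another Zero", timeout)
		}
	}
}

func main() {
	updates := make(chan membershipState)
	go func() {
		// Simulate a Zero leader that sends two updates, then gets partitioned.
		for i := 0; i < 2; i++ {
			updates <- membershipState{}
			time.Sleep(100 * time.Millisecond)
		}
	}()
	// 500ms here only so the demo exits quickly; the PR uses 10s.
	fmt.Println(watchMembership(updates, 500*time.Millisecond))
}
```

On timeout, the caller would simply dial any other known Zero address and restart the loop, which is how each Alpha converges on the new leader.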

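Similarly, here is a sketch of the tightened connection-pool health check: one probe per second, each bounded by its own 1s deadline via context.WithTimeout. The echo function below is a stand-in for the real Echo RPC, which a real pool would invoke through its gRPC stub:

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

// echo fakes a unary Echo RPC with variable latency; a real pool would call
// the peer's gRPC stub here instead.
func echo(ctx context.Context) error {
	select {
	case <-time.After(time.Duration(rand.Intn(1500)) * time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// monitorHealth probes the peer every interval, bounding each probe with its
// own deadline. With interval = timeout = 1s, a dead peer is noticed within
// about two seconds, versus the old 10s poll that had no deadline at all.
func monitorHealth(interval, timeout time.Duration, report func(healthy bool)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		err := echo(ctx)
		cancel()
		report(err == nil)
	}
}

func main() {
	go monitorHealth(time.Second, time.Second, func(healthy bool) {
		fmt.Println("peer healthy:", healthy)
	})
	time.Sleep(5 * time.Second)
}
```

With every server probing every other server this way, the cluster sends on the order of N^2 echoes per second, which is the traffic cost noted in the last bullet above.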

@manishrjain merged commit e7170c3 into master on Nov 13, 2018
@manishrjain deleted the mrjn/zero-partition branch on November 13, 2018 at 18:58
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019

Commits:
* Added some code to cancel recv from Zero if no update for x seconds. Need to work on ensuring that Zero is sending an update every second or so.
* Alpha leader can reconnect to the new Zero leader after the existing Zero leader is partitioned away from the cluster.
* Fixed various partition-related issues. After partitioning off both the Zero and Alpha leaders, increments converge quickly to the new ones. And when the partition heals, both of them heal quickly.