I've been working on how to operate Autopilot Pattern apps across multiple data centers (geographically distinct data centers connected over a WAN). In Consul, that led to a data center naming question (autopilotpattern/consul#23), among others.
As I explore how to do this in MySQL (using autopilotpattern/wordpress#27 as the scenario), I'm trying to determine the importance of data center awareness. On the one hand, it's important to have a solid strategy for recovering from complete data center failures. On the other, the risk of split-brain scenarios grows dramatically over a WAN.
For the purpose of this question and the scenario in autopilotpattern/wordpress#27, let's assume a standard master-replica replication topology (not multi-master, not sharded).
From a data center that's remote from the primary, how can we determine the difference between a failure of the primary, the failure of the entire data center the primary is in, or a network partition of the two data centers?
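To make the question concrete, here is a minimal sketch of what the remote data center can and cannot infer from its own reachability checks. It is only an illustration under stated assumptions, not a proposed implementation: the `diagnose` function, the `Diagnosis` names, and the idea of probing other nodes in the primary's data center are all hypothetical and not part of any existing Autopilot Pattern code.

```python
from enum import Enum
from typing import Sequence


class Diagnosis(Enum):
    # Hypothetical labels for the cases named in the question.
    HEALTHY = "primary reachable"
    PRIMARY_FAILED = "primary unreachable, but its data center still answers"
    DC_FAILED_OR_PARTITIONED = "data center failure or WAN partition (ambiguous)"


def diagnose(primary_reachable: bool, peers_reachable: Sequence[bool]) -> Diagnosis:
    """Classify what the remote data center observes.

    primary_reachable: result of a health probe against the primary.
    peers_reachable: results of probes against other nodes in the
        primary's data center (assumed to exist for this sketch).
    """
    if primary_reachable:
        return Diagnosis.HEALTHY
    if any(peers_reachable):
        # Something in the primary's data center still answers, so the
        # WAN link is up and only the primary looks unhealthy.
        return Diagnosis.PRIMARY_FAILED
    # Nothing answers: from this side alone, a dead data center and a
    # severed WAN link produce exactly the same observations.
    return Diagnosis.DC_FAILED_OR_PARTITIONED
```

The last branch is the crux of the question: with only local observations, the remote side cannot separate the second and third cases.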
Before we can even start to answer that question, we need to know what kind of replication we're talking about here. Are we talking master-master (which is super dodgy with MySQL), sharded-master, or master-replica over the WAN?
> need to know what kind of replication we're talking about
Fair point. I'm assuming master-replica only for now. We're not really making much effort to support other topologies in this implementation, but we should at least ask the question.
The scenario I'm using to explore this is further addressed in autopilotpattern/wordpress#27. I'll update the story at the top to clarify the intended topology.