support 2 data centres HA with no central ZooKeeper cluster (or 2 single JVMs which can restart themselves with no fabric) #622
Comments
So the main 2 pieces to solve this would be
the ZK bridging; my wish is to use this: though we need to make sure we don't mess up any of the existing semantics (e.g. the way the ZK cluster stuff works for master / slave and the like)
I was chatting on IM with @chirino recently; he made the excellent point that (paraphrasing badly here...) most folks have been dealing with the 2 data centre failover scenario for a while by making one a known master, the main data centre (so run 2 ZKs there). If the other DC fails, it's fine. If the master fails, it's a manual process to create a new ZK server so that the backup DC can take over. This is easily achievable now really; i.e. use 3 ZK nodes, favouring DC A, which has more ZKs, over DC B. Then B can fail and things just work. If A fails, then it's a manual process to configure DC B to use just a single ZK instance (rather than assume, say, 3 ZK servers shared with A) once it's been manually established that A really is down and it's not a network split.
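To make the arithmetic behind that layout concrete, here is a minimal sketch in Python. It assumes the usual majority rule for a ZooKeeper ensemble (more than half of the configured servers must be reachable) and a hypothetical placement of 2 servers in DC A and 1 in DC B; the function and variable names are made up for illustration and are not fabric8 or ZooKeeper APIs.

```python
def has_quorum(alive: int, ensemble_size: int) -> bool:
    """An ensemble can accept writes only while a strict majority is alive."""
    return alive > ensemble_size // 2


# Hypothetical placement from the comment above: 2 ZK servers in A, 1 in B.
layout = {"A": 2, "B": 1}
total = sum(layout.values())


def survives_loss_of(dc: str) -> bool:
    """True if the servers outside `dc` still form a majority of the ensemble."""
    return has_quorum(total - layout[dc], total)


print("lose B:", survives_loss_of("B"))  # 2 of 3 servers remain
print("lose A:", survives_loss_of("A"))  # only 1 of 3 servers remains
```

Losing B leaves a majority, losing A does not, which is why the takeover by B has to be a manual reconfiguration down to a single-server ensemble (where 1 of 1 is again a majority).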
Quoting ZooKeeper documentation:
The main reason for this is that EVERY ZooKeeper write operation has to be voted on by the servers, so splitting the servers across different data centres will dramatically impact performance. So I guess that we should always have all ZooKeeper servers run in a single data centre (the master one) and then have an observer running in each client data centre. Let's assume data centres A and B, where A hosts the ensemble. If the 2 data centres get disconnected, then B will lose communication with ZooKeeper and no read or write operations would be possible there. A will continue to work properly. What is our goal here?
When folks are happy with a single master data centre that seems fine. It seems common for folks to have 2 or 3 data centres where any data centre Am thinking of things like:
For the last 2 items I wondered if replicating some state between DCs into
In the case of AMQ it'd be to advertise brokers in remote DCs for network
Another option would be to just force all the relevant code in fabric8 to
On Friday, May 30, 2014, Ioannis Canellos wrote:
JamesRed Hat Twitter: @jstrachan hawtio: http:/ http://fusesource.com//hawt.io/ Open Source Integration |
If we want to combine fabrics purely for management purposes, I guess we could have multiple isolated fabrics (1 per DC) that would talk to each other via some sort of bridge. We could use this bridge to push / pull profile configuration between DCs and also publish the local DC's container attributes to the remote data centre. As long as we don't try to perform inter-DC locking / leader election, it should work OK.
Agreed. Am mostly just thinking it's a way to do cross DC configuration,
To help avoid folks accidentally doing multi-DC leader election stuff; we
On Friday, May 30, 2014, Ioannis Canellos wrote:
This is a very interesting use case for me. I have been thinking about building a distributed monitoring and data collection system on top of fabric8 for a large retail customer. They want to have monitoring and data collection in each branch because the branch internet can be flaky (for example, it can go down when the store closes). My objective was to be able to roll out new services into new containers in the branches via fabric8, but I was uncertain how ZooKeeper handles the flaky internet. Is it possible to solve this scenario today? If not, how could my team and I get started working toward making this work? If it helps to make anything simpler, I suspect there would only be configuration changes initiated in the data centre and pushed to the stores as they connect and retrieve them.
I wonder if this is on our table for v2, when we stand on top of kube / OS3?
It's looking like Kubernetes is going to use "ubernetes" to solve the multi-data centre problem:
It'd be awesome to be able to support fabric8 when folks have exactly 2 data centres and either can go down; or when they have just 2 JVMs and they want to implement a master/slave broker pair.
In either case, there are 2 things, either of which could fail. Since there are only 2, there's no chance of keeping quorum once one of them is down: a majority of 2 is 2.
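As a quick sanity check of the quorum point, the sketch below tabulates how many failures a majority-based ensemble can tolerate for a few sizes. It only assumes the standard "strict majority" rule; the helper names are hypothetical.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of members that is more than half of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return ensemble_size - majority(ensemble_size)


for n in (1, 2, 3, 5):
    print(f"size={n} majority={majority(n)} tolerated_failures={tolerated_failures(n)}")
```

A 2-member ensemble tolerates zero failures, the same as a single member, so pairing 2 DCs or 2 JVMs in one ensemble buys no availability.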
Another similar use case is the retail scenario, where the internet connection could be flaky, so joining a remote fabric on startup is not feasible.
However, in these cases the remote DC / container could start up its own fabric, then just discover the master git repo (when a fabric is available) and push/pull to it, but still start up if there's no remote fabric.
This would mean that the single DC or single container would basically be a standalone, separate fabric; it'd just be able to share configurations across fabrics. So this idea is a little bit like store and forward with ActiveMQ: you could wire together 2 fabrics, so that when the remote fabric is available a fabric bridge could connect to the remote fabric's ZK and copy relevant data into the local ZK cluster (using ephemeral nodes so the data goes away if the remote ZK fabric disappears).
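A rough sketch of that bridge idea, under heavy assumptions: plain Python dicts stand in for the local and remote ZK trees, and "ephemeral" behaviour is imitated by dropping every mirrored entry whenever the remote is unreachable. `FabricBridge`, the `/remote` prefix and the broker paths are all hypothetical names, not fabric8 or ZooKeeper APIs.

```python
from typing import Dict, List, Optional


class FabricBridge:
    """Mirrors a remote fabric's registry entries into the local registry."""

    def __init__(self, local: Dict[str, str], prefix: str = "/remote") -> None:
        self.local = local
        self.prefix = prefix

    def _mirrored_paths(self) -> List[str]:
        return [p for p in self.local if p.startswith(self.prefix + "/")]

    def sync(self, remote: Optional[Dict[str, str]]) -> None:
        """Copy remote entries under the prefix; pass None when the remote is down."""
        # Drop the previous mirror first so stale entries never linger,
        # which is the effect ephemeral nodes would give us in real ZK.
        for path in self._mirrored_paths():
            del self.local[path]
        if remote is None:
            return
        for path, value in remote.items():
            self.local[self.prefix + path] = value


local = {"/brokers/local-1": "tcp://10.0.0.1:61616"}
bridge = FabricBridge(local)

bridge.sync({"/brokers/remote-1": "tcp://10.1.0.1:61616"})
print(sorted(local))  # local entry plus the mirrored remote broker

bridge.sync(None)  # remote fabric disappeared
print(sorted(local))  # only the local entry is left
```

Note that the bridge only copies data one way and never takes locks or elects leaders across fabrics; local entries are untouched whether or not the remote is reachable.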
An added benefit of this approach is that things can then restart even when there's no ZK, providing they previously started up so that they have a git repo and the like; it's just that they can't sync any changes or discover any remote services on the remote fabric.
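The "start anyway, sync when you can" behaviour could look roughly like the sketch below. It is only an illustration: a dict stands in for the local git repo of profiles, `fetch_remote` is a hypothetical callable standing in for a pull from the master repo, and an unreachable remote is modelled as a `ConnectionError`.

```python
from typing import Callable, Dict


def start_container(
    local_profiles: Dict[str, str],
    fetch_remote: Callable[[], Dict[str, str]],
) -> Dict[str, str]:
    """Start from the local configuration; merge remote changes if reachable."""
    profiles = dict(local_profiles)
    try:
        profiles.update(fetch_remote())
    except ConnectionError:
        # Remote fabric unavailable: keep running on the last known config.
        pass
    return profiles


def remote_down() -> Dict[str, str]:
    raise ConnectionError("remote fabric not reachable")


def remote_up() -> Dict[str, str]:
    return {"mq-default": "v2"}


cached = {"mq-default": "v1", "monitoring": "v1"}
print(start_container(cached, remote_down))  # starts with the cached profiles
print(start_container(cached, remote_up))    # picks up the newer mq-default
```

The point of the design is that the remote fetch is never on the critical path of startup; it only improves the configuration when it succeeds.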