Automatic redis monitoring and failover.
redis-sentinel is very powerful, so why build a new tool instead of using it?
- I want to support monitoring and failover not only for redis, but also for LedisDB.
- I want to embed it into another service like xcodis.
- I just want to learn how to build a distributed service using raft. :-)
go get github.com/ledisdb/redis-failover
First you must start redis and build the replication topology yourself, e.g., 127.0.0.1:6379 is the master and 127.0.0.1:6380 is the slave.
redis-failover -addr=127.0.0.1:11000 -masters=127.0.0.1:6379
addr is the redis-failover HTTP listen address, and masters is the master redis server you want to monitor.
You can use a config file too, like redis-failover -config=./etc/failover.toml. See failover.toml.
You can add a master dynamically over HTTP, for example with httpie:
http POST :11000/master masters==127.0.0.1:6379
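The same request can be sent from Go. This is only a minimal sketch: it assumes the endpoint accepts the master as a `masters` query parameter exactly as in the httpie example, and the helper names `addMasterURL` and `addMaster` are made up for illustration and are not part of redis-failover.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// addMasterURL builds the URL used by the httpie example:
// POST http://<addr>/master?masters=<master>.
// (Hypothetical helper, not part of redis-failover.)
func addMasterURL(addr, master string) string {
	q := url.Values{}
	q.Set("masters", master)
	u := url.URL{Scheme: "http", Host: addr, Path: "/master", RawQuery: q.Encode()}
	return u.String()
}

// addMaster sends the POST request and reports a non-2xx status as an error.
// (Hypothetical helper; the response format of the service is not assumed.)
func addMaster(addr, master string) error {
	resp, err := http.Post(addMasterURL(addr, master), "text/plain", nil)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("add master failed: %s", resp.Status)
	}
	return nil
}

func main() {
	// Assumes redis-failover is listening on 127.0.0.1:11000 as in the example above.
	if err := addMaster("127.0.0.1:11000", "127.0.0.1:6379"); err != nil {
		fmt.Println("error:", err)
	}
}
```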
redis-failover -addr=127.0.0.1:11000 -masters=127.0.0.1:6379 -raft_addr=127.0.0.1:12000 -raft_data_dir=./var0 -raft_cluster=127.0.0.1:12000 -broker=raft
raft_addr is the raft listen address for inner raft communication, raft_data_dir is the store path for raft, and raft_cluster is the raft cluster, here only one node.
broker is the cluster type, currently "raft" or "zk".
Note that if you use raft to keep redis-failover from being a single point of failure, you should not run only one raft node in production.
redis-failover -addr=127.0.0.1:11000 -masters=127.0.0.1:6379 -raft_addr=127.0.0.1:12000 -raft_data_dir=./var0 -raft_cluster=127.0.0.1:12000,127.0.0.1:12001,127.0.0.1:12002 -broker=raft
redis-failover -addr=127.0.0.1:11001 -masters=127.0.0.1:6379 -raft_addr=127.0.0.1:12001 -raft_data_dir=./var1 -raft_cluster=127.0.0.1:12000,127.0.0.1:12001,127.0.0.1:12002 -broker=raft
redis-failover -addr=127.0.0.1:11002 -masters=127.0.0.1:6379 -raft_addr=127.0.0.1:12002 -raft_data_dir=./var2 -raft_cluster=127.0.0.1:12000,127.0.0.1:12001,127.0.0.1:12002 -broker=raft
raft_cluster now contains three raft nodes, so if one node goes down, the other two can still work correctly.
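The arithmetic behind this is the usual raft majority rule: a cluster of n nodes needs n/2 + 1 nodes (integer division) to agree. The small sketch below just works that out; `quorum` and `tolerated` are illustrative names, not redis-failover functions.

```go
package main

import "fmt"

// quorum returns how many nodes must agree in a raft cluster of size n.
func quorum(n int) int { return n/2 + 1 }

// tolerated returns how many node failures a cluster of size n can survive
// while still being able to form a majority.
func tolerated(n int) int { return n - quorum(n) }

func main() {
	for _, n := range []int{1, 3, 5} {
		fmt.Printf("nodes=%d quorum=%d tolerated_failures=%d\n", n, quorum(n), tolerated(n))
	}
}
```

With one node no failure is tolerated, which is why a single raft node does not remove the single point of failure.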
redis-failover -addr=127.0.0.1:11000 -masters=127.0.0.1:6379 -zk_addr=127.0.0.1:2181 -zk_path=/zk/redis/failover -broker=zk
zk_addr is the remote zookeeper address; multiple zookeeper addresses are separated by commas.
zk_path is the zookeeper base directory where you want to save your data; the prefix must be "/zk".
redis-failover -addr=127.0.0.1:11000 -masters=127.0.0.1:6379 -zk_addr=127.0.0.1:2181 -zk_path=/zk/redis/failover -broker=zk
redis-failover -addr=127.0.0.1:11001 -masters=127.0.0.1:6379 -zk_addr=127.0.0.1:2181 -zk_path=/zk/redis/failover -broker=zk
redis-failover -addr=127.0.0.1:11002 -masters=127.0.0.1:6379 -zk_addr=127.0.0.1:2181 -zk_path=/zk/redis/failover -broker=zk
After you start redis-failover and set the master redis, redis-failover will check it automatically. When it finds that the master is down, it performs a failover with the following steps:
- Elect the slave whose data is most up-to-date with the master as the candidate, using INFO REPLICATION to check.
- Promote the candidate to master, using SLAVEOF NO ONE.
- Let the other slaves replicate from the new master, using SLAVEOF new_master_host new_master_port.
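The three steps above can be sketched as a pure function that turns a list of slaves into the commands to send. This is an illustration under assumptions, not the actual redis-failover code: the `slave` and `command` types and the `electCandidate` and `failoverPlan` functions are hypothetical, and the candidate is chosen by replication offset only (the real selection may also weigh other fields, such as a priority).

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// slave is a hypothetical view of one replica, with the replication
// offset as it might be read from INFO REPLICATION.
type slave struct {
	Addr       string
	ReplOffset int64
}

// command is a redis command addressed to one server.
type command struct {
	Addr string
	Args []string
}

// electCandidate picks the slave with the largest replication offset,
// i.e. the one assumed to be most up-to-date with the old master.
func electCandidate(slaves []slave) (slave, bool) {
	if len(slaves) == 0 {
		return slave{}, false
	}
	best := slaves[0]
	for _, s := range slaves[1:] {
		if s.ReplOffset > best.ReplOffset {
			best = s
		}
	}
	return best, true
}

// failoverPlan returns the commands for the steps: promote the candidate
// with SLAVEOF NO ONE, then point every other slave at the new master.
func failoverPlan(slaves []slave) ([]command, error) {
	cand, ok := electCandidate(slaves)
	if !ok {
		return nil, errors.New("no slave available for failover")
	}
	host, port, err := net.SplitHostPort(cand.Addr)
	if err != nil {
		return nil, err
	}
	plan := []command{{cand.Addr, []string{"SLAVEOF", "NO", "ONE"}}}
	for _, s := range slaves {
		if s.Addr == cand.Addr {
			continue
		}
		plan = append(plan, command{s.Addr, []string{"SLAVEOF", host, port}})
	}
	return plan, nil
}

func main() {
	plan, err := failoverPlan([]slave{
		{"127.0.0.1:6380", 29},
		{"127.0.0.1:6381", 12},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range plan {
		fmt.Println(c.Addr, c.Args)
	}
}
```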
redis-failover will log some messages for failover, like:
[2015/02/10 14:10:18] group.go:64 [Error] do ROLE command for 127.0.0.1:6379 error: EOF, try again
[2015/02/10 14:10:18] group.go:56 [Error] dial 127.0.0.1:6379 error: dial tcp 127.0.0.1:6379: connection refused, try again
[2015/02/10 14:10:18] group.go:56 [Error] dial 127.0.0.1:6379 error: dial tcp 127.0.0.1:6379: connection refused, try again
[2015/02/10 14:10:18] app.go:166 [Error] check master 127.0.0.1:6379 err Node is down, do failover
[2015/02/10 14:10:18] group.go:259 [Info] select slave 127.0.0.1:6380 as new master, priority:100, repl_offset:29
If the failover fails, redis-failover will stop checking this redis to avoid further unexpected errors, so you may need to fix it manually.
- Redis version >= 2.8.12 is required, because redis-failover uses the redis ROLE command to fetch the replication topology from the master.
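If you want to verify this requirement before adding a server, a version check along these lines may help. It is a sketch only: `parseVersion` and `supportsRole` are hypothetical helpers, and the version string is assumed to be a plain `major.minor.patch` value such as the `redis_version` field reported by INFO.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a "major.minor.patch" string into three integers.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimSpace(v), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected version %q: %v", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// supportsRole reports whether the given redis version is >= 2.8.12,
// the minimum version stated above for the ROLE command.
func supportsRole(v string) (bool, error) {
	got, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	minVer := [3]int{2, 8, 12}
	for i := range got {
		if got[i] != minVer[i] {
			return got[i] > minVer[i], nil
		}
	}
	return true, nil
}

func main() {
	for _, v := range []string{"2.8.11", "2.8.12", "3.0.0"} {
		ok, err := supportsRole(v)
		fmt.Println(v, ok, err)
	}
}
```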
Email: [email protected]