I have encountered a problem on cloud virtualized storage: the `Raft.Stats` RPC called from Consul may periodically stall. As a consequence, the Consul leader emits log messages about unhealthy followers, even though the cluster itself remains healthy.
The initial investigation revealed that `Raft.Stats` stalls while getting the raft node's configuration (the `ConfigurationFuture` wrapper), which is serviced by the follower loop (through the `configurationsCh` channel inside the `runFollower` routine). As far as I can tell, every request to a follower, including heartbeats and raft RPCs, is handled sequentially. The RPC timing metrics showed that the `appendEntries` RPC, or more precisely its `storeLogs` stage, has a wide spread of latency. This happens because the latency of syncing logs to persistent storage (the `fdatasync` syscall in the BoltDB storage backend) is unstable, and a separate measurement of `fdatasync` latency confirmed this hypothesis.
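For reference, here is a minimal sketch of the kind of standalone probe that can confirm unstable sync latency (Linux-only, since it calls `fdatasync` directly; the directory, sample count, and write size are illustrative placeholders, not the exact methodology):

```go
// fsyncprobe measures fdatasync latency on a target volume,
// mimicking the syscall the BoltDB backend issues in storeLogs.
package main

import (
	"fmt"
	"os"
	"syscall"
	"time"
)

func main() {
	// Place the probe file on the same volume that backs the raft
	// data directory; the path here is only an example.
	f, err := os.CreateTemp("/var/lib/consul", "fsync-probe-*")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	buf := make([]byte, 4096) // one page, roughly a small log batch
	var max, total time.Duration
	const samples = 1000
	for i := 0; i < samples; i++ {
		if _, err := f.WriteAt(buf, 0); err != nil {
			panic(err)
		}
		start := time.Now()
		// The same syscall BoltDB relies on to make writes durable.
		if err := syscall.Fdatasync(int(f.Fd())); err != nil {
			panic(err)
		}
		d := time.Since(start)
		total += d
		if d > max {
			max = d
		}
	}
	fmt.Printf("fdatasync over %d calls: avg=%v max=%v\n",
		samples, total/time.Duration(samples), max)
}
```

A large gap between the average and the maximum on the volume backing BoltDB would be consistent with the latency spread observed in the `storeLogs` stage.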
This is ultimately a cloud provider problem, but it exposes a weakness in the architecture: lightweight read requests to a follower have to wait behind blocking ones. What about non-blocking reads served from a snapshot taken before the logs are committed (as in an MVCC scheme)? Is that possible, and could it be implemented?
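To make the idea concrete, here is a minimal sketch of one way such a non-blocking read could work: the follower loop publishes an immutable copy of the configuration after each change, and readers load the latest copy without going through `configurationsCh`. The `configStore` type and its methods are hypothetical, not part of hashicorp/raft:

```go
// Package configread sketches MVCC-style reads: the writer publishes
// immutable snapshots, readers load the latest one lock-free, so a
// Stats call never queues behind the follower loop.
// Hypothetical names; not the hashicorp/raft API.
package configread

import "sync/atomic"

// Configuration stands in for raft.Configuration.
type Configuration struct {
	Servers []string
}

type configStore struct {
	latest atomic.Value // holds *Configuration, never mutated in place
}

// Publish would be called from the follower loop whenever the
// configuration changes; it installs a fresh immutable snapshot.
func (s *configStore) Publish(c Configuration) {
	s.latest.Store(&c) // store a copy so readers never see a partial update
}

// Load would be called by Stats (or any read-only request) and never
// blocks, even while storeLogs is waiting on fdatasync.
func (s *configStore) Load() *Configuration {
	c, _ := s.latest.Load().(*Configuration) // nil until first Publish
	return c
}
```

A read served this way can be slightly stale relative to in-flight configuration changes, which seems acceptable for `Stats`-style introspection; the key property is that it never waits behind an `appendEntries` batch stuck on `fdatasync`.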