Design
dkv is designed with the following principles and capabilities:
- Ability to distribute copies of data across cluster members. This distribution serves two primary purposes:
  - Improve data durability guarantees by maintaining multiple copies of data, kept consistent using consensus protocols. dkv uses Raft via the Nexus project for data replication across DCs.
  - Scale read throughput by replicating data to multiple Followers (typically across DCs) that serve reads or can be promoted to Leader, and to Slaves used primarily to serve reads.
- A simplified runtime that combines concerns such as the service interface, storage, and the replication mechanism (distributed consensus) into a single OS process.
- Ability to incrementally add features such as sharding and richer data structures (e.g., queues) as the system evolves.
- An efficient API for remote clients (using gRPC) and for network abstraction via sidecar processes (using Envoy).
- Support for expanding an existing cluster by adding new dkv nodes, and for restoring data on existing nodes from backups (with some caveats on consistency).
- An API for the most commonly used KV operations such as Put, Get, and Iterate (see the client sketch after this list).
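
As a concrete illustration of the gRPC surface, the sketch below shows a remote client issuing Put and Get calls against a dkv node. The import path, service name (`DKV`), and message/field names are assumptions based on typical gRPC code generation, not a verbatim copy of dkv's generated stubs.

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Assumed import path for dkv's generated gRPC stubs.
	serverpb "github.com/flipkart-incubator/dkv/pkg/serverpb"
)

func main() {
	// Dial the dkv node (plaintext here; a production setup would typically
	// use TLS, often handled by an Envoy sidecar as noted above).
	conn, err := grpc.Dial("127.0.0.1:8080",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("failed to connect: %v", err)
	}
	defer conn.Close()

	client := serverpb.NewDKVClient(conn)
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Put a key-value pair; the write is replicated via Raft before being acked.
	if _, err := client.Put(ctx, &serverpb.PutRequest{
		Key: []byte("hello"), Value: []byte("world"),
	}); err != nil {
		log.Fatalf("Put failed: %v", err)
	}

	// Read the value back.
	res, err := client.Get(ctx, &serverpb.GetRequest{Key: []byte("hello")})
	if err != nil {
		log.Fatalf("Get failed: %v", err)
	}
	log.Printf("hello => %s", res.Value)
}
```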
The diagram below depicts a cluster set up across 2 DCs simulating 3 availability zones, supporting linearizable and sequentially consistent reads. It also shows data replication within a DC using a CDC (Change Data Capture) mechanism to scale reads with eventual consistency.
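
To make the consistency tiers in this topology concrete, here is a minimal sketch of client-side read routing: linearizable reads must go through the Raft Leader, sequentially consistent reads may be served by Raft Followers (since replicas apply the log in the same order), and eventually consistent reads can go to CDC-fed Slaves. The `ReadConsistency` type and the endpoint names are illustrative assumptions, not dkv's actual client API.

```go
package main

import "fmt"

// ReadConsistency enumerates the read guarantees discussed above.
// This type and the endpoints below are illustrative, not dkv's API.
type ReadConsistency int

const (
	Linearizable ReadConsistency = iota // must be served by the Raft Leader
	Sequential                          // may be served by any Raft Follower
	Eventual                            // may be served by a CDC-fed Slave
)

// pickEndpoint routes a read to a node that can honor the requested guarantee.
func pickEndpoint(rc ReadConsistency) string {
	switch rc {
	case Linearizable:
		return "dkv-leader.dc1:8080"
	case Sequential:
		return "dkv-follower.dc2:8080"
	default:
		return "dkv-slave.dc1:8080"
	}
}

func main() {
	for _, rc := range []ReadConsistency{Linearizable, Sequential, Eventual} {
		fmt.Printf("consistency %d -> read from %s\n", rc, pickEndpoint(rc))
	}
}
```

The trade-off this routing captures is the usual one: stronger guarantees concentrate reads on fewer nodes (ultimately just the Leader), while eventual consistency lets the CDC-fed Slaves absorb read traffic at the cost of staleness.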