Data, Design Principles

Regunath B edited this page Nov 18, 2020 · 2 revisions

Most distributed data stores can be analyzed through the CAP theorem and consequently position themselves as CP or AP systems. The requirements that motivated dkv (described in Motivation, Needs) call for trading off Consistency against Availability on a case-by-case basis.

dkv therefore adheres to these data and design principles:

  • Allow dkv to be deployed in a manner that lets its clients choose their consistency requirements at deployment time, e.g. a Raft-based cluster (strongly consistent quorum writes) vs. a Master-Slave cluster (eventually consistent asynchronous writes).
  • Provide an API to read data at a specified consistency level, e.g. get(K,consistency).

Each dkv data node is an independent runtime that serves data read and write requests. Data consistency checks happen during writes (Raft quorum), while the read APIs only determine which node to access in order to read the data (e.g. the Raft leader for Linearizable reads, a Raft follower for Sequentially consistent reads, or a Slave for Eventually consistent reads). The state managed by each dkv node is intentionally kept independent, so that on a cluster of any size the dkv runtime will serve a data read request as long as at least one node is alive. In that case the data set is limited to what is hosted on that node (in a sharded setup), and data consistency is at best eventual (e.g. if the live node is a Slave). dkv clients may therefore choose the consistency level at request time, e.g. call get(K,dkv.LINEARIZABLE) and fall back to get(K,dkv.EVENTUALLY_CONSISTENT) if they can trade off consistency for read availability.
