utkarshsingh99/ShardPaxosBank


ShardPaxosBank - A Fault-Tolerant Distributed Transaction System

This project implements a fault-tolerant distributed transaction processing system designed to support a simple banking application.

The system incorporates key distributed systems concepts such as sharding, replication, Paxos consensus algorithm, and the two-phase commit protocol to achieve scalability, fault tolerance, and correctness in transaction handling.

Features

Data Sharding and Replication:

Data is partitioned across multiple shards, each replicated across servers within a cluster. Fault tolerance is achieved by ensuring data availability even if one server in a cluster fails.
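The single-failure tolerance described above follows from majority quorums. A minimal sketch of the arithmetic, assuming a cluster of 3 servers (the README mentions 9 nodes in total but does not state the cluster size, so 3 is an assumption; `majority` and `toleratedFailures` are illustrative names, not the project's):

```go
package main

// majority returns the smallest number of servers that forms a strict
// majority of an n-server cluster.
func majority(n int) int {
	return n/2 + 1
}

// toleratedFailures returns how many servers may be down while a
// majority of the n-server cluster can still be reached.
func toleratedFailures(n int) int {
	return n - majority(n)
}
```

With `n = 3`, `majority(3)` is 2 and `toleratedFailures(3)` is 1, which matches the "one server in a cluster fails" case.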


Transactional Support:

Intra-Shard Transactions: Access data within a single shard; they are processed using a modified Paxos protocol.
Cross-Shard Transactions: Access data in more than one shard; they are handled using the two-phase commit (2PC) protocol, with Paxos providing consensus inside each participating shard.
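Whether a transaction is intra-shard or cross-shard depends on how account IDs map to shards. The sketch below assumes simple range partitioning with a hypothetical `itemsPerShard` of 1000 and 1-based account IDs; the project's actual mapping is not given in this README, and `shardOf`, `Transfer`, and `isCrossShard` are illustrative names.

```go
package main

// itemsPerShard is a hypothetical shard size chosen for illustration.
const itemsPerShard = 1000

// Transfer moves Amount from account From to account To.
type Transfer struct {
	From, To, Amount int
}

// shardOf maps a 1-based account ID to a 0-based shard index using
// range partitioning: IDs 1..1000 -> shard 0, 1001..2000 -> shard 1, ...
func shardOf(accountID int) int {
	return (accountID - 1) / itemsPerShard
}

// isCrossShard reports whether the two accounts of a transfer live in
// different shards, in which case 2PC would be needed.
func isCrossShard(t Transfer) bool {
	return shardOf(t.From) != shardOf(t.To)
}
```

Under these assumptions a transfer from account 1 to account 999 stays inside shard 0, while a transfer from 1000 to 1001 spans shards 0 and 1.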

Consensus Mechanisms:

Modified Paxos Protocol: Ensures fault-tolerant ordering and execution of transactions within a shard.
Two-Phase Commit Protocol: Coordinates cross-shard transactions to maintain atomicity and consistency across clusters.
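The coordinator's side of two-phase commit can be sketched as follows. This is a generic illustration, not the project's code: `Participant`, `runTwoPhaseCommit`, and `stubShard` are hypothetical names, and each participant is an in-memory stub, whereas in the system described here a participant would be a cluster that reaches Paxos consensus before voting.

```go
package main

// Participant is a hypothetical stand-in for one shard's cluster.
type Participant interface {
	Prepare(txID int) bool // vote: true means "ready to commit"
	Commit(txID int)
	Abort(txID int)
}

// runTwoPhaseCommit collects a vote from every participant (phase 1)
// and then sends the same decision to all of them (phase 2).
// It returns true if the transaction committed.
func runTwoPhaseCommit(txID int, parts []Participant) bool {
	allYes := true
	for _, p := range parts {
		if !p.Prepare(txID) {
			allYes = false
			break // one "no" vote is enough to abort
		}
	}
	for _, p := range parts {
		if allYes {
			p.Commit(txID)
		} else {
			p.Abort(txID)
		}
	}
	return allYes
}

// stubShard is an in-memory participant with a fixed vote.
type stubShard struct {
	vote  bool
	state string
}

func (s *stubShard) Prepare(_ int) bool {
	if s.vote {
		s.state = "prepared"
	}
	return s.vote
}
func (s *stubShard) Commit(_ int) { s.state = "committed" }
func (s *stubShard) Abort(_ int)  { s.state = "aborted" }
```

The key property is that every participant receives the same outcome: either all commit or all abort.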

Fault Tolerance:

Assumes a fail-stop failure model and uses locking to isolate concurrent transactions. A transaction aborts on insufficient balance, on lock contention, or when consensus cannot reach a quorum.
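Two of the abort conditions listed above, lock contention and insufficient balance, can be illustrated with a small in-memory sketch. All names here (`Shard`, `Hold`, `Release`, `Transfer`, the error values) are hypothetical; the quorum check is left out, and a real server would keep the locks held while consensus runs rather than applying the transfer immediately.

```go
package main

import (
	"errors"
	"sync"
)

var (
	ErrLockContention      = errors.New("lock contention")
	ErrInsufficientBalance = errors.New("insufficient balance")
)

// Shard holds balances plus a per-account lock table.
type Shard struct {
	mu       sync.Mutex
	balances map[int]int
	held     map[int]bool // accounts locked by an in-flight transaction
}

func NewShard(initial map[int]int) *Shard {
	b := make(map[int]int, len(initial))
	for k, v := range initial {
		b[k] = v
	}
	return &Shard{balances: b, held: make(map[int]bool)}
}

// Hold locks an account on behalf of an in-flight transaction.
// It returns false if the account is already locked.
func (s *Shard) Hold(account int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.held[account] {
		return false
	}
	s.held[account] = true
	return true
}

// Release unlocks an account.
func (s *Shard) Release(account int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.held, account)
}

// Balance returns the current balance of an account.
func (s *Shard) Balance(account int) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.balances[account]
}

// Transfer aborts if either account is locked or the sender cannot
// cover the amount; otherwise it applies the transfer atomically.
func (s *Shard) Transfer(from, to, amount int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.held[from] || s.held[to] {
		return ErrLockContention
	}
	if s.balances[from] < amount {
		return ErrInsufficientBalance
	}
	s.balances[from] -= amount
	s.balances[to] += amount
	return nil
}
```

Aborting immediately on a held lock, instead of waiting, is one simple way to avoid deadlock between transactions that lock the same accounts in different orders.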

Performance Metrics:

Measures throughput (transactions per second) and latency (time to process a transaction).

Key Distributed Systems Concepts

  1. Consensus Protocols: Paxos and Multi-Paxos for intra-shard consistency.
  2. Atomicity and Consistency: Achieved using two-phase commit for cross-shard transactions.
  3. Fault Tolerance: Replication within clusters and recovery mechanisms like write-ahead logs (WAL).
  4. Concurrency Control: Two-phase locking to manage concurrent transactions and prevent conflicts.
  5. Scalability: Partitioning spreads data and load across clusters, so the system can handle larger datasets and distributed workloads.
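The write-ahead log mentioned in item 3 can be illustrated with an undo-style sketch: the old value is logged before each write, so an aborted transaction can be rolled back. This is a generic illustration under stated assumptions, not the project's implementation. `Store`, `walEntry`, and the method names are hypothetical, and the log lives in memory, whereas a real WAL would be flushed to durable storage before the write is applied.

```go
package main

// walEntry records the value an account had before a transaction wrote it.
type walEntry struct {
	txID, account, oldBalance int
}

// Store is a balance table with an in-memory undo log.
type Store struct {
	balances map[int]int
	wal      []walEntry
}

func NewStore(initial map[int]int) *Store {
	b := make(map[int]int, len(initial))
	for k, v := range initial {
		b[k] = v
	}
	return &Store{balances: b}
}

// Write logs the old balance first, then applies the new one.
func (s *Store) Write(txID, account, newBalance int) {
	s.wal = append(s.wal, walEntry{txID, account, s.balances[account]})
	s.balances[account] = newBalance
}

// Commit makes the transaction's writes final by dropping its log entries.
func (s *Store) Commit(txID int) {
	s.discard(txID)
}

// Rollback restores old balances in reverse order, then drops the entries.
func (s *Store) Rollback(txID int) {
	for i := len(s.wal) - 1; i >= 0; i-- {
		if e := s.wal[i]; e.txID == txID {
			s.balances[e.account] = e.oldBalance
		}
	}
	s.discard(txID)
}

// discard removes all log entries that belong to txID.
func (s *Store) discard(txID int) {
	kept := s.wal[:0]
	for _, e := range s.wal {
		if e.txID != txID {
			kept = append(kept, e)
		}
	}
	s.wal = kept
}
```

In a cross-shard transaction, a participant that has prepared but later receives an abort decision could use such a log to undo its tentative writes.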

Skills I developed while working on this project

  1. Distributed systems design: sharding, replication, consensus protocols.
  2. Implementation of fault-tolerant protocols like Paxos and 2PC.
  3. Concurrency handling with locks and conflict resolution.
  4. RPC communication using Go's net/rpc package.
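For readers unfamiliar with `net/rpc`, a minimal sketch of a balance query between two processes might look like the following. The `Node` type, the `GetBalance` method, and the helper functions are hypothetical and not taken from the project; only the `net` and `net/rpc` calls are standard library APIs.

```go
package main

import (
	"net"
	"net/rpc"
)

// BalanceArgs and BalanceReply are hypothetical request/response types.
type BalanceArgs struct{ Account int }
type BalanceReply struct{ Balance int }

// Node is a hypothetical server holding account balances.
type Node struct {
	balances map[int]int
}

// GetBalance follows the net/rpc method shape: exported method,
// two arguments (the second a pointer), and an error return.
func (n *Node) GetBalance(args BalanceArgs, reply *BalanceReply) error {
	reply.Balance = n.balances[args.Account]
	return nil
}

// serve registers the node and starts accepting RPC connections on a
// free local port. It returns the address the node listens on.
func serve(n *Node) (string, error) {
	srv := rpc.NewServer()
	if err := srv.Register(n); err != nil {
		return "", err
	}
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go srv.Accept(ln)
	return ln.Addr().String(), nil
}

// queryBalance dials a node and asks for one account's balance.
func queryBalance(addr string, account int) (int, error) {
	client, err := rpc.Dial("tcp", addr)
	if err != nil {
		return 0, err
	}
	defer client.Close()
	var reply BalanceReply
	err = client.Call("Node.GetBalance", BalanceArgs{Account: account}, &reply)
	return reply.Balance, err
}
```

Running each node as its own process on a distinct port, as in the tmux setup described in this README, would replace the `127.0.0.1:0` address with a fixed port per node.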

Outputs:

Provides the balance of any data item across all servers.
Logs committed transactions for debugging and auditing.
Performance metrics of each node in the system.

The screenshots in the repository show the output of a tmux session running 9 nodes as separate processes on different ports.

The full spec doc of this project can be found here.

