Collection of papers that I have personally found helpful for system design (updating).
Dissecting Dynamo: Highly available key value store at Silicon Valley Code Camp 2018
USENIX ATC '13 - TAO: Facebook’s Distributed Data Store for the Social Graph
Scaling Memcache at Facebook
Turning Caches into Distributed Systems with mcrouter - Data@Scale
Large-Scale Low-Latency Storage for the Social Network - Data@Scale
Dataswarm
Large-scale cluster management at Google with Borg
Scaling YouTube's Backend: The Vitess Trade-offs - @Scale 2014 - Data
GOTO 2016 • What is a CDN and why Developers should Care about using one
Scaling Uber's Real-time Market Platform
The Paxos Algorithm
CONSOLIDATING STORAGE BACKENDS AT SCALE WITH TECTONIC FILESYSTEM
Wormhole vs Scribe
- data loss: Scribe is tailored for high availability and may lose data in case of machine failures; Wormhole has minimal data loss.
- order of updates: Wormhole maintains oder; Scribe may deliver updates out of order.
- latency: Wormhole has a lower latency than Scribe.
- intermediate broker: Scribe uses broker; Wormhole does not, and Wormhole publishers are specialized for different data systems.
Paxos: concensus algorithm
- Chubby: Google's lock service
- Zookeeper
Dataswarm: dependency graph description language
- fault tolerance
- selectivity
- remote execution
Kubernetes, Borg, Tupperware
- robust to failures
- multiple regions
- monitoring
- consistent environments
- updates
BigTable
- write append
- Sstable
- bloom filter
Kafka
- message queuing system with twists (could work as low-latency message queues, or log aggregator)
- high throughput, thanks to sequential I/O, and fewer data copies / system calls
- pull model: consumer can rewind back and re-consume data
- pulisher-subscriber patterns
Cassandra
- consistent hashing: move nodes on the ring to adjust load
- data model: column family
- replication policy: next N-1 nodes on the ring
- local persistence: commit logs, Sstable, bloom filter (similar to bigtable)
Dynamo
- consistent hashing: virtual nodes
- data model: key-value store
- replication policy: next N-1 nodes on the ring
QPS
- web server, SQL database: 1K
- NoSQL database (e.g. Cassandra): 10K
- Mechached: 100K
News feed (such as Facebook News Feed)
- pull vs push
- fanout on write (push): news feed is generated in real-time, fetching news feed is fast
- store <user_id, post_id> in news feed table
- fanout on read (pull): no resource waste for inactive users, no hote key proble (celebrity)
- pull friends' posts from DB when user loads news feed
- hybrid: push model for most friends of the user, pull model for celebrities followed by the user
- fanout on write (push): news feed is generated in real-time, fetching news feed is fast
Chat system (such as Messenger)
- SQL database for user generic data
- thread table: primary key=(thread_id, owner_id)
- NoSQL database for chat messages storage
- message table: row_key=thread_id
- prefer websocket over polling and long polling for client-server connection
- chat service is stateful with persistent network connection through websocket (HTTp based servers are usually stateless)
- Online status: pull model, pub-sub