System Design Papers

data loss: Scribe is tailored for high availability and may lose data in case of machine failures; Wormhole has minimal data loss.
order of updates: Wormhole maintains order; Scribe may deliver updates out of order.
latency: Wormhole has a lower latency than Scribe.
intermediate broker: Scribe uses a broker; Wormhole does not, and Wormhole publishers are specialized for different data systems.

Paxos: consensus algorithm

Chubby: Google's lock service
ZooKeeper

Dataswarm: dependency graph description language

fault tolerance
selectivity
remote execution
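
The dependency-graph idea can be illustrated with a small sketch. This is not Dataswarm's actual description language; the task names and the `downstream_of` helper are hypothetical, and the ordering relies on Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}

def execution_order(graph):
    """Return one order in which every task runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())

def downstream_of(graph, task):
    """Return every task that directly or transitively depends on `task`."""
    affected = set()
    frontier = [task]
    while frontier:
        current = frontier.pop()
        for name, deps in graph.items():
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected

order = execution_order(pipeline)
```

If `clean` fails, `downstream_of(pipeline, "clean")` names the tasks that would need to be re-run, which is one way a scheduler could limit recovery work after a failure.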

Kubernetes, Borg, Tupperware

robust to failures
multiple regions
monitoring
consistent environments
updates

BigTable

write append
SSTable
bloom filter
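
A minimal bloom filter sketch, to show why it helps skip SSTables that cannot contain a key. The sizes, the SHA-256 based hashing, and the class and method names are assumptions chosen for illustration, not what BigTable uses.

```python
import hashlib

class BloomFilter:
    """Set membership with no false negatives and a small false-positive rate."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

A read path would consult the filter first and touch the on-disk SSTable only when `might_contain` returns True.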

Kafka

message queuing system with twists (could work as low-latency message queues, or log aggregator)
high throughput, thanks to sequential I/O, and fewer data copies / system calls
pull model: consumer can rewind and re-consume data
publisher-subscriber pattern
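
The pull model can be sketched as an append-only log plus a consumer-owned offset. This is a toy in-memory model, not Kafka's client API; `PartitionLog`, `Consumer`, `poll`, and `seek` are illustrative names for this sketch.

```python
class PartitionLog:
    """Append-only sequence of records, addressed by offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records=10):
        return self._records[offset:offset + max_records]

class Consumer:
    """Tracks its own position, so the log keeps no per-consumer state."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_records=10):
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        # Rewinding is just moving the offset back; records are not deleted on read.
        self.offset = offset
```

Because the consumer decides where to read from, re-consuming data after a bug fix or crash is a matter of calling `seek` with an earlier offset.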

Cassandra

consistent hashing: move nodes on the ring to adjust load
data model: column family
replication policy: next N-1 nodes on the ring
local persistence: commit logs, SSTable, bloom filter (similar to BigTable)
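
The local persistence path can be sketched as: append to a commit log, update an in-memory table, and flush to a sorted immutable table when the in-memory table fills up. Everything here is held in memory for brevity; `TinyLSMStore` and `memtable_limit` are hypothetical names, and a real store would write the log and SSTables to disk and consult a bloom filter before each SSTable lookup.

```python
import bisect

class TinyLSMStore:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stand-in for an append-only file on disk
        self.memtable = {}     # recent writes, mutable
        self.sstables = []     # sorted, immutable tables, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable as a sorted run; its log entries are no longer needed.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest table wins
            keys = [k for k, _ in table]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return table[i][1]
        return None
```

Writes never modify an existing SSTable, which is what keeps disk I/O sequential; the cost is that reads may have to check several tables.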

Dynamo

consistent hashing: virtual nodes
data model: key-value store
replication policy: next N-1 nodes on the ring
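
A sketch of consistent hashing with virtual nodes and the "next N-1 nodes on the ring" replication policy. The MD5-based token function, the `vnodes_per_node` default, and the `HashRing` class are assumptions for illustration; the sketch skips virtual nodes that belong to an already chosen physical node so that replicas land on distinct machines.

```python
import bisect
import hashlib

def _token(value):
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes, vnodes_per_node=8):
        # Each physical node owns several tokens (virtual nodes) on the ring.
        self._ring = sorted(
            (_token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self._tokens = [token for token, _ in self._ring]

    def preference_list(self, key, n=3):
        """Return up to n distinct nodes: the coordinator plus its successors."""
        owners = []
        if not self._ring:
            return owners
        start = bisect.bisect_right(self._tokens, _token(key))
        for step in range(len(self._ring)):
            if len(owners) == n:
                break
            _, node = self._ring[(start + step) % len(self._ring)]
            if node not in owners:
                owners.append(node)
        return owners
```

For example, `HashRing(["a", "b", "c", "d"]).preference_list("user:42", n=3)` returns three distinct nodes, and the same key always maps to the same list while membership is unchanged.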

QPS

web server, SQL database: 1K
NoSQL database (e.g. Cassandra): 10K
Memcached: 100K
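
These rough per-node figures are mostly useful for back-of-the-envelope sizing. The sketch below assumes a hypothetical `headroom` factor (the fraction of a node's capacity you plan to use at peak); neither the function nor the factor comes from the notes.

```python
import math

# Rough per-node QPS figures from the list above.
QPS_PER_NODE = {
    "web server": 1_000,
    "SQL database": 1_000,
    "NoSQL database": 10_000,
    "Memcached": 100_000,
}

def nodes_needed(peak_qps, tier, headroom=0.5):
    """Estimate node count, using only `headroom` of each node's capacity."""
    usable_qps = QPS_PER_NODE[tier] * headroom
    return math.ceil(peak_qps / usable_qps)
```

At 50K peak QPS with 50% headroom, this gives 100 SQL database nodes, 10 NoSQL nodes, or a single Memcached node, which shows why read-heavy paths are usually served from cache.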

Examples

News feed (such as Facebook News Feed)

pull vs push
- fanout on write (push): news feed is generated in real-time, fetching news feed is fast
  - store <user_id, post_id> in news feed table
- fanout on read (pull): no resource waste for inactive users, no hot key problem (celebrity)
  - pull friends' posts from DB when user loads news feed
- hybrid: push model for most friends of the user, pull model for celebrities followed by the user
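
The hybrid approach can be sketched as follows. All storage is in-memory dictionaries, and `CELEBRITY_THRESHOLD`, `FeedService`, and the integer post ids are hypothetical choices for illustration; a real system would pick the threshold from follower counts and store the feed table in a database.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; real systems use a much larger value

class FeedService:
    def __init__(self):
        self.followers = defaultdict(set)    # author -> users following them
        self.following = defaultdict(set)    # user -> authors they follow
        self.posts = defaultdict(list)       # author -> post ids, oldest first
        self.feed_table = defaultdict(list)  # user -> pushed post ids
        self._next_post_id = 0

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def publish(self, author):
        self._next_post_id += 1
        post_id = self._next_post_id
        self.posts[author].append(post_id)
        if not self.is_celebrity(author):
            # Fanout on write: store <user_id, post_id> for each follower.
            for user in self.followers[author]:
                self.feed_table[user].append(post_id)
        return post_id

    def load_feed(self, user, limit=10):
        post_ids = set(self.feed_table[user])
        # Fanout on read: pull celebrity posts only when the feed is loaded.
        for author in self.following[user]:
            if self.is_celebrity(author):
                post_ids.update(self.posts[author])
        return sorted(post_ids, reverse=True)[:limit]  # newest first
```

Celebrity posts cost nothing at publish time, and ordinary posts are already in the feed table when the reader arrives, so each side avoids its worst case.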

Chat system (such as Messenger)

SQL database for user generic data
- thread table: primary key=(thread_id, owner_id)
NoSQL database for chat messages storage
- message table: row_key=thread_id
prefer WebSocket over polling and long polling for client-server connection
- chat service is stateful with persistent network connection through WebSocket (HTTP-based servers are usually stateless)
Online status: pull model, pub-sub
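
A sketch of the two storage layers, using in-memory SQLite for the thread table and a dictionary as a stand-in for the NoSQL message table. The `muted` column and the helper function names are hypothetical; only the keys (`thread_id`, `owner_id`) come from the notes above.

```python
import sqlite3
from collections import defaultdict

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE thread (
        thread_id INTEGER NOT NULL,
        owner_id  INTEGER NOT NULL,
        muted     INTEGER NOT NULL DEFAULT 0,  -- hypothetical per-user setting
        PRIMARY KEY (thread_id, owner_id)
    )
""")

# Stand-in for a NoSQL table with row_key=thread_id: one row holds a thread's messages.
messages = defaultdict(list)

def create_thread(thread_id, participant_ids):
    # One row per participant, so each user has their own view of the thread.
    db.executemany(
        "INSERT INTO thread (thread_id, owner_id) VALUES (?, ?)",
        [(thread_id, uid) for uid in participant_ids],
    )

def send_message(thread_id, sender_id, text):
    messages[thread_id].append((sender_id, text))

def threads_for(owner_id):
    rows = db.execute(
        "SELECT thread_id FROM thread WHERE owner_id = ? ORDER BY thread_id",
        (owner_id,),
    )
    return [row[0] for row in rows]
```

Keying messages by `thread_id` keeps a whole conversation in one row, so loading a thread is a single lookup, while the composite primary key lets per-user settings live next to the thread membership.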
