Three protocols dominate production distributed consensus: Raft, Multi-Paxos, and Zab. They solve the same problem with different design trade-offs. Knowing the differences helps you read the source of any distributed system you depend on.

Why we have so many

Paxos was correct but notoriously hard to understand. Raft was designed for understandability while giving equivalent guarantees. Zab was built for Apache ZooKeeper and predates Raft. Multi-Paxos is the form of Paxos that runs in production, as in Google's Chubby and Spanner. Same goal, different journeys.

Raft for understandability

Leader-based, log-replication-centric, with randomized election timeouts. The three-state model (follower, candidate, leader) and the log-up-to-date check (a node votes only for a candidate whose log is at least as current as its own) are easy to reason about. Used in etcd, Consul, CockroachDB, TiKV. A common default for new systems built since about 2020.
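The two mechanisms named above, the log-up-to-date check and randomized election timeouts, are small enough to sketch in a few lines. This is a minimal illustration, not code from any of the systems listed: `LogPosition`, `candidate_log_is_up_to_date`, and `random_election_timeout` are hypothetical names, and the 150 to 300 ms window is only an illustrative default.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LogPosition:
    """Term and index of the last entry in a node's log (hypothetical type)."""
    last_term: int
    last_index: int


def candidate_log_is_up_to_date(candidate: LogPosition, voter: LogPosition) -> bool:
    """Raft's vote rule: compare last terms first, then last indexes.

    A voter grants its vote only if this returns True, which keeps a node
    with a stale log from becoming leader.
    """
    if candidate.last_term != voter.last_term:
        return candidate.last_term > voter.last_term
    return candidate.last_index >= voter.last_index


def random_election_timeout(base_ms: float = 150, spread_ms: float = 150, rng=random) -> float:
    """Pick a timeout in [base_ms, base_ms + spread_ms).

    Randomizing the timeout makes it unlikely that several followers become
    candidates at the same moment and split the vote. The default window is
    an assumption chosen for illustration.
    """
    return base_ms + rng.random() * spread_ms
```

Note that a higher last term wins regardless of log length: a candidate at (term 3, index 1) is more up-to-date than a voter at (term 2, index 10).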

Multi-Paxos in production

Reuses a single leader across many decisions, so the prepare phase runs once per leadership change rather than once per decision. Spanner's use of it shows that it works at planetary scale. More complex than Raft to implement correctly; many production codebases are decade-old, heavily optimized variants.
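A toy in-memory model can show why a stable leader is cheaper: phase 1 (prepare) runs once for a ballot, and every later log slot needs only phase 2 (accept). This is a sketch under heavy assumptions, not a usable implementation. `Acceptor` and `Leader` are hypothetical classes, there is no network or failure handling, and the step where a new leader recovers values already accepted under older ballots is deliberately omitted, although real Multi-Paxos requires it for safety.

```python
from dataclasses import dataclass, field


@dataclass
class Acceptor:
    promised_ballot: int = -1
    accepted: dict = field(default_factory=dict)  # slot -> (ballot, value)

    def prepare(self, ballot: int) -> bool:
        """Phase 1: promise to reject lower ballots. One promise covers all slots."""
        if ballot > self.promised_ballot:
            self.promised_ballot = ballot
            return True
        return False

    def accept(self, ballot: int, slot: int, value: str) -> bool:
        """Phase 2: accept the value unless a higher ballot has been promised."""
        if ballot >= self.promised_ballot:
            self.promised_ballot = ballot
            self.accepted[slot] = (ballot, value)
            return True
        return False


class Leader:
    def __init__(self, ballot: int, acceptors: list):
        self.ballot = ballot
        self.acceptors = acceptors
        self.quorum = len(acceptors) // 2 + 1
        self.established = False
        self.phase1_rounds = 0
        self.next_slot = 0

    def establish(self) -> bool:
        """Run phase 1 once for this ballot.

        Simplification: previously accepted values are NOT recovered here.
        """
        self.phase1_rounds += 1
        promises = sum(a.prepare(self.ballot) for a in self.acceptors)
        self.established = promises >= self.quorum
        return self.established

    def propose(self, value: str):
        """Choose a value for the next slot; returns the slot, or None if preempted."""
        if not self.established and not self.establish():
            return None
        slot = self.next_slot
        acks = sum(a.accept(self.ballot, slot, value) for a in self.acceptors)
        if acks < self.quorum:
            self.established = False  # a higher ballot has taken over
            return None
        self.next_slot += 1
        return slot
```

With three acceptors, proposing three values in a row fills slots 0, 1, and 2 while `phase1_rounds` stays at 1. If a rival `Leader` with a higher ballot calls `establish()`, the old leader's next `propose` returns `None`.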

Zab and ZooKeeper

Zab stands for ZooKeeper Atomic Broadcast. Leader-based with primary order: every state change goes through the leader, and followers apply changes in the order the leader issued them. Strong ordering guarantees, used for coordination and metadata workloads. Older than Raft; ZooKeeper is being supplanted by Raft-based etcd in many deployments.
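One concrete way to see the ordering guarantee is the transaction id ZooKeeper attaches to each change, the zxid: a 64-bit number with the leader epoch in the high 32 bits and a per-epoch counter in the low 32 bits. The sketch below only illustrates that layout; `make_zxid` and `split_zxid` are hypothetical helpers, not ZooKeeper APIs.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch: int, counter: int) -> int:
    """Pack a leader epoch and a per-epoch counter into one integer."""
    if not 0 <= counter <= COUNTER_MASK:
        raise ValueError("counter must fit in 32 bits")
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return (epoch << COUNTER_BITS) | counter


def split_zxid(zxid: int) -> tuple:
    """Return (epoch, counter) for a packed zxid."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK
```

Because the epoch occupies the high bits, plain integer comparison orders every change from a newer leader after every change from an older one, and orders changes within an epoch by counter.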

Choosing in 2026

New system: use Raft. Library: etcd's raft library or hashicorp/raft in Go, raft-rs in Rust, SOFAJRaft (jraft) in Java. Don't roll your own; correctness bugs in consensus code are silent and devastating. Reach for Paxos only if you have a specific reason (an existing deployment, a particular optimization).

Raft for new code. Multi-Paxos at planetary scale where it's already deployed. Zab is legacy. Use a library; never implement from scratch.