Two-phase commit (2PC) gets a bad rap as 'the blocking protocol'. Used as a general-purpose distributed transaction mechanism, that reputation is earned. Used narrowly (short transactions, few participants, a robust coordinator), it remains the right tool for specific jobs.
The protocol
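Phase 1 (prepare): the coordinator asks every participant 'can you commit?'. Each participant writes a tentative entry to its log and replies yes or no. A yes is a promise: the participant must still be able to commit later, even if it crashes and restarts in between. Phase 2 (commit/abort): if every participant voted yes, the coordinator tells them all to commit; participants apply the change and acknowledge. If any participant votes no or fails to reply before the timeout, the coordinator tells them all to abort.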
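The two phases can be sketched in a few lines. This is a minimal in-memory illustration, not a production implementation: the `Participant` class, the `two_phase_commit` function, and the list-based `decision_log` are hypothetical names, plain lists stand in for durable logs, and method calls stand in for network messages.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical in-memory stand-in for a resource manager."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []        # stands in for a durable log
        self.state = "init"

    def prepare(self, txid):
        if not self.can_commit:
            self.log.append((txid, "abort"))
            self.state = "aborted"
            return Vote.NO
        # Write the tentative entry before voting yes.
        self.log.append((txid, "prepared"))
        self.state = "prepared"
        return Vote.YES

    def commit(self, txid):
        self.log.append((txid, "commit"))
        self.state = "committed"

    def abort(self, txid):
        self.log.append((txid, "abort"))
        self.state = "aborted"


def two_phase_commit(txid, participants, decision_log):
    # Phase 1: collect votes; an exception (e.g. a timeout) counts as a no.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare(txid))
        except Exception:
            votes.append(Vote.NO)

    decision = "commit" if all(v is Vote.YES for v in votes) else "abort"
    # Record the decision before announcing it, so that a restarted
    # coordinator could look it up and resend it.
    decision_log.append((txid, decision))

    # Phase 2: tell every participant the outcome.
    for p in participants:
        if decision == "commit":
            p.commit(txid)
        else:
            p.abort(txid)
    return decision


decisions = []
happy = [Participant("orders"), Participant("inventory")]
outcome_ok = two_phase_commit("tx-1", happy, decisions)

unhappy = [Participant("orders"), Participant("inventory", can_commit=False)]
outcome_bad = two_phase_commit("tx-2", unhappy, decisions)
```

In this sketch a single no vote is enough to abort every participant, including those that voted yes.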
The blocking problem
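If the coordinator crashes after phase 1 but before phase 2, every participant that voted yes is stuck: it cannot commit or abort on its own, so it holds its locks until the coordinator recovers. Other transactions that need those locks queue up behind it, so with many participants or a coordinator that is slow to recover, the stall can cascade.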
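The asymmetry behind the blocking problem shows up in what a participant may safely do when it stops hearing from the coordinator. The function below is a hypothetical sketch; the state names ("init", "prepared", "committed", "aborted") are assumptions chosen for illustration.

```python
def on_coordinator_timeout(state):
    """Return what a participant may do after losing contact with the coordinator.

    `state` is the participant's local state for one transaction:
    "init" (has not voted), "prepared" (voted yes), "committed", or "aborted".
    """
    if state == "init":
        # No promise has been made yet, so a unilateral abort is safe.
        return "abort"
    if state == "prepared":
        # Voted yes: the coordinator may have decided either way.
        # Keep the locks and keep waiting; guessing could contradict
        # what other participants were told.
        return "wait"
    # The outcome is already known locally; nothing left to decide.
    return "done"
```

Only the "prepared" state blocks, which is why the window between the two phases is the dangerous one.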
Where 2PC still fits
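Cross-shard transactions inside a single trust domain (e.g., CockroachDB, Spanner). Short critical-section transactions (under 100 ms) with a bounded number of participants (2-5). In both cases the coordinator should be highly available, typically by replicating its state through a consensus protocol (Raft in CockroachDB, Paxos in Spanner), so that a single node failure does not leave participants blocked.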
Saga as the alternative
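Sagas suit long-running transactions that span services. Each step commits locally and has a compensating action that undoes it; if a later step fails, the completed steps are compensated, with retries where needed. Typical uses are checkout flows and payment processing. Nothing blocks, but isolation is weaker: intermediate states are visible to other transactions. This is the right shape when 2PC's locks would cause too much contention.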
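A saga can be sketched as a list of (action, compensation) pairs. The `run_saga` helper and the checkout steps below are hypothetical, and the sketch assumes compensations run in reverse order and never fail; a real orchestrator would persist its progress and retry compensations.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, run the compensations of the steps that already
    completed, most recent first, and return False. Return True otherwise.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


events = []


def fail_shipping():
    raise RuntimeError("no courier available")


ok = run_saga([
    (lambda: events.append("reserve stock"), lambda: events.append("release stock")),
    (lambda: events.append("charge card"), lambda: events.append("refund card")),
    (fail_shipping, lambda: events.append("cancel shipment")),
])
```

Between "charge card" and "refund card" the charge is visible to anyone who looks, which is the weaker isolation described above.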
Per-message idempotency
Whatever protocol you pick, make participants idempotent. Coordinators crash, retries happen, and the network duplicates messages. An operation that is safe to repeat survives all of this; one that is not needs explicit 'have I done this?' tracking, typically keyed on a unique request ID.
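One common way to do the 'have I done this?' tracking is to record each request ID alongside its result. The `Account` class below is a hypothetical toy; it keeps the record in memory, whereas a real participant would have to write the record and the state change atomically to durable storage.

```python
class Account:
    """Hypothetical participant that deduplicates by request ID."""

    def __init__(self, balance=0):
        self.balance = balance
        self.applied = {}  # request_id -> balance after that request

    def credit(self, request_id, amount):
        if request_id in self.applied:
            # Retry or duplicate delivery: return the recorded result
            # and change nothing.
            return self.applied[request_id]
        self.balance += amount
        self.applied[request_id] = self.balance
        return self.balance


account = Account(balance=100)
first = account.credit("req-1", 50)
retry = account.credit("req-1", 50)   # duplicate of the same request
second = account.credit("req-2", 10)
```

The retried `req-1` returns the same result as the first delivery and leaves the balance alone, so the sender can retry freely.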