Byzantine Fault Tolerance handles nodes that lie, not just nodes that crash. The cost is real (3f+1 nodes to tolerate f Byzantine failures, more message rounds). The benefit is real too — but only when your threat model genuinely includes lying nodes.
The Byzantine fault model
Crash failures: node stops. Byzantine: node sends arbitrary messages — could be malicious, could be corrupted memory, could be buggy. To tolerate f Byzantine nodes you need 3f+1 total (vs 2f+1 for crash-only). The extra node-count is the floor cost.
PBFT and its descendants
Practical Byzantine Fault Tolerance (Castro & Liskov, 1999) was the first viable production BFT. Three-phase commit (pre-prepare, prepare, commit). O(n²) messages per decision. Newer variants (HotStuff) reduce this to O(n).
Tendermint and HotStuff
Tendermint (Cosmos): leader-based BFT with timeouts. HotStuff (Libra/Diem): linear message complexity, pipelined. These are the modern blockchain-targeted BFT protocols. Production-quality, well-audited.
Where BFT earns its cost
Public blockchains (untrusted participants). Cross-org consortium systems (members don't fully trust each other). Critical infrastructure where adversarial node compromise is in scope (defense, financial settlement). Rarely needed in a single-org internal system.
Where to skip
Inside one trust domain (your company's services): use Raft, not BFT. The cost-benefit doesn't work. If you're seeing 'BFT' in an internal architecture proposal, push back — usually it's cargo culting from the blockchain world.