About a year ago, the consensus runtime I'd been building started doing something annoying. The setup was straightforward: a Tendermint-style chained BFT with five masternodes finalising blocks proposed by a rotating set of lightnodes, which are partitioned into committees of 5 to 10 nodes each (we call them "groups"). The design was textbook. The implementation worked fine on a single machine, fine on two machines in the same datacenter, fine on three machines across two regions. Then we put it on a real testbed, four VMs across three geographic regions (US-East, EU-Central, EU-North) with 26 masternodes and 115 lightnodes, and started pushing realistic load through it. About 10³ transactions per second, distributed across four RPC endpoints, sustained. And about every third group-formation transition, the BFT certificate would stall.…