spent a few months figuring this out properly so figured i’d write it up the RTX 5090 availability problem is weirder than it looks. every provider lists it on their pricing page. actually getting one when you need it, at the node quality you need, during a demand spike, is a different question entirely. my context: distributed inference, 70B class models, need multiple nodes running simultaneously, cannot have a node fail mid-job and require manual recovery. also not interested in buying and racking hardware. what i tested AWS and Azure technically have high-end GPU access but on-demand RTX 5090 is painful in practice. you’re either waiting, on a waitlist, or paying for reserved capacity you don’t want to commit to before you know your demand shape. the provisioning time alone makes it hard for anything elastic. Vast.ai has RTX 5090 and the price is often the lowest you’ll find.…