Docker + OKE: Running GPU Inference Containers on Oracle Cloud


DEV Community·Pavan Madduri·24 days ago


#oci #oke #gpu #docker #inference #model


I wanted to deploy an LLM inference API without spending $1,200/month on AWS GPU instances. OCI turned out to be significantly cheaper, and the Docker workflow was identical. Here's what I set up.

Why I Looked at OCI for GPU Workloads

I've been building GPU infrastructure tools for a while now (keda-gpu-scaler, otel-gpu-receiver, GPU NUMA scheduling for Volcano), and most of my testing was on AWS. The g5.xlarge instances with A10 GPUs run about $1.01/hr, plus $73/month for the EKS control plane. It adds up fast when you're iterating.

Someone on the Volcano Slack mentioned OCI's GPU pricing and I was skeptical. But when I looked it up, the numbers were real: same A10 GPU, roughly 40% cheaper, and OKE doesn't charge for the Kubernetes control plane at all. So I tried moving a vLLM inference workload over.

OCI GPU Pricing

Here's what OCI actually charges for GPU instances.…
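As a rough check on the figures quoted above, the monthly difference can be sketched in a few lines of Python. This is back-of-the-envelope arithmetic, not OCI's price list: it assumes a single instance running around the clock for a 730-hour month, and it assumes "roughly 40% cheaper" applies to the hourly GPU rate. The function and variable names are made up for illustration.

```python
# Back-of-the-envelope monthly cost comparison using the figures from the post.
# Assumptions (not from any official price list):
#   - one GPU instance running 24/7 for an average 730-hour month
#   - "roughly 40% cheaper" is applied to the hourly GPU rate

HOURS_PER_MONTH = 730  # assumed always-on usage


def monthly_cost(hourly_rate, control_plane_fee=0.0, hours=HOURS_PER_MONTH):
    """Return the monthly cost in dollars for one always-on instance."""
    return round(hourly_rate * hours + control_plane_fee, 2)


aws_hourly = 1.01        # g5.xlarge with an A10 GPU, per the post
eks_control_plane = 73.0 # EKS control plane per month, per the post
oci_discount = 0.40      # "roughly 40% cheaper", treated as exact here

oci_hourly = aws_hourly * (1 - oci_discount)

aws_monthly = monthly_cost(aws_hourly, eks_control_plane)
oci_monthly = monthly_cost(oci_hourly)  # OKE control plane is free, per the post
savings = round(aws_monthly - oci_monthly, 2)

print(f"AWS: ${aws_monthly:.2f}/month")
print(f"OCI: ${oci_monthly:.2f}/month (estimated)")
print(f"Difference: ${savings:.2f}/month")
```

Under these assumptions the always-on AWS setup lands at about $810 a month and the OCI estimate at about $442. The free control plane widens the gap beyond the 40% difference in the hourly rate alone.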

