Docker + OKE: Running GPU Inference Containers on Oracle Cloud

DEV Community·Pavan Madduri·24 days ago
#oci #oke #gpu #docker #inference #model

I wanted to deploy an LLM inference API without spending $1,200/month on AWS GPU instances. OCI turned out to be significantly cheaper, and the Docker workflow was identical. Here's what I set up.

Why I Looked at OCI for GPU Workloads

I've been building GPU infrastructure tools for a while now (keda-gpu-scaler, otel-gpu-receiver, GPU NUMA scheduling for Volcano), and most of my testing was on AWS. The g5.xlarge instances with A10 GPUs run about $1.01/hr, plus $73/month for the EKS control plane. It adds up fast when you're iterating.

Someone on the Volcano Slack mentioned OCI's GPU pricing and I was skeptical. But when I looked it up, the numbers were real: same A10 GPU, roughly 40% cheaper, and OKE doesn't charge for the Kubernetes control plane at all. So I tried moving a vLLM inference workload over.

OCI GPU Pricing

Here's what OCI actually charges for GPU instances.…
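For a rough sense of scale before the actual price list, the AWS figures quoted earlier can be turned into a monthly estimate. This is only a back-of-the-envelope sketch: the 730-hour month, the single always-on instance, and applying the "roughly 40% cheaper" figure directly to the hourly GPU rate are all assumptions, not numbers from the post. It does not try to reproduce the $1,200/month figure, whose basis is not given here.

```python
# Back-of-the-envelope monthly cost comparison.
# Assumption: one instance running 24/7 for an average 730-hour month.
HOURS_PER_MONTH = 730


def monthly_cost(hourly_rate: float, control_plane_monthly: float = 0.0,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Return compute cost for the month plus any flat control-plane fee."""
    return hourly_rate * hours + control_plane_monthly


aws_hourly = 1.01         # g5.xlarge rate quoted in the post
eks_control_plane = 73.0  # EKS control plane fee quoted in the post
# Assumption: "roughly 40% cheaper" is applied to the hourly GPU rate.
oci_discount = 0.40

aws_monthly = monthly_cost(aws_hourly, eks_control_plane)
# OKE control plane is free per the post, so no flat fee is added.
oci_monthly = monthly_cost(aws_hourly * (1 - oci_discount))

print(f"AWS (g5.xlarge + EKS): ${aws_monthly:,.2f}/month")
print(f"OCI (assumed rate):    ${oci_monthly:,.2f}/month")
print(f"Estimated savings:     ${aws_monthly - oci_monthly:,.2f}/month")
```

Swap in the real OCI hourly rate for `aws_hourly * (1 - oci_discount)` once it is known; the control-plane term is what makes the gap wider than the hourly discount alone suggests.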
