sbt112321321

Author Profile

0 karma · 0 posts · joined 23 days ago

How I Cut My LLM Inference Costs by 40% While Handling 5x More Reques…

🌐 dev.to · Source

From Dev.to - python: How I Cut My LLM Inference Costs by 40% While Handling 5x More Reques…

#ai #tutorial #python #api #inference #openai #model #deepseek

20 days ago

How to stream reasoning tokens from an LLM in production: a practical…

🌐 dev.to · Source

From Dev.to - python: How to stream reasoning tokens from an LLM in production: a practical…

#ai #tutorial #python #api #reasoning #print #streaming #json

20 days ago

From Cold Starts to Hot Paths: How I Cut LLM Inference Latency by 40% with a Simple Routing Trick

🌐 dev.to · Source

From Dev.to - python: From Cold Starts to Hot Paths: How I Cut LLM Inference Latency by 40% with a Simple Routing Trick

#ai #tutorial #python #api #model #json #cold #session

20 days ago

Bending the Cost Curve: How I Slashed My LLM Inference Bill by 70% Wh…

🌐 dev.to · Source

From Dev.to - python: Bending the Cost Curve: How I Slashed My LLM Inference Bill by 70% Wh…

#ai #tutorial #python #api #reasoning #need #inference #model

20 days ago

How I Cut My LLM Inference Costs by 40% While Keeping the Same Performance

🌐 dev.to · Source

From Dev.to - python: How I Cut My LLM Inference Costs by 40% While Keeping the Same Performance

#ai #tutorial #python #api #time #token #latency #requests

20 days ago

Sharing a simple Python script to benchmark LLM inference latency across different providers

🌐 dev.to · Source

From Dev.to - python: Sharing a simple Python script to benchmark LLM inference latency across different providers

#ai #tutorial #python #api #time #providers #print #streaming

20 days ago

I benchmarked three LLM inference providers this week and one route surprised me

🌐 dev.to · Source

From Dev.to - python: I benchmarked three LLM inference providers this week and one route surprised me

#ai #tutorial #python #api #inference #latency #token #providers

20 days ago

I benchmarked three LLM inference providers this week and one route s…

🌐 dev.to · Source

From Dev.to - python: I benchmarked three LLM inference providers this week and one route s…

#ai #tutorial #python #api #inference #latency #token #providers

20 days ago

Been testing different inference backends lately.

🌐 dev.to · Source

From Dev.to - ai: Been testing different inference backends lately.

#ai #webdev #productivity #programming #novastack #simplest #token #marketplace

21 days ago
