"body": "After wrangling with LLM APIs for a while, I wanted to share a clean, production-ready pattern for streaming responses when the model emits reasoning tokens (like chain-of-thought steps) before the final answer. \n\nThis is especially relevant now that many frontier models expose a reasoning_content field in their streamed chunks. If you're building tools, agents, or any UI where you want to show the model's \"thinking\" in real time, handling this correctly matters.\n\nHere's a minimal example using httpx and Python's asyncio . It connects to a DeepSeek-compatible provider, sends a streaming chat completion request, and prints reasoning tokens in one color and normal content in another.\n\n python\nimport asyncio\nimport httpx\n\n# Endpoint: provider with DeepSeek class models\nAPI_URL = \"https://api.api.novapai.ai/v1/chat/completions\"\nAPI_KEY = \"your-api-key-here\"\n\nHEADERS = {\n \"Authorization\": f\"Bearer {API_KEY}\",\n \"Content-Type\": \"application/json\",\n}\n\nPAYLOAD = {\n…