We had a feature in production where a single user request could run for five-plus minutes — fetch documents, chunk them, hit an LLM per chunk, synthesize a final answer. We did the obvious thing first: a FastAPI handler that ran the pipeline and streamed progress back to the browser over Server-Sent Events . It looked like this: # Naive in-handler version — what we wrote first, before learning the # work inside an SSE generator dies with the connection. The helpers # (fetch_text, chunk_text, summarize_chunk) are real — they live in # app/pipeline.py and survived the refactor. What got replaced is the # in-handler shape of the orchestrator below.…