This is Part 5 of my series on the Microsoft Agent Framework. You can read the original post over on lukaswalter.dev.

The Token Trap in Long Chats

As we have seen in previous articles, LLMs are stateless, so we have to send the entire previous chat history with every request for the AI to retain context. With each message added to an ongoing chat, the input tokens accumulate. Even after many previous interactions, asking a simple question like “What is 1+1?” still results in the entire conversation history being sent. This brings its own problems: the context window eventually fills up, and costs rise with every turn. To address this, the framework introduces Chat Reducers.

Message Counting

The simplest form of a Chat Reducer is “Message Counting”. Here, you define a target count. The reducer keeps the most recent messages up to that count, while preserving the first system message if present.…
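
To make the message-counting rule concrete, here is a minimal, language-neutral sketch in Python. It is not the framework's API: the `Message` type and the `reduce_by_count` function are hypothetical names made up for illustration. It also assumes that the preserved system message does not count toward the target count, which may differ from what the framework actually does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical chat message type, used only for this illustration."""
    role: str  # "system", "user", or "assistant"
    text: str


def reduce_by_count(messages: list[Message], target_count: int) -> list[Message]:
    """Keep the first system message (if any) plus the newest target_count messages.

    Assumption: the system message is kept in addition to target_count.
    Any later system messages are treated like ordinary messages.
    """
    if target_count < 1:
        raise ValueError("target_count must be at least 1")

    # Find the first system message, if the history has one.
    first_system = next((m for m in messages if m.role == "system"), None)

    # Everything except that system message competes for the remaining slots.
    others = [m for m in messages if m is not first_system]
    recent = others[-target_count:]

    prefix = [first_system] if first_system is not None else []
    return prefix + recent


# Build a long history: one system prompt, five question/answer pairs,
# and finally the simple question from the text.
history = [Message("system", "You are a helpful assistant.")]
for i in range(1, 6):
    history.append(Message("user", f"question {i}"))
    history.append(Message("assistant", f"answer {i}"))
history.append(Message("user", "What is 1+1?"))

reduced = reduce_by_count(history, target_count=3)
print(len(history), "->", len(reduced))  # prints "12 -> 4"
print([m.text for m in reduced])
```

With a target count of 3, only the system prompt and the last three messages are sent, instead of all twelve. The trade-off is that anything said before those three messages is no longer visible to the model.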