AI agents are slow by nature. When a user sends a message to an AI-powered chatbot, what looks like a simple interaction often triggers a chain of events under the hood: the agent reasons about the request, identifies which tools it needs, executes those tools against backend services, waits for responses, and then synthesizes a final reply. In a well-architected system, this chain can span multiple microservices, LLM calls, and external API round-trips.

The chatbot architecture I describe in this post is built around the Model Context Protocol (MCP), a standard that allows AI agents to discover and execute backend tools dynamically. The backend is composed of four microservices, each owning a distinct domain: conversation persistence, RAG-powered recommendations, domain-specific operations, and the MCP gateway itself, which brokers all tool discovery and execution.…
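To make the chain of events concrete, here is a minimal Python sketch of a single agent turn that records how long each stage takes. Everything in it is a hypothetical stand-in rather than the actual system: `plan_tools` fakes the LLM reasoning step, `TOOLS` fakes a backend tool registry, and `synthesize` fakes the final LLM call. It does not use any real MCP client API; it only illustrates why the stages add up sequentially.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TurnTrace:
    """Records the wall-clock duration of each stage in one agent turn."""
    stages: list = field(default_factory=list)  # list of (stage_name, seconds)

    def timed(self, name, fn, *args):
        # Run fn(*args), store how long it took under the given stage name.
        start = time.perf_counter()
        result = fn(*args)
        self.stages.append((name, time.perf_counter() - start))
        return result


def plan_tools(message):
    # Hypothetical stand-in for the LLM call that decides which tools to use.
    return ["recommend"] if "recommend" in message.lower() else []


# Hypothetical tool registry; a real system would discover tools dynamically.
TOOLS = {
    "recommend": lambda message: f"3 items matching '{message}'",
}


def execute_tool(name, message):
    # Hypothetical stand-in for a round-trip to a backend service.
    return TOOLS[name](message)


def synthesize(message, tool_results):
    # Hypothetical stand-in for the final LLM call that writes the reply.
    if not tool_results:
        return "No tools were needed."
    return "; ".join(f"{name}: {result}" for name, result in tool_results.items())


def handle_turn(message):
    """Run one turn: plan, execute each tool in sequence, then synthesize."""
    trace = TurnTrace()
    tools = trace.timed("plan", plan_tools, message)
    results = {}
    for name in tools:
        results[name] = trace.timed(f"tool:{name}", execute_tool, name, message)
    reply = trace.timed("synthesize", synthesize, message, results)
    return reply, trace


if __name__ == "__main__":
    reply, trace = handle_turn("Recommend a laptop")
    print(reply)
    for stage, seconds in trace.stages:
        print(f"{stage}: {seconds * 1000:.3f} ms")
```

Because each stage waits on the one before it, the user-visible latency of a turn is roughly the sum of the recorded stage durations, which is why every extra tool call or LLM round-trip is felt directly.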