If you're running an AI agent on Amazon Bedrock and injecting persistent memory into every conversation, where you put that memory in the request matters a lot — both for how well the agent uses it and for what it costs you. I learned this the direct way while connecting agent-memory-daemon to OpenClaw running on Amazon Bedrock AgentCore Runtime. The setup works beautifully. My agent now remembers my preferences, my projects, and the weird Bedrock timeout I debugged three weeks ago. But along the way I hit a subtle interaction between memory injection and prompt caching that's worth documenting. This post walks through the architecture, the Bedrock prompt caching rule that tripped me up, and the one-line fix that cut my cache-related costs dramatically. The setup: persistent memory for a serverless agent OpenClaw lives in a container on AgentCore Runtime. AgentCore freezes the container when idle, which is great for cost (zero idle spend) but hostile to long-term memory (every wake is a blank slate).…