
How I Monitor AI Agents: CloudWatch for Infra, Arize Phoenix for Traces and OpenTelemetry, LLM-as-Judge for Quality

DEV Community·Carlos Cortez 🇵🇪 [AWS Hero]·19 days ago

AI agents are not regular software. They reason, they call tools, they make decisions, and they can fail in ways that a simple health check will never catch. The response was technically successful, but was it actually helpful? The agent called the right tool, but did it interpret the result correctly? Traditional monitoring doesn't answer these questions. That's why I built a three-layer observability stack for my AI agents, and today I'm walking you through exactly how it works.

📓 Full working notebook: All the code in this post is validated and executable in the companion Jupyter notebook, including setup, tracing, evals, and cleanup. It is also available here: https://github.com/breakingthecloud/observability-ai-agents-phoenix-otel-strands

The Problem with Monitoring AI Agents

Here's the thing: when your agent answers "I don't have weather data for Paris", is that a failure? Technically no, the agent ran fine.…
