A few months back I watched an external reviewer ask one question I could not answer. For the AI session that touched this medical device firmware on Tuesday, can you show me the inputs, the policy decisions, the outputs, and a signature that says nobody changed the record? We had beautiful dashboards. Token usage, latency, top tools, error rates. We did not have what she was asking for. That conversation forced a vocabulary change. We started saying observability and evidence as two separate words for two separate things. This article is the version of that conversation I wish someone had given me a year earlier. The short version Observability is consumed by an engineer trying to understand what the system is doing. Rich, sometimes lossy, meant for a dashboard. Sampling is fine. Cardinality limits are fine. Evidence is consumed by someone who was not there. A reviewer. A regulator. A future-you reading the file three months after the session.…