The Workflow Starts Before Extraction

Most document automation diagrams start too late. They show a file entering an extraction step, a JSON response coming back, and a downstream system receiving clean fields. That diagram skips the part where many production failures are born: intake.

The file did not appear from nowhere. It came from an upload form, email inbox, webhook, shared drive, customer portal, automation tool, or support ticket. It arrived with a tenant, a source, a filename, a sender, a case, an expected purpose, and a set of assumptions. If those assumptions are not captured at the boundary, every later step has to guess.

Extraction then becomes responsible for too much. It has to infer document type, decide whether the file belongs to the current workflow, handle duplicates, explain wrong uploads, group attachments, pick schemas, and route review. That is not a content-processing problem. It is an intake contract problem.

Designing intake as a contract sounds bureaucratic. It is the opposite.…
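To make "captured at the boundary" concrete, here is a minimal sketch of an intake record. It is only an illustration: the names `IntakeRecord`, `Source`, `IntakeError`, and `accept`, and the exact field names, are hypothetical and not taken from any particular product. The fields mirror the metadata listed above (tenant, source, filename, sender, case, expected purpose), plus a content hash that a later step could use to spot duplicates.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Hypothetical list of intake channels, taken from the prose above."""
    UPLOAD_FORM = "upload_form"
    EMAIL = "email"
    WEBHOOK = "webhook"
    SHARED_DRIVE = "shared_drive"
    CUSTOMER_PORTAL = "customer_portal"
    AUTOMATION_TOOL = "automation_tool"
    SUPPORT_TICKET = "support_ticket"


class IntakeError(ValueError):
    """Raised when a file arrives without the context later steps need."""


@dataclass(frozen=True)
class IntakeRecord:
    """What is known about a file at the boundary (illustrative fields)."""
    tenant_id: str
    source: Source
    filename: str
    sender: str
    case_id: str
    expected_purpose: str
    content_sha256: str  # stable fingerprint, usable for duplicate checks


def accept(
    content: bytes,
    *,
    tenant_id: str,
    source: str,
    filename: str,
    sender: str,
    case_id: str,
    expected_purpose: str,
) -> IntakeRecord:
    """Validate intake metadata and return an immutable record.

    Rejects the file at the boundary instead of letting extraction guess.
    """
    required = {
        "tenant_id": tenant_id,
        "filename": filename,
        "sender": sender,
        "case_id": case_id,
        "expected_purpose": expected_purpose,
    }
    missing = sorted(name for name, value in required.items() if not value.strip())
    if missing:
        raise IntakeError(f"missing intake fields: {', '.join(missing)}")
    if not content:
        raise IntakeError("empty file")
    try:
        channel = Source(source)
    except ValueError:
        raise IntakeError(f"unknown source: {source!r}") from None

    return IntakeRecord(
        tenant_id=tenant_id,
        source=channel,
        filename=filename,
        sender=sender,
        case_id=case_id,
        expected_purpose=expected_purpose,
        content_sha256=hashlib.sha256(content).hexdigest(),
    )
```

In this sketch, a call such as `accept(data, tenant_id="acme", source="email", filename="invoice.pdf", sender="ap@example.com", case_id="case-1", expected_purpose="invoice")` either returns a complete record or fails immediately with a message naming what was missing. A failure at this point names the missing field, which is easier to act on than a wrong field value found after extraction.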