situation: You work in the operations team of a medium-sized company. Every day, your team processes order forms from different B2B customers. All of them arrive as PDFs, and in theory they all contain the same information: customer ID, purchase order number, delivery date, and the ordered items.

In practice, however, every document looks slightly different. One customer places the purchase order number in the top-left corner, the next one in the bottom-right corner. Some write “PO Number”, others use “Order ID”, “Order Reference”, or something completely different.

For us humans, this is usually not a problem. We look at the document, understand the context, and immediately recognize which information is meant. For traditional automation systems, however, it is difficult: a regex rule can search specifically for “PO Number: ”, but what happens if the next customer uses “Order Reference: ” instead? The rule finds nothing, even though the information is clearly on the page. That is exactly the problem I recreated for this article.…
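To make the brittleness concrete, here is a minimal sketch of such a regex rule in Python. The function name `extract_po_number` and both sample texts are made up for illustration, and the sketch assumes the PDF text has already been extracted into a plain string.

```python
import re
from typing import Optional

# Hypothetical rule that knows exactly one label spelling.
PO_PATTERN = re.compile(r"PO Number:\s*(\S+)")


def extract_po_number(text: str) -> Optional[str]:
    """Return the value after 'PO Number:', or None if the label is absent."""
    match = PO_PATTERN.search(text)
    return match.group(1) if match else None


# Made-up order texts from two customers (illustrative data only).
doc_a = "Customer ID: C-1001\nPO Number: PO-48213\nDelivery date: 2024-05-01"
doc_b = "Order Reference: PO-77102\nCustomer ID: C-2040\nDelivery date: 2024-05-03"

print(extract_po_number(doc_a))  # PO-48213
print(extract_po_number(doc_b))  # None
```

The second document contains the same kind of information, but the rule misses it because the label differs. Adding more label variants to the pattern helps only until the next customer introduces a new one.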