We rewrote the decoder four times in one day. Only the last one understood anything.

Part 7 ended with "how are you" returning "i don't know" while our tier tests reported 100% pass. Everything was green. The model was broken. The disconnect between those two facts defined the day.

Here's the actual arc.

Wrong Turn 1: Retrieval

The first attempt was retrieval. We built five decoder candidates, sandbox-tested them against 400 dialogue pairs, and a retrieval-based decoder won cleanly. F1 of 0.246 against the next-best 0.024. Four out of five break tests passed. It was 1,300x faster than the teacher. We wrote a "winner" memory and committed the code.

Josh looked at it and said: retrieval is scripting. Origin isn't supposed to look up pre-written answers. It's supposed to generate them from understood concepts.

He was right. Retrieval wins F1 against memorized responses because retrieval is memorization - it just renames the table. A query comes in, find the closest stored response, return it.…
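To make the "it just renames the table" point concrete, here is a minimal sketch of that kind of lookup. It is not the decoder we committed. The names `RetrievalDecoder` and `token_f1` are hypothetical, and both the similarity measure (token overlap) and the F1 definition (token-level) are assumptions made for illustration.

```python
from collections import Counter


def tokens(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def token_f1(pred: str, gold: str) -> float:
    """Token-level F1 between two strings (assumed metric, not the original harness)."""
    p, g = Counter(tokens(pred)), Counter(tokens(gold))
    overlap = sum((p & g).values())  # multiset intersection of tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(p.values())
    recall = overlap / sum(g.values())
    return 2 * precision * recall / (precision + recall)


class RetrievalDecoder:
    """Hypothetical retrieval decoder: a table of (query, response) pairs plus a lookup."""

    def __init__(self, pairs):
        self.pairs = list(pairs)

    def respond(self, query: str) -> str:
        # Find the stored query closest to the incoming one, return its response verbatim.
        _, response = max(self.pairs, key=lambda pair: token_f1(query, pair[0]))
        return response


pairs = [
    ("how are you", "i am fine"),
    ("what is your name", "i am origin"),
]
decoder = RetrievalDecoder(pairs)

# Scored against the same pairs it stores, the lookup is perfect by construction.
scores = [token_f1(decoder.respond(q), a) for q, a in pairs]
```

Nothing in `respond` generates anything. Every output is a string that was already in the table, which is why scoring it against memorized responses flatters it.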