TRiP: 15,000 lines of C implementing a complete transformer AI engine from scratch [Project]

I'm a firmware engineer (17 years in embedded systems). Over 18 months (ending August 2025), working in lunch breaks and on weekend nights, I built a complete transformer engine in C: inference, training with full backpropagation, a tokenizer (plus vocabulary builder!), chat, and vision. No ML frameworks and no Python: just C, plus libjpeg and X11 for the vision part.

Things of interest:

- bf16/f16/f32 mixed precision with manual casting
- mmap-based weight loading for running large models on limited RAM
- the whole thing compiles with a 10-line Makefile: gcc, -Ofast, -fopenmp

It loads and runs real models (Gemma, Llama 2, GPT-2, PaliGemma) from standard HuggingFace checkpoints in the SafeTensors format.

The purpose is purely educational. I built it to understand transformers at the lowest level, and I structured the code to be readable: every math operation has its forward and backward implementation side by side.…
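
For anyone wondering what "manual casting" means for bf16 in plain C, here is a minimal sketch. It is not taken from the TRiP sources: the names `bf16_t`, `bf16_to_f32`, `f32_to_bf16`, and `bf16_to_f32_array` are hypothetical, and round-to-nearest-even is an assumed rounding mode. The only fact it relies on is that a bf16 value is the upper 16 bits of an IEEE-754 32-bit float.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type name: the raw 16-bit pattern of a bf16 value. */
typedef uint16_t bf16_t;

/* bf16 -> f32: place the 16 bits in the upper half of a 32-bit float. */
static inline float bf16_to_f32(bf16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);   /* bit copy, avoids aliasing problems */
    return f;
}

/* f32 -> bf16 with round-to-nearest-even (assumed rounding mode). */
static inline bf16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: keep the sign and exponent, force a non-zero mantissa. */
        return (bf16_t)((bits >> 16) | 0x0040u);
    }
    /* Add 0x7fff, plus 1 when the kept low bit is odd, so ties go to even. */
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* Widen a bf16 buffer (for example one tensor's weights) to f32. */
static inline void bf16_to_f32_array(const bf16_t *src, float *dst, size_t n) {
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = bf16_to_f32(src[i]);
    }
}
```

In a mixed-precision setup, helpers like these would typically sit at the boundary between storage and compute: weights stay in bf16 in the file or mapping, and get widened to f32 just before the arithmetic. Going through `memcpy` rather than a pointer cast keeps the bit reinterpretation well defined in standard C, and compilers normally reduce it to a plain register move.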