Two months ago I started a portfolio project: build three small specialized language models for healthcare practice intake, benchmark each one honestly against frontier APIs, and write about what I learned. The goal was to build the case that small specialized models still have a place in the 2026 toolkit alongside frontier LLMs, not as replacements but as the first stage of a hybrid pipeline.

This is the post about the third model. It's also the post about the suite: what worked across all three, what didn't, and the pattern that emerged.

The three models, all on Hugging Face:

clarioscope-intent-deberta-v1: 184M DeBERTa-v3-base, 7-class intent classification. Within 4 pp of Claude Haiku 4.5, 22× faster on CPU. methodology post →

clarioscope-phi-deberta-v1: 125M RoBERTa-base, 18-category PHI span detection (HIPAA Safe Harbor). Loses on aggregate but triples frontier F1 on geographic locations. methodology post →

clarioscope-insurance-v1: 125M RoBERTa-base, 12-field insurance / billing extraction.…