The pipeline hit 332 tracked releases last week. I thought that was a milestone worth celebrating until I looked at the dedup stats. Turns out 23 of those "distinct" entries were duplicates of releases already in the list, just named differently across sources. "Llama-3.1-8B-Instruct", "Meta-Llama-3.1-8B-Instruct", and "llama3.1:8b" all refer to the exact same thing. My naive string-matching dedup had been silently failing for months.

The way I found out: I was hand-checking a batch and noticed three entries in the feed that were clearly the same release. Dug into the DB. Found 23 collision clusters. The worst one had 7 variants of the same model across different sources.

The fix wasn't complicated: compare normalized forms. Slug the model name, strip vendor prefixes, and lowercase everything before comparing. Took about 90 minutes to implement and run a migration.

But here's the part that actually stung: I had been using "332 releases tracked" as a public number. Now it's 309 once you deduplicate properly.…
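
For anyone wanting to try the same kind of fix, here is a minimal sketch of normalized-form comparison in Python. It is not the pipeline's actual code: the `VENDOR_PREFIXES` list and the function names `normalize_name` and `collision_clusters` are hypothetical, made up for illustration, and the real prefix list would depend on which sources you track.

```python
import re
from collections import defaultdict

# Hypothetical vendor prefixes (an assumption, not the pipeline's real list).
VENDOR_PREFIXES = ("meta-", "mistralai-", "google-")


def normalize_name(name: str) -> str:
    """Lowercase the name, slug it, then strip one leading vendor prefix."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    for prefix in VENDOR_PREFIXES:
        if slug.startswith(prefix):
            slug = slug[len(prefix):]
            break
    return slug


def collision_clusters(names):
    """Group raw names by normalized key; keep keys with 2+ distinct raw names."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_name(name)].add(name)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


if __name__ == "__main__":
    feed = [
        "Llama-3.1-8B-Instruct",
        "Meta-Llama-3.1-8B-Instruct",
        "llama3.1:8b",
    ]
    for key, variants in collision_clusters(feed).items():
        print(key, variants)
```

With these rules, "Llama-3.1-8B-Instruct" and "Meta-Llama-3.1-8B-Instruct" collapse to the same key. The tag-style "llama3.1:8b" does not, because it has no hyphen after "llama" and no "instruct" suffix. Catching that kind of variant would likely need an explicit alias table or extra rules on top of this sketch.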