Ask ChatGPT to estimate the carbs in your lunch. Now ask it again. And again. Five hundred times.

You’d expect the same answer each time. It’s the same photo, the same model, the same question. But you won’t get the same answer. Not even close, and the differences are large enough to cause a hypoglycaemic emergency.

That’s the central finding of a study I’ve just published as a preprint, and it has direct implications for anyone using AI-powered carb counting in a diabetes app.

The study

I submitted 13 food photographs (real meals, photographed on a phone, the way you’d actually use them) to four leading AI models: OpenAI GPT-5.4, Anthropic Claude Sonnet 4.6, Google Gemini 2.5 Pro and Google Gemini 3.1 Pro Preview. Each photo was sent over 500 times to each model. Same prompt every time. Same photo. Same settings. 26,904 queries in total. All at the lowest randomness setting these models offer.…
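
For readers who want to see the shape of a repeated-query experiment like this, here is a minimal Python sketch. It is not the study's actual code. The names `repeat_queries`, `summarise`, `planned_query_count` and `make_placeholder_model` are all hypothetical, and the placeholder "model" simply cycles through made-up numbers so the example runs without any API. In a real run you would swap in a function that sends the photo to a model and returns its carb estimate in grams.

```python
import itertools
import statistics
from typing import Callable, Dict, List


def repeat_queries(estimate_carbs: Callable[[str], float],
                   photo: str, n_repeats: int) -> List[float]:
    """Ask for a carb estimate (grams) for the same photo n_repeats times."""
    return [estimate_carbs(photo) for _ in range(n_repeats)]


def summarise(estimates: List[float]) -> Dict[str, float]:
    """Describe how much the repeated estimates disagree with each other."""
    return {
        "n": len(estimates),
        "min": min(estimates),
        "max": max(estimates),
        "range": max(estimates) - min(estimates),
        "median": statistics.median(estimates),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(estimates) if len(estimates) > 1 else 0.0,
    }


def planned_query_count(n_photos: int, n_models: int, n_repeats: int) -> int:
    """Total queries if every photo goes to every model n_repeats times."""
    return n_photos * n_models * n_repeats


def make_placeholder_model() -> Callable[[str], float]:
    """Stand-in for a real model call. The values are invented for
    illustration and are NOT results from the study."""
    made_up_answers = itertools.cycle([40.0, 55.0, 48.0])
    return lambda photo: next(made_up_answers)


if __name__ == "__main__":
    model = make_placeholder_model()
    estimates = repeat_queries(model, "lunch.jpg", n_repeats=6)
    print(summarise(estimates))
    # 13 photos x 4 models x 500 repeats is the floor for the design above.
    print(planned_query_count(13, 4, 500))
```

With 13 photos, four models and 500 repeats, the floor is 26,000 queries, which fits the 26,904 total given that each photo was sent "over 500 times". The summary deliberately reports the range alongside the median, because for insulin dosing the gap between the lowest and highest answer matters more than the average.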