Apple's On-Device Model is Terrible for Chat But Surprisingly Good at Structured Output and Tool Calling

DEV Community·Fernando Rodriguez·about 1 month ago

#swift #swiftui #appleintelligence #model #apple #tool

I've spent weeks stress-testing Apple's on-device model, the ~3B parameter one that runs on the Neural Engine of any Apple Silicon Mac. To test it thoroughly, I built Think Local, a macOS app that exercises every capability of the model: chat, image generation, structured output, tool calling, and parameter comparison.

My conclusion: As a chatbot, the model is terrible. As a structured output and tool calling engine, it's surprisingly good. This distinction matters because it completely changes what you should use this model for.

Chat is disappointing, and that's fine

Apple's model has a 4,096-token context window. To put this in perspective: Claude has 1M tokens and GPT-4o has 128K. With Apple, add a 200-token system prompt, a 150-token schema, and three conversation turns, and you're already at 70% capacity. Free-form text quality isn't impressive either.…
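To make the context-window arithmetic concrete, here is a rough back-of-the-envelope sketch in plain Swift. It uses only the figures quoted above (4,096-token window, 200-token system prompt, 150-token schema, three turns, 70% usage). The per-turn size is not stated in the article; it is derived here from those figures, and treating a "turn" as one prompt plus one reply is an assumption. The variable names are illustrative and not part of any Apple API.

```swift
// Figures quoted in the article.
let contextWindow = 4_096          // tokens available to the on-device model
let systemPromptTokens = 200
let schemaTokens = 150
let turns = 3
let quotedUsage = 0.70             // "already at 70% capacity"

// Derived values (assumption: these are implied by the figures above,
// not numbers the article states directly).
let usedTokens = Int((Double(contextWindow) * quotedUsage).rounded())
let fixedOverhead = systemPromptTokens + schemaTokens
let tokensForTurns = usedTokens - fixedOverhead
let tokensPerTurn = tokensForTurns / turns      // integer division, rounds down
let remainingTokens = contextWindow - usedTokens

print("Used: \(usedTokens) of \(contextWindow) tokens")
print("Fixed overhead (system prompt + schema): \(fixedOverhead) tokens")
print("Implied size per turn: about \(tokensPerTurn) tokens")
print("Left for everything else: \(remainingTokens) tokens")
```

Under these assumptions, the system prompt and schema are a small slice of the budget; the conversation turns themselves consume most of it, which is why a multi-turn chat runs out of room so quickly.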
