When I started building Clicky, a Windows AI assistant that reads your screen and answers out loud, I had to make a fundamental choice: cloud AI or local AI?

I chose local (Ollama). Here's exactly why, and what I learned the hard way.

The problem with cloud AI for a screen assistant

Clicky's core loop is:

1. Take a screenshot of your screen
2. Record your voice question
3. Send both to an LLM
4. Speak the answer back

Step 3 is where cloud AI becomes a problem. You're sending a screenshot of the user's screen to a remote server every single time they press the hotkey.

Think about what's on your screen:

- Passwords in password managers
- Emails with sensitive info
- Code with API keys
- Personal documents

I wasn't comfortable sending that to OpenAI's servers (or any server) by default. And I couldn't expect users to be comfortable with it either.

What Ollama actually gives you

Ollama lets you run LLMs locally. Pull a model and it runs on your GPU/CPU; prompts and responses never leave your machine.…
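
To make step 3 of the loop concrete, here is a minimal sketch of sending a screenshot and a question to a local Ollama server over its HTTP API. This is not Clicky's actual code. It assumes Ollama is listening on its default local port, that the voice question has already been transcribed to text by some speech-to-text step, and that a vision-capable model is pulled (`llava` here is only a placeholder). The function names are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(question: str, screenshot_png: bytes, model: str = "llava") -> dict:
    """Bundle the (already transcribed) question and a screenshot for Ollama."""
    return {
        "model": model,  # placeholder; use any vision-capable model you have pulled
        "prompt": question,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(screenshot_png).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a token stream
    }


def ask_local_model(payload: dict, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """POST the payload to the local Ollama server and return the answer text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With the screenshot bytes in hand, a call would look like `ask_local_model(build_payload("What does this error mean?", png_bytes))`. The only network hop is to `localhost`, which is the whole point: the screenshot is never sent to a third-party server.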