
I built a voice agent for Android in a weekend. Here's what actually worked

DEV Community·22Gstudios·29 days ago

Yesterday I posted this on X: https://x.com/22Gstudios/status/2051377769414791582

LetItDo is a voice agent for Android that actually finishes tasks. Solo, two and a half days, built on top of an existing Auto.js fork called AutoX, with Charm's Crush as the agent runtime. The architecture ended up being closer to what production agents like Perplexity Comet use than I expected, and the bugs that bit me were not the ones I planned for.

I wanted my phone to do the boring stuff. Send a WhatsApp message to a contact. Open Spotify and play a song. Scroll Instagram and like a few posts. Stuff Siri and Google Assistant pretend to do but don't actually finish.

Here's the honest writeup.

Why this could even run on a phone

Most agent runtimes are Python. Python is hostile to Android: there's no good way to ship the interpreter in your APK without dragging in 50MB+ of CPython and fighting NDK quirks. I needed the agent to run on the user's phone, not on some server they'd have to host. Charm's Crush is written in Go.…
