BitForge: Run LLMs on Microcontrollers

DEV Community·Aman Sachan·about 1 month ago

#llm #esp32 #iot #python #quantization #tokens

I got GPT-2 running on an Arduino! Here's the quantization pipeline.

Process:

4-bit Q4_K_M quantization via llama.cpp
Weights stored in memory-mapped flash
Optimized matrix-vector multiply (matvec) for ARM Cortex-M
Quantized KV cache
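
To make the quantize-then-matvec idea concrete, here is a minimal sketch in plain Python. It is not the BitForge code and not llama.cpp's actual Q4_K_M layout (which uses super-blocks with their own quantized scales). It assumes a simpler scheme: one float scale per block of 32 weights and two 4-bit codes packed per byte. The names `BLOCK`, `quantize_q4`, `dot_q4`, and `matvec_q4` are hypothetical.

```python
BLOCK = 32  # weights per block (assumed for illustration)


def quantize_q4(values):
    """Quantize a row of floats into blocks of (scale, packed_bytes, count).

    Each weight becomes a signed 4-bit code in [-8, 7], stored with an
    offset of 8 so it fits in an unsigned nibble. Two nibbles share a byte.
    """
    blocks = []
    for start in range(0, len(values), BLOCK):
        chunk = values[start:start + BLOCK]
        amax = max(abs(v) for v in chunk)
        scale = amax / 7.0 if amax > 0 else 1.0
        codes = [max(-8, min(7, round(v / scale))) + 8 for v in chunk]
        if len(codes) % 2:
            codes.append(8)  # pad with a zero-valued code
        packed = bytes(
            codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2)
        )
        blocks.append((scale, packed, len(chunk)))
    return blocks


def dot_q4(blocks, x):
    """Dot product of one quantized row with a float vector x.

    Codes are unpacked on the fly, so the full-precision row is never
    materialized. The scale is applied once per block, not per weight.
    """
    total = 0.0
    base = 0
    for scale, packed, count in blocks:
        acc = 0.0
        for j in range(count):
            byte = packed[j // 2]
            code = (byte >> 4) if j % 2 else (byte & 0x0F)
            acc += (code - 8) * x[base + j]
        total += scale * acc
        base += count
    return total


def matvec_q4(rows, x):
    """Multiply a quantized matrix (list of quantized rows) by x."""
    return [dot_q4(row, x) for row in rows]
```

Usage: quantize each weight row once with `quantize_q4`, then call `matvec_q4(rows, x)` per token. On a real microcontroller the packed bytes would be read straight from flash, and the inner loop would be written in C with integer arithmetic.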

Results:

Arduino Nano 33 BLE: 3 tokens/sec
ESP32-S3: 15 tokens/sec
Raspberry Pi Pico: 8 tokens/sec

Code: github.com/AmSach/bitforge

Hardware requirements: 512 KB RAM, 2 MB flash.
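
A quick way to sanity-check a model against those limits is to estimate the weight and KV cache footprint. This is a rough sketch, not taken from the BitForge repo. The post does not give the model dimensions, so the layer count, context length, and width in the usage note are placeholders, and the 4.5 bits per weight is an assumed average that includes per-block scales. `max_params` and `kv_cache_bytes` are hypothetical helper names.

```python
def max_params(flash_bytes, bits_per_weight):
    """Largest parameter count whose quantized weights fit in flash."""
    return int(flash_bytes * 8 // bits_per_weight)


def kv_cache_bytes(n_layers, n_ctx, d_model, bits_per_value):
    """Bytes for keys and values: 2 tensors per layer, n_ctx x d_model each."""
    return 2 * n_layers * n_ctx * d_model * bits_per_value // 8


FLASH = 2 * 1024 * 1024  # 2 MB flash, from the stated requirements
RAM = 512 * 1024         # 512 KB RAM, from the stated requirements
```

For example, `max_params(FLASH, 4.5)` gives the parameter budget for the whole flash, and `kv_cache_bytes(4, 128, 128, 8)` gives the cache size for a placeholder 4-layer, 128-wide model with a 128-token context and an 8-bit cache. Comparing that second number with `RAM` shows how much is left for activations and the stack.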