I Compressed GPT-2 to Run on an Arduino

DEV Community·Aman Sachan·about 1 month ago

#llm #embedded #tinyml #bitforge #arduino #quantization


The Impossible Problem

GPT-2 Small: 124M parameters = ~500MB as 32-bit floats (4 bytes per parameter)

Arduino Uno: 2KB RAM, 32KB Flash

Gap: ~250,000x (model size vs. the Uno's 2KB of RAM)
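
The numbers above can be checked with a few lines of arithmetic. This is a minimal sketch: the 4 bytes per parameter is an assumption (32-bit float weights), and the constant names are made up for illustration.

```python
# Back-of-the-envelope check of the size gap.
# Assumption: weights are stored as 32-bit floats (4 bytes each).
PARAMS = 124_000_000        # GPT-2 Small parameter count, from the post
BYTES_PER_PARAM = 4         # assumed fp32 storage
UNO_RAM_BYTES = 2 * 1024    # 2KB RAM
UNO_FLASH_BYTES = 32 * 1024 # 32KB Flash

model_bytes = PARAMS * BYTES_PER_PARAM
ram_gap = model_bytes / UNO_RAM_BYTES
flash_gap = model_bytes / UNO_FLASH_BYTES

print(f"model size: {model_bytes / 1e6:.0f} MB")
print(f"gap vs RAM:   {ram_gap:,.0f}x")
print(f"gap vs Flash: {flash_gap:,.0f}x")
```

Measured against RAM the gap lands near the ~250,000x quoted above. Measured against Flash it is smaller, but still in the tens of thousands.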

The Solution

I built BitForge, an aggressive LLM quantization tool for microcontrollers.

What It Does

1-bit to 8-bit quantization
Adaptive per-layer bit width
Pure C99 output
No dependencies
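
To make the list above concrete, here is a minimal sketch of what low-bit weight quantization with C output can look like. It is not BitForge's actual code: the symmetric per-tensor scheme, the sign-only 1-bit case, and the names quantize and to_c_array are all assumptions chosen for illustration.

```python
def quantize(weights, bits):
    """Map floats to signed integer codes using one scale per tensor.

    Hypothetical helper, not part of BitForge's API.
    """
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    if bits == 1:
        # 1-bit case: keep only the sign, scale by the mean magnitude.
        scale = sum(abs(w) for w in weights) / len(weights)
        return [1 if w >= 0 else -1 for w in weights], scale
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def to_c_array(name, codes, scale):
    """Emit the codes as a C99 array (hypothetical output layout).

    Stores one code per int8_t; a real tool would likely bit-pack them.
    """
    body = ", ".join(str(c) for c in codes)
    return (
        "#include <stdint.h>\n"
        f"static const float {name}_scale = {scale!r}f;\n"
        f"static const int8_t {name}[{len(codes)}] = {{{body}}};\n"
    )


codes, scale = quantize([1.0, -1.0, 0.0, 0.4], bits=4)
print(to_c_array("layer0", codes, scale))
```

An adaptive per-layer scheme would call quantize with a different bits value for each layer, for example more bits for layers whose outputs degrade most when quantized. How BitForge picks those widths is not described here.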

Results

8x compression achieved
99.3% correlation preserved
Tested on ESP32, Arduino, and STM32 targets
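
The post does not say what the correlation figure is measured between. One plausible reading is the Pearson correlation between the original weights and their quantized-then-restored values, and the sketch below shows how that could be computed. Everything here is an assumption for illustration: the random stand-in weights, the symmetric quantizer, the 32-bit baseline, and all function names.

```python
import math
import random


def quantize_dequantize(weights, bits):
    """Round-trip floats through signed `bits`-bit codes (bits >= 2)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def compression_ratio(bits, baseline_bits=32):
    """Size reduction relative to an assumed 32-bit float baseline."""
    return baseline_bits / bits


# Stand-in data: random values, not real GPT-2 weights.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(1000)]

r4 = pearson(weights, quantize_dequantize(weights, 4))
r8 = pearson(weights, quantize_dequantize(weights, 8))
print(f"4-bit: {compression_ratio(4):.0f}x smaller, correlation {r4:.4f}")
print(f"8-bit: {compression_ratio(8):.0f}x smaller, correlation {r8:.4f}")
```

Under these assumptions, 4-bit weights give exactly 8x compression against a 32-bit baseline, which matches the first figure in the list above.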

Try It

pip install bitforge
bitforge compress gpt2 --target esp32-s3 --bits 4

GitHub: https://github.com/AmSach/bitforge