I Compressed GPT-2 to Run on an Arduino

DEV Community·Aman Sachan·about 1 month ago

The Impossible Problem

GPT-2 Small: 124M parameters = ~500MB at FP32 (4 bytes per parameter)

Arduino Uno: 2KB RAM, 32KB Flash

Gap: ~250,000x (model size vs. RAM)
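
The numbers above can be checked with a few lines of arithmetic. This is a rough sketch that assumes FP32 storage (4 bytes per parameter) and binary kilobytes for the Uno; the variable names are illustrative only.

```python
# Back-of-the-envelope sizes. FP32 storage is an assumption,
# not something the post states explicitly.
PARAMS = 124_000_000          # GPT-2 Small parameter count
BYTES_PER_PARAM = 4           # FP32 (assumed)
UNO_RAM = 2 * 1024            # 2KB SRAM
UNO_FLASH = 32 * 1024         # 32KB flash

model_bytes = PARAMS * BYTES_PER_PARAM
ram_gap = model_bytes / UNO_RAM
flash_gap = model_bytes / UNO_FLASH

print(f"model size:   {model_bytes / 1e6:.0f} MB")
print(f"gap vs RAM:   {ram_gap:,.0f}x")
print(f"gap vs flash: {flash_gap:,.0f}x")
```

The ~250,000x figure is the gap against RAM. Measured against flash, where read-only weights would normally live, the gap is smaller but still in the tens of thousands.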

The Solution

I built BitForge, an aggressive LLM quantization tool for microcontrollers.

What It Does

  • 1-bit to 8-bit quantization
  • Adaptive per-layer bit width
  • Pure C99 output
  • No dependencies
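
To make the first two bullets concrete, here is a minimal sketch of symmetric uniform quantization with a per-layer bit-width search. It is not BitForge's code or API: `quantize`, `dequantize`, `relative_error`, and `pick_bits` are hypothetical names, the 1-bit scheme (sign times mean magnitude) is an assumption, and the error threshold is arbitrary.

```python
import math

def quantize(weights, bits):
    """Map floats to signed integer codes plus one scale per tensor."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    if bits == 1:
        # Assumed 1-bit scheme: keep only the sign, scale by mean magnitude.
        scale = sum(abs(w) for w in weights) / len(weights) or 1.0
        return [1 if w >= 0 else -1 for w in weights], scale
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]

def relative_error(weights, restored):
    """Root-mean-square error relative to the weights' own magnitude."""
    num = sum((w - r) ** 2 for w, r in zip(weights, restored))
    den = sum(w * w for w in weights)
    return math.sqrt(num / den) if den else 0.0

def pick_bits(weights, tolerance):
    """Smallest bit width whose round-trip error fits the tolerance."""
    for bits in range(1, 9):
        codes, scale = quantize(weights, bits)
        if relative_error(weights, dequantize(codes, scale)) <= tolerance:
            return bits
    return 8  # fall back to the widest supported width

# Toy "layers" standing in for real weight tensors.
layers = {
    "layer_a": [0.5, -1.0, 0.25, 1.0],
    "layer_b": [0.9, -0.9, 0.9, -0.9],
}
plan = {name: pick_bits(w, tolerance=0.05) for name, w in layers.items()}
print(plan)
```

Layers whose weights sit close to a few levels can get away with fewer bits, which is the idea behind choosing the width per layer rather than globally.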

Results

  • 8x compression achieved
  • 99.3% correlation preserved
  • Tested on ESP32, Arduino, STM32 targets
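
The post does not say how these two figures were measured. One plausible reading, sketched below, is that the compression ratio compares bits per weight and the correlation is Pearson correlation between original and dequantized weights. Both readings are assumptions, and `pearson` and `compression_ratio` are hypothetical helper names.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def compression_ratio(source_bits, target_bits):
    """Ratio of bits per weight, ignoring per-tensor scale overhead."""
    return source_bits / target_bits

# Toy weights, round-tripped through 4-bit symmetric quantization.
original = [0.5, -1.0, 0.25, 1.0, -0.3, 0.05]
qmax = 7                                   # 2**(4 - 1) - 1
scale = max(abs(w) for w in original) / qmax
restored = [max(-qmax, min(qmax, round(w / scale))) * scale for w in original]

corr = pearson(original, restored)
print(f"compression: {compression_ratio(32, 4):.0f}x")
print(f"correlation: {corr:.4f}")
```

Under this reading, 8x is what FP32 to 4-bit gives before accounting for scales and any other metadata stored alongside the weights.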

Try It

pip install bitforge
bitforge compress gpt2 --target esp32-s3 --bits 4


GitHub: https://github.com/AmSach/bitforge
