
#Turbo4

1 post

TurboQuant on a MacBook Pro, part 2: perplexity, KL divergence, and asymmetric K/V on M5 Max

DEV Community·Christopher Maher·about 1 month ago
#ZPT3PBRz
#ai #llm #kubernetes #opensource #q8_0 #turbo4

Follow-up to the M5 Max long-context post. Commenters asked for perplexity, KL divergence, asymmetric K/V combinations, and a 64K data point, and an overnight benchmark run delivered all four. q8_0 KV is essentially free at 4K context (KL divergence 0.0016, top-1 token agreement 98.6%).…
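For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how a mean KL divergence and a top-1 token agreement rate could be computed from per-position logits. It is not the post's benchmark code: the function names, the toy logits, and the choice of KL direction (baseline relative to quantized) are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against log(0). Direction is an assumption."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def compare_runs(baseline_logits, quantized_logits):
    """Return (mean KL, top-1 agreement) over aligned token positions.

    Each argument is a list of per-position logit vectors, one from the
    unquantized KV cache run and one from the quantized run (hypothetical inputs).
    """
    kls = []
    matches = 0
    for base, quant in zip(baseline_logits, quantized_logits):
        kls.append(kl_divergence(softmax(base), softmax(quant)))
        if argmax(base) == argmax(quant):
            matches += 1
    n = len(kls)
    return sum(kls) / n, matches / n

if __name__ == "__main__":
    # Toy logits over a 3-token vocabulary at 2 positions (made-up values).
    baseline = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
    quantized = [[2.0, 1.0, 0.1], [2.6, 2.5, 0.3]]
    mean_kl, agreement = compare_runs(baseline, quantized)
    print(f"mean KL: {mean_kl:.4f}, top-1 agreement: {agreement:.1%}")
```

A KL divergence near zero together with a high agreement rate means the quantized run's next-token distribution is nearly indistinguishable from the baseline, which is the sense in which a configuration can be called "essentially free".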
