Follow-up to the M5 Max long-context post. Commenters asked for perplexity, KL divergence, asymmetric K/V combos, and a 64K data point; an overnight bench run delivered all four. q8_0 KV is essentially free at 4K context (KL 0.0016, top-1 token agreement 98.6%).…
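
For anyone unsure what those two numbers measure, here is a minimal sketch of how mean KL divergence and top-1 agreement can be computed from two sets of next-token logits. It assumes the comparison is between a baseline run and a quantized-KV run over the same token positions, with KL taken per position and then averaged. The function names and the toy logits are hypothetical and are not taken from the actual bench.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against log(0) for near-zero probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def argmax(values):
    """Index of the largest element (the top-1 token id)."""
    return max(range(len(values)), key=values.__getitem__)

def compare_runs(base_logits, quant_logits):
    """Return (mean KL, top-1 agreement) across positions.

    base_logits / quant_logits: one list of vocab logits per token position,
    from the baseline run and the quantized-KV run (assumed pairing).
    """
    kls, matches = [], 0
    for b, qn in zip(base_logits, quant_logits):
        kls.append(kl_divergence(softmax(b), softmax(qn)))
        if argmax(b) == argmax(qn):
            matches += 1
    n = len(kls)
    return sum(kls) / n, matches / n

# Toy data: 3 positions, vocab of 3 tokens (made up for illustration only).
base = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.0, 1.1, 3.0]]
quant = [[2.1, 0.9, 0.1], [0.4, 2.6, 0.3], [1.2, 1.0, 2.9]]

mean_kl, agreement = compare_runs(base, quant)
print(f"mean KL: {mean_kl:.6f}  top-1 agreement: {agreement:.1%}")
```

A KL near zero means the quantized cache barely shifts the output distribution, while top-1 agreement only checks whether the single most likely token stays the same, so the two can diverge when the top candidates are close.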