NVIDIA's NeMo RL speculative decoding achieves a 1.8× rollout generation speedup on 8B models. The technique projects a 2.5× end-to-end speedup at 235B parameters, cutting RL training wall-clock time by over half.

Key facts
- 1.8× rollout generation speedup at 8B parameters
- Projected 2.5× end-to-end speedup at 235B
- Reduces RL training wall-clock time by over half
- Validated on internal benchmarks by NVIDIA
- Part of the NeMo open-source framework

NVIDIA published research showing that speculative decoding applied to reinforcement learning (RL) training in NeMo yields significant wall-clock speedups. The key result: 1.8× faster rollout generation on 8B-parameter models, with a projected 2.5× end-to-end speedup at 235B parameters [According to the source].…
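
As a quick arithmetic check on these figures, a speedup of s× corresponds to a wall-clock reduction of 1 - 1/s. The sketch below applies that to the reported 1.8× and 2.5× numbers. It also shows, using Amdahl's law, how a rollout-only speedup could translate into an end-to-end speedup. The function names and the `rollout_fraction` value of 0.7 are hypothetical illustrations and do not come from the source.

```python
def time_reduction(speedup: float) -> float:
    """Fraction of wall-clock time saved by a given speedup factor."""
    return 1.0 - 1.0 / speedup


def end_to_end_speedup(rollout_fraction: float, rollout_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the rollout phase gets faster.

    rollout_fraction is the share of a training step spent generating
    rollouts before any acceleration (an assumed input, not a reported one).
    """
    remaining = (1.0 - rollout_fraction) + rollout_fraction / rollout_speedup
    return 1.0 / remaining


# Figures reported in the article.
projected_saving = time_reduction(2.5)  # projected end-to-end speedup at 235B
rollout_saving = time_reduction(1.8)    # rollout generation speedup at 8B

# Hypothetical: assume rollouts take 70% of each training step.
assumed_overall = end_to_end_speedup(rollout_fraction=0.7, rollout_speedup=1.8)

print(f"2.5x end-to-end: {projected_saving:.0%} less wall-clock time")
print(f"1.8x rollout:    {rollout_saving:.0%} less rollout time")
print(f"Assumed 70% rollout share at 1.8x: {assumed_overall:.2f}x overall")
```

A 2.5× speedup removes 60% of the wall-clock time, which is consistent with the "over half" claim. The Amdahl's law function is a general model only: it shows that the end-to-end gain depends on how much of each step is spent in rollout generation, a quantity the source does not state here.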