Scaling NVFP4 Inference for FLUX.2 on NVIDIA Blackwell Data Center GPUs


NVIDIA Technical Blog · Sandro Cavallari · about 1 month ago




In 2025, NVIDIA partnered with Black Forest Labs (BFL) to optimize the FLUX.1 text-to-image model series, unlocking FP4 image generation performance on NVIDIA Blackwell GeForce RTX 50 Series GPUs. As a natural extension of the latent diffusion model, FLUX.1 Kontext [dev] proved that in-context learning is a feasible technique for visual-generation models, not just large language models (LLMs). To make this experience more widely accessible, NVIDIA collaborated with BFL to enable a near real-time editing experience using low-precision quantization. FLUX.2 is a significant leap forward, offering the public multi-image references and quality comparable to the best enterprise models. However, because FLUX.2 [dev] requires substantial compute resources, BFL, Comfy, and NVIDIA collaborated to reduce the FLUX.2 [dev] memory requirement by more than 40% and to enable local deployment through ComfyUI.…
