NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.

What's New

NVIDIA today announced the release of Nemotron 3 Nano Omni, an open multimodal model that unifies reasoning across video, audio, image, and text modalities. The model is designed to process and reason about multiple input types simultaneously, moving beyond the text- or image-only capabilities of many existing open models.

This is not a flagship large model but a "Nano" variant, suggesting a focus on efficiency and edge deployment rather than raw parameter count. The "Omni" designation implies native multimodal fusion rather than bolted-on modality adapters.…