Advanced audio dialog and generation with Gemini 2.5


Google·Ankur Bapna·about 1 month ago



Here’s a closer look at what’s new in Gemini 2.5 for audio dialog and generation.

Tara Sainath, Distinguished Research Scientist

Gemini is built from the ground up to be multimodal, natively understanding and generating content across text, images, audio, video and code. At I/O we showed how Gemini 2.5 marks a significant step forward with new capabilities in AI-powered audio dialog and generation.

We’re already using these models to bring audio to users globally, across numerous products, prototypes and languages. NotebookLM’s Audio Overviews and Project Astra are just two examples. Here’s a closer look at what you can do with Gemini 2.5 native audio capabilities.

Real-time audio dialog

Human conversation is rich and nuanced, with meaning conveyed not just by what is said, but by how it’s spoken, through tone, accent and even non-speech vocalizations, like laughter.…
