Menu

Post image 1
Post image 2
Post image 3
Post image 4
1 / 4
0

Building Dynamic Audio with Emotion & Pace: Gemini 3.1 Flash TTS, Angular & Firebase Cloud Functions [GDE]

DEV Community·Connie Leung·about 1 month ago
#aOaMLrG0
Reading 0:00
15s threshold

Google released the Gemini 3.1 Flash TTS Preview model for AI audio generation in the Gemini API, Gemini in Vertex AI, and Gemini AI Studio. This model introduces a new Audio tags feature to exhibit expressive human emotion, pace, and style. This application explores Firebase AI Logic to analyze an uploaded image to generate recommendations, description, alternative tags, and an obscure fact. The obscure fact is sent to a Firebase Cloud Function to generate an audio using a Gemini TTS model. The Cloud Function returns the stream to an Angular application that converts it to a Blob URL object. An audio player sets the URL to the source that users can click the Play button to play the stream. In this blog post, I migrate my application to use the Gemini 3.1 Flash TTS Preview model and create a signal form in Angular to input a scene, emotion, and pace.…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Read More