[Enhancement]: Integration of Gemini 2.0 (audio output) #6067

goreply-de · 2025-02-26T14:51:55Z

What features would you like to see added?

Adding support to generate audio output would be great.

More details

With gemini-2.0-flash-exp, Gemini can now generate audio output. The generated audio response can also be downloaded afterwards.

To access this feature on GCP, navigate to ‘Vertex AI’ → ‘Vertex AI Studio’ → ‘Freeform’, then select ‘gemini-2.0-flash-exp’ as the model and ‘Audio’ as the response output type. The attached recording shows the process (sadly without the audio). The text was read aloud when clicking the play button. Running the same prompt multiple times produces different audio files (different voices, speed, pronunciation, etc.).

The UI needs to be adjusted for models that support this kind of output. Some endpoints probably need to be adjusted as well to handle the new kind of data.
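
To illustrate what "dealing with the new kind of data" could mean on the endpoint side, here is a minimal sketch in Python. It assumes the model response arrives as a list of parts, where text parts carry a `text` field and audio parts carry base64-encoded bytes under `inlineData` with a `mimeType`. That shape is an assumption modeled on how Gemini returns inline media and should be checked against the actual API response. The names `AudioClip` and `split_parts` are hypothetical and not part of any existing codebase.

```python
import base64
from dataclasses import dataclass


@dataclass
class AudioClip:
    """Decoded audio payload plus its MIME type (hypothetical helper type)."""
    mime_type: str
    data: bytes


def split_parts(parts: list[dict]) -> tuple[str, list[AudioClip]]:
    """Separate text parts from base64-encoded audio parts.

    Assumed part shapes (not confirmed by the issue):
      {"text": "..."}
      {"inlineData": {"mimeType": "audio/...", "data": "<base64>"}}
    Parts with any other shape or MIME type are ignored.
    """
    text_chunks: list[str] = []
    clips: list[AudioClip] = []
    for part in parts:
        if "text" in part:
            text_chunks.append(part["text"])
            continue
        inline = part.get("inlineData")
        if inline and inline.get("mimeType", "").startswith("audio/"):
            clips.append(
                AudioClip(inline["mimeType"], base64.b64decode(inline["data"]))
            )
    return "".join(text_chunks), clips
```

With something along these lines, the endpoint could keep returning the text as before and expose each `AudioClip` separately, for example as a downloadable file that the UI plays back.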

Note: The feature is currently still experimental.

Which components are impacted by your request?

UI, General, Endpoints

Pictures

Code of Conduct

I agree to follow this project's Code of Conduct

goreply-de added the ✨ enhancement New feature or request label Feb 26, 2025

Repository owner locked and limited conversation to collaborators Mar 3, 2025

danny-avila converted this issue into a discussion Mar 3, 2025

danny-avila reopened this Mar 3, 2025

Repository owner unlocked this conversation Mar 3, 2025

Repository owner locked and limited conversation to collaborators Mar 3, 2025

danny-avila converted this issue into a discussion Mar 3, 2025

danny-avila reopened this Mar 3, 2025

Repository owner unlocked this conversation Mar 3, 2025
