POST /v1/audio/speech — synthesise speech from text.
Returns binary audio (audio/mpeg or audio/wav, depending on response_format).
Example: model gemini-2.5-flash-preview-tts, voice Kore. Download tts.wav.
voice — model-dependent (alloy, echo, onyx, nova, shimmer, etc.)
response_format — mp3, wav, opus, flac, pcm
speed — 0.25–4.0
API key in format: Bearer inf_***
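As a rough illustration of how the parameters above might be assembled into a request, here is a minimal Python sketch using only the standard library. Several details are assumptions and not taken from this page: the host in `BASE_URL` is a placeholder, the JSON field names `model` and `input` are guessed, the key is assumed to travel in an `Authorization` header, and the default `speed` of 1.0 is illustrative. The helper `build_speech_request` is hypothetical and only builds the request; it does not send it.

```python
import json
import urllib.request

# Placeholder host (assumption); substitute the real API host.
BASE_URL = "https://api.example.com"

# Formats listed for response_format in the parameter description.
ALLOWED_FORMATS = {"mp3", "wav", "opus", "flac", "pcm"}


def build_speech_request(api_key, text, model, voice,
                         response_format="mp3", speed=1.0):
    """Build (but do not send) a POST /v1/audio/speech request."""
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    body = {
        "model": model,   # assumed field name
        "input": text,    # assumed field name for the text to synthesise
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumed header name; the page only gives the value format.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Validating `speed` and `response_format` on the client side is a design choice here, so that obviously invalid values fail before any network call is made.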
Binary audio stream. Content-Type reflects the requested response_format: audio/mpeg (mp3, default), audio/wav, audio/ogg (opus), audio/flac, audio/aac, or audio/pcm. Credits deducted are returned in the x-credits-used response header.
The response is of type file.
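To make the response handling concrete, the following Python sketch picks a file extension from the Content-Type header and reads the x-credits-used header described above. The helper names `describe_response` and `save_audio`, the choice of extension for each media type, and the `.bin` fallback are all assumptions made for illustration. The format of the credits value is not specified on this page, so it is returned as the raw header string.

```python
import shutil

# Media type -> file extension. The media types come from the response
# description; the extensions chosen here are an assumption.
EXTENSIONS = {
    "audio/mpeg": ".mp3",
    "audio/wav": ".wav",
    "audio/ogg": ".opus",
    "audio/flac": ".flac",
    "audio/aac": ".aac",
    "audio/pcm": ".pcm",
}


def describe_response(headers):
    """Return (file extension, raw x-credits-used value or None)."""
    # Drop any parameters after ';' and normalise case before the lookup.
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    extension = EXTENSIONS.get(content_type, ".bin")  # fallback is an assumption
    credits_used = headers.get("x-credits-used")
    return extension, credits_used


def save_audio(response, stem="tts"):
    """Stream a urllib response body to disk; returns (path, credits)."""
    extension, credits_used = describe_response(response.headers)
    path = stem + extension
    with open(path, "wb") as out:
        # Copy in chunks so the whole audio file is never held in memory.
        shutil.copyfileobj(response, out)
    return path, credits_used
```

With a response object from `urllib.request.urlopen`, `save_audio(response)` would write a file such as tts.wav when the server replies with audio/wav.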