Text-to-speech
POST /v1/audio/speech — synthesise speech from text.
Returns binary audio (audio/mpeg, audio/wav, or another audio type, depending on response_format).
Sample output
Generated with gemini-2.5-flash-preview-tts, voice Kore. Download tts.wav.
Parameters
voice: model-dependent (alloy, echo, onyx, nova, shimmer, etc.)
response_format: mp3, wav, opus, flac, pcm
speed: 0.25–4.0
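
As a rough illustration of how these parameters fit together, the sketch below validates them client-side and assembles a JSON body. The field names `input` and `model` are assumptions (this page does not list them), and `build_speech_payload` is a hypothetical helper, not part of any SDK. Only `voice`, `response_format`, and `speed` come from the list above.

```python
import json

# Values taken from the parameter list above.
ALLOWED_FORMATS = {"mp3", "wav", "opus", "flac", "pcm"}
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_speech_payload(text, voice, response_format="mp3", speed=1.0, model=None):
    """Check the documented ranges and return a JSON-serialisable dict."""
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    payload = {
        "input": text,  # assumed field name for the text to synthesise
        "voice": voice,  # valid voices depend on the model
        "response_format": response_format,
        "speed": speed,
    }
    if model is not None:
        payload["model"] = model  # assumed field name
    return payload


payload = build_speech_payload(
    "Hello, world.", voice="alloy", response_format="wav", speed=1.25
)
print(json.dumps(payload, indent=2))
```

Voice names are not validated here because the accepted set is model-dependent.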
Authorizations
API key in format: Bearer inf_***
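
A minimal sketch of attaching the key to a request, using only the Python standard library. The base URL `https://api.example.com` is a placeholder (this page does not state the host), `build_speech_request` is a hypothetical helper, and the `input` field in the sample body is an assumed name.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder; substitute the real API host


def build_speech_request(api_key, payload, base_url=BASE_URL):
    """Build (but do not send) a POST to /v1/audio/speech with a Bearer key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # key has the form inf_***
            "Content-Type": "application/json",
        },
    )


req = build_speech_request(
    "inf_your_key_here",
    {"input": "Hello", "voice": "alloy"},  # "input" is an assumed field name
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without network access or a real key.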
Body
application/json
Response
Binary audio stream. Content-Type reflects the requested response_format: audio/mpeg (mp3, default), audio/wav, audio/ogg (opus), audio/flac, audio/aac, or audio/pcm. Credits deducted are returned in the x-credits-used response header.
The response is of type file.
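
The response headers described above can be inspected before writing the audio to disk. The sketch below maps the Content-Type values listed here to a file extension and reads `x-credits-used`. The extension choices, the `.bin` fallback, and the helper name `describe_audio_response` are assumptions for illustration; the format of the credits value is not specified on this page, so it is returned as the raw string.

```python
# Content-Type values from the response description; extensions are illustrative.
CONTENT_TYPE_TO_EXT = {
    "audio/mpeg": ".mp3",
    "audio/wav": ".wav",
    "audio/ogg": ".opus",
    "audio/flac": ".flac",
    "audio/aac": ".aac",
    "audio/pcm": ".pcm",
}


def describe_audio_response(headers):
    """Return (file_extension, credits_used) from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Drop any parameters after ';' (for example a charset) before the lookup.
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    extension = CONTENT_TYPE_TO_EXT.get(content_type, ".bin")
    credits_used = lowered.get("x-credits-used")  # raw string, or None if absent
    return extension, credits_used


# Header values below are made up for the example.
ext, credits = describe_audio_response(
    {"Content-Type": "audio/wav", "X-Credits-Used": "3"}
)
print(ext, credits)
```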

