Generate an image
POST /v1/images/generations — text-to-image generation.
Sample output

gemini-2.5-flash-image (Nano Banana).
Response formats
response_format: "url" (default) → provider-hosted URL, ephemeral (~1 hour). Download within that window if you need to keep it.
response_format: "b64_json" → inline base64 PNG.
POST /v1/files.
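As a rough illustration of handling both response formats, the sketch below writes one returned image to disk. This page does not show the response schema, so the item shape (a dict carrying either a "b64_json" or a "url" key) is an assumption, and save_image is a hypothetical helper name.

```python
import base64
import urllib.request
from pathlib import Path


def save_image(item: dict, dest: Path) -> Path:
    """Write one generated image to `dest`.

    Assumption: `item` is a single result entry holding either a
    "b64_json" string or a "url" string (shape not confirmed by the docs).
    """
    if item.get("b64_json"):
        # Inline payload: decode the base64 text into raw image bytes.
        dest.write_bytes(base64.b64decode(item["b64_json"]))
    elif item.get("url"):
        # Hosted URLs are ephemeral (~1 hour), so download right away.
        with urllib.request.urlopen(item["url"], timeout=60) as resp:
            dest.write_bytes(resp.read())
    else:
        raise ValueError("item has neither 'b64_json' nor 'url'")
    return dest
```

Requesting "b64_json" avoids the expiry window entirely, at the cost of a larger response body.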
Supported models
See Models catalog filtered to image modality. OpenAI (DALL·E), Google (Imagen, Gemini Image), xAI (Grok Image), Alibaba (Qwen Image), Suno-music, FLUX (self-hosted), all accessed by slug.
Aspect ratios and sizes
Per-model; common values: 1024x1024, 1792x1024, 1024x1792, 512x512. Check a model’s allowedParams.supported_sizes via GET /v1/models/{slug}.
Authorizations
API key in format: Bearer inf_***
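A minimal sketch of an authenticated call, here used to look up a model's allowed sizes before generating. BASE_URL is a placeholder (this page does not state the host), and the helper names are hypothetical; only the Bearer header format, the GET /v1/models/{slug} path, and the allowedParams.supported_sizes field come from the text above.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # hypothetical; substitute the real API host


def model_details_request(slug: str, api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) GET /v1/models/{slug} with a Bearer API key."""
    url = f"{base_url}/v1/models/{quote(slug)}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def supported_sizes(model: dict) -> list[str]:
    """Read allowedParams.supported_sizes from a model record; empty list if absent."""
    return list(model.get("allowedParams", {}).get("supported_sizes", []))
```

To send the request, pass it to urllib.request.urlopen and parse the body with json.load; the exact layout of the returned model record beyond allowedParams.supported_sizes is not documented here.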
Body
Model ID to use for generation
Text prompt describing the image to generate
Number of images (model-specific max: DALL-E 3=1, Imagen=4, GPT Image=4, DALL-E 2=10)
Image size. DALL-E: 256x256-1792x1024. GPT Image: 1024x1024/1024x1536/1536x1024/auto
DALL-E: standard/hd. GPT Image: low/medium/high/auto
DALL-E only. Allowed values: vivid, natural
DALL-E only. GPT Image always returns b64_json. Allowed values: url, b64_json
Aspect ratio (Google models, see /v1/models for allowed values)
Output resolution: Imagen Standard/Ultra "1K"|"2K", Gemini 3 image "1K"|"2K"|"4K"
Person generation policy (Google models). Allowed values: dont_allow, allow_adult, allow_all
Background type (GPT Image models only). Allowed values: transparent, opaque, auto
Output file format (GPT Image models only). Allowed values: png, webp, jpeg
Compression 0-100 (GPT Image, jpeg/webp only)
Match style of input images (GPT Image 1/1.5 only). Allowed values: high, low
Content filter level (GPT Image only). Allowed values: auto, low
What to exclude from the image (Imagen 4)
Seed for reproducibility (Imagen 4)
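The body fields above could be assembled into a request roughly as follows. This is a sketch under assumptions: the parameter names are not shown on this page, so "model" and "prompt" are assumed field names, BASE_URL is a placeholder host, and generation_request is a hypothetical helper. Only the POST /v1/images/generations path, the Bearer header, and response_format are taken from the text.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical; substitute the real API host


def generation_request(api_key: str, model: str, prompt: str,
                       base_url: str = BASE_URL, **options) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/images/generations request.

    Assumption: the model ID and text prompt are sent as "model" and "prompt".
    Extra keyword options are passed through unchanged; None values are dropped
    so that model-specific parameters are only sent when explicitly set.
    """
    body = {"model": model, "prompt": prompt}
    body.update({key: value for key, value in options.items() if value is not None})
    return urllib.request.Request(
        f"{base_url}/v1/images/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

For example, generation_request(key, "gemini-2.5-flash-image", "a red bicycle", response_format="b64_json") yields a request that can be sent with urllib.request.urlopen. Because many options apply to one model family only, omitting unset options is safer than sending defaults.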
Response
Image generation result. When stream=true, returns SSE (text/event-stream) with progress events, then a final chunk with the result.
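When stream=true, the response has to be read as Server-Sent Events. The sketch below is a generic SSE reader, not this API's client: it yields the raw data payload of each event and leaves interpretation to the caller, since the schema of the progress events and the final chunk is not documented on this page. iter_sse_data is a hypothetical helper name.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from decoded text lines.

    Follows the basic text/event-stream framing: "data:" lines accumulate,
    and a blank line ends the event. Other fields (event:, id:, retry:) and
    ":" comment lines are ignored in this sketch.
    """
    parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the event collected so far, if any.
            if parts:
                yield "\n".join(parts)
                parts = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            parts.append(value[1:] if value.startswith(" ") else value)
    if parts:
        # Lenient flush for a stream that ends without a trailing blank line.
        yield "\n".join(parts)
```

With urllib, the lines can come from decoding each line of the open response object as UTF-8. Each yielded payload can then be passed to json.loads if the events turn out to be JSON, which is an assumption to verify against real output.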

