POST /v1/images/generations — text-to-image generation.

Supported models include gemini-2.5-flash-image (Nano Banana).
With response_format: "url" (the default), each image is returned as a provider-hosted URL that is ephemeral (~1 hour); download within that window if you need to keep it. With response_format: "b64_json", the image is returned inline as a base64-encoded PNG. To keep an image past the URL window, store it yourself, for example by uploading it via POST /v1/files.
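For b64_json responses, the image bytes only need base64 decoding. A minimal sketch, assuming the OpenAI-style `{"data": [{"b64_json": ...}]}` response shape:

```python
import base64
import json


def save_images(response_body: str, prefix: str = "image") -> list[str]:
    """Decode each b64_json entry to a PNG file; return the filenames.

    Assumes the response envelope is {"data": [{"b64_json": "..."}]},
    mirroring the OpenAI-style images response.
    """
    paths = []
    for i, item in enumerate(json.loads(response_body)["data"]):
        path = f"{prefix}-{i}.png"
        with open(path, "wb") as f:
            f.write(base64.b64decode(item["b64_json"]))
        paths.append(path)
    return paths
```

Writing the bytes to disk immediately also sidesteps the ~1 hour expiry problem that "url" responses have.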
Common sizes: 1024x1024, 1792x1024, 1024x1792, 512x512. Check a model's allowedParams.supported_sizes via GET /v1/models/{slug}. Authentication: API key in the Authorization header, in the format Bearer inf_***.
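Since supported sizes are model-specific, it can pay to look them up and validate locally before generating. A sketch using only the standard library; the base URL and the JSON envelope around allowedParams.supported_sizes are assumptions, not part of this spec:

```python
import json
import urllib.request

API_BASE = "https://api.example.com"  # assumption: substitute your gateway's base URL


def supported_sizes(slug: str, api_key: str) -> list[str]:
    """Fetch allowedParams.supported_sizes for a model via GET /v1/models/{slug}."""
    req = urllib.request.Request(
        f"{API_BASE}/v1/models/{slug}",
        headers={"Authorization": f"Bearer {api_key}"},  # key format: Bearer inf_***
    )
    with urllib.request.urlopen(req) as resp:
        model = json.load(resp)
    # allowedParams.supported_sizes is named in the docs above; the rest of
    # the JSON shape is an assumption.
    return model["allowedParams"]["supported_sizes"]


def ensure_size(size: str, sizes: list[str]) -> str:
    """Fail fast with a clear local error instead of a provider-side 400."""
    if size not in sizes:
        raise ValueError(f"size {size!r} not in supported_sizes {sizes}")
    return size
```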
Parameters:

model: Model ID to use for generation.
prompt: Text prompt describing the image to generate.
n: Number of images (model-specific max: DALL-E 3 = 1, Imagen = 4, GPT Image = 4, DALL-E 2 = 10).
size: Image size. DALL-E: 256x256 up to 1792x1024. GPT Image: 1024x1024, 1024x1536, 1536x1024, or auto.
quality: DALL-E: standard or hd. GPT Image: low, medium, high, or auto.
style: vivid or natural. DALL-E only.
response_format: url or b64_json. DALL-E only; GPT Image always returns b64_json.
aspect_ratio: Aspect ratio (Google models; see /v1/models for allowed values).
image_size: Output resolution. Imagen Standard/Ultra: "1K" or "2K". Gemini 3 image: "1K", "2K", or "4K".
person_generation: Person generation policy (Google models): dont_allow, allow_adult, or allow_all.
background: transparent, opaque, or auto. Background type (GPT Image models only).
output_format: png, webp, or jpeg. Output file format (GPT Image models only).
output_compression: Compression level 0-100 (GPT Image, jpeg/webp only).
input_fidelity: high or low. Match style of input images (GPT Image 1/1.5 only).
moderation: auto or low. Content filter level (GPT Image only).
negative_prompt: What to exclude from the image (Imagen 4).
seed: Seed for reproducibility (Imagen 4).
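Because each model accepts a different subset of the parameters above, it helps to build the request body from only the options you actually set, letting provider defaults fill the rest. A hypothetical helper (parameter names as in the table above):

```python
import json


def generation_payload(model: str, prompt: str, **options) -> str:
    """Build the JSON body for POST /v1/images/generations.

    `options` passes through optional parameters (n, size, quality,
    aspect_ratio, ...). Which ones a model accepts is model-specific,
    so check its allowedParams via GET /v1/models/{slug} first.
    """
    body = {"model": model, "prompt": prompt}
    # Drop unset options so the provider's defaults apply.
    body.update({k: v for k, v in options.items() if v is not None})
    return json.dumps(body)
```

For example, generation_payload("gemini-2.5-flash-image", "a red banana", aspect_ratio="16:9") sends only model, prompt, and aspect_ratio.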
Returns an image generation result. When stream=true, the response is SSE (text/event-stream): progress events, then a final chunk with the result.
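A minimal way to consume the stream=true response is a small SSE parser. This sketch handles only the `data:` field and blank-line event separators, which covers the progress-then-final-chunk sequence described above; the payloads' JSON shape is whatever the provider emits:

```python
def iter_sse_events(lines):
    """Yield the data payload of each event from a text/event-stream.

    Minimal parser: events end at a blank line, and multi-line data
    fields are joined with newlines, as in the SSE specification.
    """
    data = []
    for line in lines:
        line = line.rstrip("\n")
        if line == "":
            if data:  # blank line terminates the current event
                yield "\n".join(data)
                data = []
        elif line.startswith("data:"):
            data.append(line[5:].lstrip())
    if data:  # flush a final event with no trailing blank line
        yield "\n".join(data)
```

In practice you would wrap the HTTP response's line iterator with this and json-decode each yielded payload, treating the last one as the result.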