Use `GET /v1/models/{slug}` to branch feature-by-feature in your app.
| Capability | Meaning |
|---|---|
| `supportsChat` | Accepts the chat-completions format |
| `supportsStreaming` | Can stream tokens via SSE |
| `supportsVision` | Accepts image inputs |
| `supportsPdf` | Accepts PDFs natively (otherwise the gateway auto-converts) |
| `supportsTools` | Honours the `tools` array + `tool_choice` |
| `supportsJsonMode` | `response_format: {"type": "json_object"}` or JSON schema |
| `supportsImages` | Generates images as output (distinct from vision input) |
| `isReasoning` | Chain-of-thought reasoning model (slower but more accurate on complex tasks) |
| `isFlagship` | Provider's top-of-the-line model (highest quality and cost) |
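A minimal sketch of capability-gated request building, using the flag names from the table above. The helper name `pick_request_options` and the returned option shape are illustrative assumptions, not part of the gateway API:

```python
def pick_request_options(caps: dict) -> dict:
    """Derive request options from a model's capability flags
    (as returned by GET /v1/models/{slug})."""
    opts = {"stream": caps.get("supportsStreaming", False)}
    if caps.get("supportsJsonMode"):
        opts["response_format"] = {"type": "json_object"}
    if caps.get("isReasoning"):
        # Reasoning models may deliver only the final result, not token streams.
        opts["stream"] = False
    return opts

# Example: a streaming, JSON-capable, non-reasoning model
print(pick_request_options({"supportsStreaming": True, "supportsJsonMode": True}))
```

Branching on the returned flags instead of hard-coding model names keeps the app working as providers add or retire models.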
Provider-by-provider highlights
OpenAI
- gpt-5-4 — flagship, all capabilities
- gpt-4o — vision ✓, tools ✓, JSON ✓, streaming ✓
- o3 / o1 — reasoning ✓ (no streaming tokens — final result only)
- gpt-4o-audio — audio input natively in chat
Anthropic
- claude-opus-4-7 — flagship, vision ✓, PDF ✓, tools ✓
- claude-opus-4-6 — vision ✓, PDF ✓, tools ✓
- claude-sonnet-4-6 — same with better cost/perf
- Audio input — not supported; use `/v1/audio/transcriptions` first
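The transcribe-then-chat fallback amounts to two gateway calls. The endpoint paths come from this doc; the request shapes and the `transcribe_then_chat` helper are assumptions for illustration:

```python
def transcribe_then_chat(audio_path: str, model: str) -> list:
    """Describe the two gateway calls needed to feed audio to a
    chat-only model: transcribe first, then chat with the text."""
    return [
        {"method": "POST", "path": "/v1/audio/transcriptions",
         "body": {"file": audio_path}},
        {"method": "POST", "path": "/v1/chat/completions",
         "body": {"model": model,
                  "messages": [{"role": "user",
                                "content": "<transcript goes here>"}]}},
    ]

calls = transcribe_then_chat("call.wav", "claude-sonnet-4-6")
print(calls[0]["path"])
```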
Google
- gemini-2-5-pro — vision ✓, PDF ✓, 2M context window
- gemini-2-5-flash — fast, huge context, multimodal
- gemini-2-0-flash — budget option
xAI
- grok-3 — vision ✓, JSON ✓
Alibaba
- qwen3-max — vision + PDF + long context (262k)
- qwen3-5-plus — 1M context
- qwen-vl-max — vision-specialised
PDF fallback
If a model without `supportsPdf` receives a PDF, our gateway:
- Extracts text via `pdftotext`
- Renders pages to PNG (for vision models)
- Injects the text + images into the message
- Bills a small `pdf_extraction_per_page` fee
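The steps above can be sketched as follows. The message-content shape and the helper names are assumptions; only the text/image fallback logic and the per-page fee come from this doc:

```python
def build_pdf_fallback_message(text_pages: list, png_pages: list,
                               supports_vision: bool) -> dict:
    """Combine extracted page text (and rendered page images, for
    vision models) into a single user message."""
    content = [{"type": "text", "text": "\n\n".join(text_pages)}]
    if supports_vision:
        content += [{"type": "image", "data": png} for png in png_pages]
    return {"role": "user", "content": content}

def extraction_fee(num_pages: int, per_page_fee: int) -> int:
    # The pdf_extraction_per_page fee is billed once per extracted page.
    return num_pages * per_page_fee

msg = build_pdf_fallback_message(["p1 text", "p2 text"], [b"png1", b"png2"], True)
print(len(msg["content"]))  # 1 text part + 2 page images -> 3
```

For non-vision models the image parts are skipped and only the extracted text is injected.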

