2026-04
- Files API — OpenAI-compatible
POST /v1/filesplusfile_idreferences in chat completions. Workspace-scoped, MIME-sniffed, idempotent. See Files API. - Playground file access scope — Playground now shares the same workspace file pool as the API; uploads from either side are referenceable from both.
- Billing role — new
Billingworkspace role: invoices, payment methods and Playground access, no API key management. See Members and roles. - Promo site — fresh marketing site at
infery.aiwith full legal suite (Privacy, Terms, AUP, Subprocessors). - Video polling mirror enqueue fix — generated videos no longer occasionally get stuck in
processingstate when the upstream provider returns asynchronously.
2026-03
- Fallback chains — configure per-source-model fallback ladders in Settings → Fallbacks. Headers
x-model-used/x-fallback-from/x-fallback-depthon every response. See Fallback chains. - Music generation —
POST /v1/music/generationswith Suno and Udio backends. - Budget alerts — email + in-app notifications at 50/75/90% of plan + auto-pause on exhaustion.
- OpenAI SDK extras —
credits_usedin the streaming usage chunk, ignored by upstream SDKs but readable by raw parsers.
2026-02
- Streaming for all chat models — including Anthropic and Google, normalised to OpenAI SSE format.
- Vision + PDF — automatic PDF-to-image conversion on the gateway for models without native PDF support.
- Quotas and presets — workspace-level monthly token caps, per-key rate-limit profiles. See Quotas.
2026-01
- Public launch.
- OpenAI-compatible chat, embeddings, image, audio, video endpoints.
- OpenAI, Anthropic, Google, xAI, DeepSeek, Qwen providers.
- Playground with chat, image, video, audio modalities.
- Subscription plans + topups via Stripe.
What’s next (tentative)
These are not commitments — see the roadmap for the canonical list.- Batch API —
POST /v1/batchesfor cheap async bulk inference (50% discount on most models). - Workspace-level PRC opt-in — replace the support-email gate for Qwen/DeepSeek with a self-serve toggle.
- Fine-tuning passthrough — submit + monitor OpenAI / Google fine-tune jobs through the Infery key.
- Realtime API — websocket bidirectional audio for voice agents.
- EU region — primary processing in
europe-west4with full data residency.
Staying informed
- RSS of this changelog:
https://docs.infery.ai/reference/changelog/rss.xml - Email on every minor or major release: opt in at Settings → Notifications → Product updates
- Status page:
status.infery.ai

