Per-API-key limits

Primary rate-limiting is per API key, implemented as a sliding window in Redis. Limit varies by plan and any attached quota preset:
Plan         Default RPM   Configurable max (preset)
Free         10            10
Starter      30            60
Growth       60            120
Pro          120           240
Business     200           400
Scale        400           800
Enterprise   Custom        Custom
You can see the effective value for each key in Settings → API Keys.

Exceeded response

HTTP 429 Too Many Requests
Retry-After: 12
{
  "error": {
    "message": "Rate limit: 30 req/min on this API key",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
The Retry-After header tells you how many seconds until your window frees up. Honour it.
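As a minimal sketch (helper name and defaults are hypothetical, not part of the gateway SDK), a client can honour Retry-After when the header is present and fall back to exponential backoff with jitter otherwise:

```python
import random

def backoff_delay(attempt, headers=None, base=1.0, cap=30.0, max_jitter=0.5):
    """Seconds to wait before retrying a rate-limited request.

    Prefer the server's Retry-After header when present; otherwise
    fall back to exponential backoff (1s, 2s, 4s, ... capped) plus
    0-500 ms of random jitter to avoid synchronised retries.
    """
    headers = headers or {}
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # malformed header: fall through to backoff
    return min(base * (2 ** attempt), cap) + random.uniform(0, max_jitter)
```

Call it after each 429, with attempt counting from 0, and sleep for the returned number of seconds before retrying.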

Daily token budgets

In addition to RPM, some plans also cap total tokens per day (rateLimitTpd in your quota preset). Exceeding the daily budget returns the same 429 status with a different error message.

Global safety net

We enforce a ceiling of 5 000 requests per 10 minutes per IP at the edge to stop scraping. It almost never trips for legitimate users; in practice it only catches poorly configured crawlers.

Best practices

  1. Always respect Retry-After. Don’t hammer.
  2. Use exponential backoff with jitter. 1s → 2s → 4s → 8s (+random 0–500 ms) is fine.
  3. Consider fallback chains. If 429s on gpt-4o matter, set a fallback to gpt-4o-mini or gemini-flash — the gateway handles retry for you.
  4. Use separate keys per environment. Dev traffic with a 30 rpm preset, prod on 400. Never mix.
  5. Queue on your side too. For batch workloads, implement a local rate limiter that stays below your key’s RPM — that way short bursts don’t get rejected.
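The local rate limiter in point 5 can be sketched as a client-side token bucket (class name and the injectable clock/sleep hooks are illustrative, not a gateway API): acquire() blocks until a request slot is free, so short bursts queue locally instead of being rejected with 429s.

```python
import time

class LocalRateLimiter:
    """Client-side token bucket that stays below an API key's RPM.

    Tokens refill continuously at rpm/60 per second; a full bucket
    allows a burst of up to `rpm` requests before acquire() blocks.
    """

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.capacity = rpm
        self.tokens = float(rpm)
        self.rate = rpm / 60.0  # tokens refilled per second
        self.clock = clock      # injectable for testing
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self):
        """Block until one request slot is available, then consume it."""
        self._refill()
        while self.tokens < 1:
            self.sleep((1 - self.tokens) / self.rate)
            self._refill()
        self.tokens -= 1
```

Wrap every outbound call in limiter.acquire() and set the RPM slightly below your key's preset to leave headroom for retries.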

Rate limits on specific endpoints

  • /contact/sales: 5 req / hour per IP (spam protection)
  • /public/plans and /public/models: 30 req / min per IP
  • /v1/files upload — per-key RPM applies; additionally serialised by workspace (one upload at a time)