Video generation is asynchronous — you submit a job, then poll until it finishes. For the raw endpoint see Videos API. This guide covers the patterns you’ll actually write.

Pick a model

| Model | Strong for | Max duration | Notes |
| --- | --- | --- | --- |
| veo-3-1 (Google) | Photorealism, motion coherence | 8 s | Slower, higher cost |
| veo-3-1-fast | Iteration | 4 s | Default for prototyping |
| sora-2 (OpenAI) | Long, narrative scenes | 20 s | Limited availability |
| wan-2-1 (Alibaba) | Text-to-video | 6 s | PRC opt-in required |
| grok-video | Realistic, low refusal | 6 s | xAI |
Filter the catalog with GET /v1/models?modality=video, and check each model's allowedParams for its supported resolution, aspect_ratio, and duration_seconds values.
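Client-side, that check is a simple filter over the returned list. A sketch, assuming an OpenAI-style `{"data": [...]}` envelope and `allowedParams` keyed by parameter name (verify both shapes against the Videos API):

```python
def video_models_supporting(catalog: dict, resolution: str) -> list[str]:
    """Return model ids whose allowedParams include the given resolution."""
    return [
        m["id"]
        for m in catalog.get("data", [])
        if resolution in m.get("allowedParams", {}).get("resolution", [])
    ]

# catalog = httpx.get("https://api.infery.ai/v1/models",
#                     params={"modality": "video"},
#                     headers={"Authorization": "Bearer YOUR_KEY"}).json()
catalog = {  # sample response shape for illustration
    "data": [
        {"id": "veo-3-1-fast", "allowedParams": {"resolution": ["720p"]}},
        {"id": "sora-2", "allowedParams": {"resolution": ["720p", "1080p"]}},
    ]
}
print(video_models_supporting(catalog, "1080p"))  # → ['sora-2']
```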

Sample output

4-second clip from veo-3.0-fast-generate-001, 16:9, prompt: “A vibrant tropical coral reef in crystal-clear turquoise water, colorful fish darting between coral, soft golden sunlight streaming down, slow cinematic dolly-in, rich saturated colors.”

Submit + poll pattern

The minimum viable client:
python
import time

# Assumes `client` is a preconfigured HTTP client (base URL + auth header),
# e.g. httpx.Client(base_url="https://api.infery.ai", headers={...}).

def generate_video(prompt: str, model: str = "veo-3-1-fast") -> dict:
    job = client.post(
        "/v1/videos/generations",
        json={
            "model": model,
            "prompt": prompt,
            "duration_seconds": 4,
            "resolution": "720p",
            "aspect_ratio": "16:9",
        },
    ).json()
    job_id = job["job_id"]

    while True:
        status = client.get(f"/v1/videos/generations/{job_id}").json()
        if status["status"] == "completed":
            return status["result"]
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "video generation failed"))
        time.sleep(5)
Poll every 5 seconds — polling more often wastes calls and can hit your RPM limit. Most videos finish in 30–90 s; the hard timeout is 1 hour.
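The loop above has no client-side deadline, so a stuck job holds it for the full hour. A sketch that adds one, with the status fetch injected so it works with whatever HTTP client you use:

```python
import time

def poll_until_done(fetch_status, interval: float = 5.0,
                    deadline: float = 600.0) -> dict:
    """Poll fetch_status() every `interval` seconds until the job
    completes, fails, or `deadline` seconds elapse."""
    start = time.monotonic()
    while True:
        status = fetch_status()
        if status["status"] == "completed":
            return status["result"]
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "video generation failed"))
        if time.monotonic() - start > deadline:
            raise TimeoutError(f"gave up after {deadline}s")
        time.sleep(interval)

# fetch_status wraps: client.get(f"/v1/videos/generations/{job_id}").json()
```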

Async with progress

In a UI, surface progress (0–100) so users see motion:
node
async function* watchJob(jobId: string) {
  while (true) {
    const r = await fetch(`https://api.infery.ai/v1/videos/generations/${jobId}`, {
      headers: { Authorization: `Bearer ${process.env.INFERY_API_KEY}` },
    }).then((r) => r.json());

    yield r;

    if (r.status === 'completed' || r.status === 'failed') return;
    await new Promise((res) => setTimeout(res, 5000));
  }
}

for await (const update of watchJob(jobId)) {
  ui.setProgress(update.progress ?? 0);
  if (update.status === 'completed') ui.showVideo(update.result.url);
}
For server-side workflows, push the job_id onto a queue and let a worker poll. Never block an HTTP request handler waiting for a video — the request will time out long before the video is ready.
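That worker pattern can be sketched with a stdlib queue standing in for your real one (Redis, SQS, etc.); `check_job` wraps the GET shown earlier:

```python
import queue
import time

def run_worker(jobs: "queue.Queue[str]", check_job, on_done,
               interval: float = 5.0) -> None:
    """Drain job ids from the queue, re-enqueueing any still running."""
    while not jobs.empty():
        job_id = jobs.get()
        status = check_job(job_id)  # e.g. GET /v1/videos/generations/{job_id}
        if status["status"] in ("completed", "failed"):
            on_done(job_id, status)
        else:
            jobs.put(job_id)        # still running: check again next pass
            time.sleep(interval)
```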

Image-to-video

Animate a still image. Pass the source via image_url or inline base64:
python
job = client.post("/v1/videos/generations", json={
    "model": "veo-3-1",
    "prompt": "Slow camera dolly-in, gentle wind in the leaves",
    "image_url": "https://cdn.example.com/forest.jpg",
    "duration_seconds": 5,
    "aspect_ratio": "16:9",
}).json()
Image-to-video typically gives better physical coherence than text-to-video for tricky subjects (faces, hands, complex props) — start from the still you want and let the model supply only the motion.
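For the inline-base64 form, the exact parameter name isn't documented here, so `image_b64` below is a hypothetical field name; check the Videos API reference for the real one:

```python
import base64

def image_to_video_payload(image_bytes: bytes, prompt: str) -> dict:
    """Build a job payload with the still inlined as base64.
    NOTE: `image_b64` is a hypothetical field name; consult the
    Videos API reference for the actual inline-image parameter."""
    return {
        "model": "veo-3-1",
        "prompt": prompt,
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "duration_seconds": 5,
        "aspect_ratio": "16:9",
    }

payload = image_to_video_payload(b"\x89PNG...", "Slow camera dolly-in")
# client.post("/v1/videos/generations", json=payload)
```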

Persistence

The result.url is already mirrored to our storage — it’s a signed URL pointing at GCS, not the upstream provider’s ephemeral URL. It survives upstream URL expiry. To download or pin permanently:
python
import httpx
mp4 = httpx.get(result["url"]).content
f = client.files.create(file=("hero.mp4", mp4), purpose="user_data")
# Now f.id is a forever-stable handle in your workspace
The mirror happens automatically on completion — no extra call needed. If the mirror itself fails (rare), we still return the upstream URL with result.mirror_status: "failed" so you can retry from your side.

Prompting

Video models reward physical coherence over visual fidelity:
  • Describe motion, not just appearance: “camera tracks left as the cyclist rides past” beats “a cyclist”.
  • Specify subject continuity: “the same red car remains in frame throughout”.
  • Limit complexity: 1–2 subjects, 1 camera move, 1 lighting condition.
  • For people: shorter clips and specific actions (“nodding”, “turning to look at camera”) avoid the “wandering eyes” failure mode.

Cost ballpark

Per 4-second 720p clip:
  • Veo 3.1 fast: ~80 credits
  • Veo 3.1: ~180 credits
  • Sora 2: ~200 credits
  • Wan 2.1: ~40 credits
Your invoice line shows model + duration + resolution. Use Veo fast or Wan for iteration; promote to Veo 3.1 / Sora for finals.
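Those ballpark figures as a quick budgeting helper (illustrative only; the invoice is authoritative):

```python
# Approximate credits per 4-second 720p clip, from the table above.
CREDITS_PER_CLIP = {
    "veo-3-1-fast": 80,
    "veo-3-1": 180,
    "sora-2": 200,
    "wan-2-1": 40,
}

def estimate_credits(model: str, n_clips: int) -> int:
    """Rough budget for n clips of a given model."""
    return CREDITS_PER_CLIP[model] * n_clips

print(estimate_credits("veo-3-1-fast", 10))  # → 800
```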

When jobs fail

Common reasons:
  • Safety refusal from the provider — 400-class, won’t be retried by fallback. Reword and resubmit.
  • Provider quota — 429 on the upstream. Configure a fallback chain and let a different model take it.
  • Upstream timeout — rare; your job is auto-cancelled at 1 hour with status: "failed" and error: "timeout". Resubmit.
The job_id is logged in Settings → Usage → Recent requests for traceability.
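A dispatch sketch over those failure modes; the substring matching below is an assumption, so align it with the error strings your jobs actually return:

```python
def next_action(status: dict) -> str:
    """Map a failed job's error to a next step. The substring checks
    are assumptions: match on whatever your error payloads contain."""
    err = (status.get("error") or "").lower()
    if "timeout" in err:
        return "resubmit"   # upstream timeout: safe to retry as-is
    if "safety" in err or "refus" in err:
        return "reword"     # provider refusal: retries won't help
    if "quota" in err or "429" in err:
        return "fallback"   # upstream 429: route to another model
    return "inspect"        # unknown: check Settings → Usage
```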