Video generation is asynchronous — you submit a job, then poll until it finishes. For the raw endpoint see Videos API. This guide covers the patterns you’ll actually write.

Pick a model

| Model | Strong for | Max duration | Notes |
| --- | --- | --- | --- |
| veo-3-1 (Google) | Photorealism, motion coherence | 8 s | Slower, higher cost |
| veo-3-1-fast | Iteration | 4 s | Default for prototyping |
| sora-2 (OpenAI) | Long, narrative scenes | 20 s | Limited availability |
| wan-2-1 (Alibaba) | Text-to-video | 6 s | PRC opt-in required |
| grok-video | Realistic, low refusal | 6 s | xAI |
Filter the catalog with GET /v1/models?modality=video, and check each model's allowedParams for its supported resolution, aspect_ratio, and duration_seconds values.
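Client-side, that check is a simple filter over the returned list. A sketch, assuming an OpenAI-style `{"data": [...]}` envelope and `allowedParams` keyed by parameter name (verify both shapes against the Videos API):

```python
def video_models_supporting(catalog: dict, resolution: str) -> list[str]:
    """Return model ids whose allowedParams include the given resolution."""
    return [
        m["id"]
        for m in catalog.get("data", [])
        if resolution in m.get("allowedParams", {}).get("resolution", [])
    ]

# catalog = httpx.get("https://api.infery.ai/v1/models",
#                     params={"modality": "video"},
#                     headers={"Authorization": "Bearer YOUR_KEY"}).json()
catalog = {  # sample response shape for illustration
    "data": [
        {"id": "veo-3-1-fast", "allowedParams": {"resolution": ["720p"]}},
        {"id": "sora-2", "allowedParams": {"resolution": ["720p", "1080p"]}},
    ]
}
print(video_models_supporting(catalog, "1080p"))  # → ['sora-2']
```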

Sample output

4-second clip from veo-3.0-fast-generate-001, 16:9, prompt: “A vibrant tropical coral reef in crystal-clear turquoise water, colorful fish darting between coral, soft golden sunlight streaming down, slow cinematic dolly-in, rich saturated colors.”

Submit + poll pattern

The minimum viable client:
python
import time

# Assumes `client` is a preconfigured HTTP client (base URL + auth header),
# e.g. httpx.Client(base_url="https://api.infery.ai", headers={...}).

def generate_video(prompt: str, model: str = "veo-3-1-fast") -> dict:
    job = client.post(
        "/v1/videos/generations",
        json={
            "model": model,
            "prompt": prompt,
            "duration_seconds": 4,
            "resolution": "720p",
            "aspect_ratio": "16:9",
        },
    ).json()
    job_id = job["job_id"]

    while True:
        status = client.get(f"/v1/videos/generations/{job_id}").json()
        if status["status"] == "completed":
            return status["result"]
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "video generation failed"))
        time.sleep(5)
Poll every 5 seconds — polling more often wastes calls and can hit your RPM limit. Most videos finish in 30–90 s; the hard timeout is 1 hour.
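The loop above has no client-side deadline, so a stuck job holds it for the full hour. A sketch that adds one, with the status fetch injected so it works with whatever HTTP client you use:

```python
import time

def poll_until_done(fetch_status, interval: float = 5.0,
                    deadline: float = 600.0) -> dict:
    """Poll fetch_status() every `interval` seconds until the job
    completes, fails, or `deadline` seconds elapse."""
    start = time.monotonic()
    while True:
        status = fetch_status()
        if status["status"] == "completed":
            return status["result"]
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "video generation failed"))
        if time.monotonic() - start > deadline:
            raise TimeoutError(f"gave up after {deadline}s")
        time.sleep(interval)

# fetch_status wraps: client.get(f"/v1/videos/generations/{job_id}").json()
```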

Async with progress

In a UI, surface progress (0–100) so users see motion:
node
async function* watchJob(jobId: string) {
  while (true) {
    const r = await fetch(`https://api.infery.ai/v1/videos/generations/${jobId}`, {
      headers: { Authorization: `Bearer ${process.env.INFERY_API_KEY}` },
    }).then((r) => r.json());

    yield r;

    if (r.status === 'completed' || r.status === 'failed') return;
    await new Promise((res) => setTimeout(res, 5000));
  }
}

for await (const update of watchJob(jobId)) {
  ui.setProgress(update.progress ?? 0);
  if (update.status === 'completed') ui.showVideo(update.result.url);
}
For server-side workflows, push the job_id onto a queue and let a worker poll. Never block an HTTP request handler waiting for a video — the request will time out long before the video is ready.
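That worker pattern can be sketched with a stdlib queue standing in for your real one (Redis, SQS, etc.); `check_job` wraps the GET shown earlier:

```python
import queue
import time

def run_worker(jobs: "queue.Queue[str]", check_job, on_done,
               interval: float = 5.0) -> None:
    """Drain job ids from the queue, re-enqueueing any still running."""
    while not jobs.empty():
        job_id = jobs.get()
        status = check_job(job_id)  # e.g. GET /v1/videos/generations/{job_id}
        if status["status"] in ("completed", "failed"):
            on_done(job_id, status)
        else:
            jobs.put(job_id)        # still running: check again next pass
            time.sleep(interval)
```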

Image-to-video

Animate a still image. Pass the source via image_url or inline base64:
python
job = client.post("/v1/videos/generations", json={
    "model": "veo-3-1",
    "prompt": "Slow camera dolly-in, gentle wind in the leaves",
    "image_url": "https://cdn.example.com/forest.jpg",
    "duration_seconds": 5,
    "aspect_ratio": "16:9",
}).json()
Image-to-video typically gives better physical coherence than text-to-video for tricky subjects (faces, hands, complex props) — start from the still you want and let the model supply only the motion.
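For the inline-base64 form, the exact parameter name isn't documented here, so `image_b64` below is a hypothetical field name; check the Videos API reference for the real one:

```python
import base64

def image_to_video_payload(image_bytes: bytes, prompt: str) -> dict:
    """Build a job payload with the still inlined as base64.
    NOTE: `image_b64` is a hypothetical field name; consult the
    Videos API reference for the actual inline-image parameter."""
    return {
        "model": "veo-3-1",
        "prompt": prompt,
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "duration_seconds": 5,
        "aspect_ratio": "16:9",
    }

payload = image_to_video_payload(b"\x89PNG...", "Slow camera dolly-in")
# client.post("/v1/videos/generations", json=payload)
```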

Persistence

The result.url is already mirrored to our storage — it’s a signed URL pointing at GCS, not the upstream provider’s ephemeral URL. It survives upstream URL expiry. To download or pin permanently:
python
import httpx
mp4 = httpx.get(result["url"]).content
f = client.files.create(file=("hero.mp4", mp4), purpose="user_data")
# Now f.id is a forever-stable handle in your workspace
The mirror happens automatically on completion — no extra call needed. If the mirror itself fails (rare), we still return the upstream URL with result.mirror_status: "failed" so you can retry from your side.

Prompting

Video models reward physical coherence over visual fidelity:
  • Describe motion, not just appearance: “camera tracks left as the cyclist rides past” beats “a cyclist”.
  • Specify subject continuity: “the same red car remains in frame throughout”.
  • Limit complexity: 1–2 subjects, 1 camera move, 1 lighting condition.
  • For people: shorter clips and specific actions (“nodding”, “turning to look at camera”) avoid the “wandering eyes” failure mode.

Cost ballpark

Per 4-second 720p clip:
  • Veo 3.1 fast: ~80 credits
  • Veo 3.1: ~180 credits
  • Sora 2: ~200 credits
  • Wan 2.1: ~40 credits
Your invoice line shows model + duration + resolution. Use Veo fast or Wan for iteration; promote to Veo 3.1 / Sora for finals.
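Those ballpark figures as a quick budgeting helper (illustrative only; the invoice is authoritative):

```python
# Approximate credits per 4-second 720p clip, from the table above.
CREDITS_PER_CLIP = {
    "veo-3-1-fast": 80,
    "veo-3-1": 180,
    "sora-2": 200,
    "wan-2-1": 40,
}

def estimate_credits(model: str, n_clips: int) -> int:
    """Rough budget for n clips of a given model."""
    return CREDITS_PER_CLIP[model] * n_clips

print(estimate_credits("veo-3-1-fast", 10))  # → 800
```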

When jobs fail

Common reasons:
  • Safety refusal from the provider — 400-class, won’t be retried by fallback. Reword and resubmit.
  • Provider quota — 429 on the upstream. Configure a fallback chain and let a different model take it.
  • Upstream timeout — rare; your job is auto-cancelled at 1 hour with status: "failed" and error: "timeout". Resubmit.
The job_id is logged in Settings → Usage → Recent requests for traceability.
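A dispatch sketch over those failure modes; the substring matching below is an assumption, so align it with the error strings your jobs actually return:

```python
def next_action(status: dict) -> str:
    """Map a failed job's error to a next step. The substring checks
    are assumptions: match on whatever your error payloads contain."""
    err = (status.get("error") or "").lower()
    if "timeout" in err:
        return "resubmit"   # upstream timeout: safe to retry as-is
    if "safety" in err or "refus" in err:
        return "reword"     # provider refusal: retries won't help
    if "quota" in err or "429" in err:
        return "fallback"   # upstream 429: route to another model
    return "inspect"        # unknown: check Settings → Usage
```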