Overview

LemonData provides access to video generation models through a single unified API. Video generation is asynchronous: submit a request, receive a task ID and poll_url, then poll for the final result.
The model inventory changes over time. For the latest public availability, use the Models API or visit the Models page.
If a create response returns poll_url, call that exact URL. When it points to /v1/tasks/{id}, treat that as the canonical fixed status endpoint.
Audio behavior is model-dependent. In LemonData, Veo 3 family requests default to audio-on when output_audio is omitted. Some public models are silent-only or do not expose a stable toggle.
For production integrations, prefer publicly reachable https URLs over inline base64 for images, videos, and audio. Inline data: URLs are still supported by compatible models, but URLs are easier to retry, inspect, and debug.

Async Workflow
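The submit-then-poll flow described in the overview can be sketched as follows. This is an illustrative sketch, not SDK code: `BASE`, the API key, the `status` field name, and the `succeeded` / `failed` terminal values are assumptions; check the task payloads your account actually returns. The fallback to `/tasks/{id}` follows the `poll_url` guidance above.

```python
import time
import requests

# Placeholders for illustration -- substitute your real LemonData
# base URL and API key.
BASE = "https://api.example.com/v1"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

def resolve_poll_url(task: dict, base: str) -> str:
    """Prefer the poll_url returned by the create response; otherwise
    fall back to the /tasks/{id} pattern described above."""
    return task.get("poll_url") or f"{base}/tasks/{task['id']}"

def generate_video(payload: dict, interval: float = 5, timeout: float = 600) -> dict:
    """Submit a generation request, then poll until a terminal status.

    The "status" key and its terminal values are assumptions for this
    sketch; adapt them to the actual task schema.
    """
    create = requests.post(f"{BASE}/videos/generations", headers=headers, json=payload)
    create.raise_for_status()
    poll_url = resolve_poll_url(create.json(), BASE)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = requests.get(poll_url, headers=headers).json()
        if status.get("status") in ("succeeded", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError("video generation did not complete in time")
```

Polling every few seconds with an overall deadline keeps clients robust against slow generations without hammering the status endpoint.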

Public Operations

LemonData’s current public video contract centers on these operations:
  • text-to-video
  • image-to-video
  • reference-to-video
  • start-end-to-video
  • video-to-video
  • motion-control
The request contract also accepts audio-to-video and video-extension for model-specific flows, but no currently enabled public model in this docs build advertises either capability.

Capability Matrix

Legend: ✅ Supported by at least one currently enabled public model in that provider family | ❌ Not currently represented by an enabled public model
Series    | T2V | I2V | Reference | Start-End | V2V | Motion
OpenAI    | ✅  | ✅  | ❌        | ❌        | ❌  | ❌
Kuaishou  | ✅  | ✅  | ✅        | ✅        | ✅  | ✅
Google    | ✅  | ✅  | ✅        | ✅        | ❌  | ❌
ByteDance | ✅  | ✅  | ❌        | ❌        | ❌  | ❌
MiniMax   | ✅  | ✅  | ❌        | ❌        | ❌  | ❌
Alibaba   | ✅  | ✅  | ✅        | ❌        | ❌  | ❌
Shengshu  | ✅  | ✅  | ✅        | ✅        | ❌  | ❌
xAI       | ✅  | ✅  | ❌        | ❌        | ✅  | ❌
Other     | ❌  | ❌  | ❌        | ❌        | ✅  | ❌

Capability Definitions

  • T2V (Text-to-Video): Generate video from a text prompt.
  • I2V (Image-to-Video): Animate a starting image. For the broadest compatibility, provide image_url.
  • Reference: Condition generation on one or more reference images via reference_images.
  • Start-End: Control the first and last frames with start_image and end_image.
  • V2V (Video-to-Video): Use an existing video as the primary source input.
  • Motion: Combine a subject image with a motion reference video.
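The definitions above pair each operation with the media inputs it consumes. As a minimal client-side sketch, a lookup table along these lines can catch a missing field before a request is submitted. The `REQUIRED_MEDIA` table and `missing_media` helper are illustrative, not part of any LemonData SDK; field names follow this page.

```python
# Maps each public operation to the media fields it requires,
# per the capability definitions above (illustrative only).
REQUIRED_MEDIA = {
    "text-to-video": [],
    "image-to-video": ["image_url"],
    "reference-to-video": ["reference_images"],
    "start-end-to-video": ["start_image", "end_image"],
    "video-to-video": ["video_url"],
    "motion-control": ["image_url", "video_url"],
}

def missing_media(payload: dict) -> list[str]:
    """Return any media fields the payload's operation still needs."""
    op = payload.get("operation", "text-to-video")
    return [f for f in REQUIRED_MEDIA.get(op, []) if f not in payload]
```

Running this check before the POST turns a would-be provider-side validation error into an immediate local one.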

Current Public Model Inventory

OpenAI

Model                 | Public operations
sora-2                | Text-to-video, image-to-video
sora-2-pro            | Text-to-video, image-to-video
sora-2-pro-storyboard | Image-to-video

Kuaishou

Model                    | Public operations
kling-3.0-motion-control | Motion control
kling-3.0-video          | Text-to-video, image-to-video, start-end-to-video
kling-v2.5-turbo-pro     | Text-to-video, image-to-video, start-end-to-video
kling-v2.5-turbo-std     | Text-to-video, image-to-video
kling-v2.6-pro           | Text-to-video, image-to-video, start-end-to-video
kling-v2.6-std           | Text-to-video, image-to-video
kling-v3.0-pro           | Text-to-video, image-to-video, start-end-to-video
kling-v3.0-std           | Text-to-video, image-to-video, start-end-to-video
kling-video-o1-pro       | Text-to-video, image-to-video, reference-to-video, start-end-to-video, video-to-video
kling-video-o1-std       | Text-to-video, image-to-video, reference-to-video, start-end-to-video, video-to-video

Google

Model       | Public operations
veo3        | Text-to-video, image-to-video
veo3-fast   | Text-to-video, image-to-video
veo3-pro    | Text-to-video, image-to-video
veo3.1      | Text-to-video, image-to-video, reference-to-video, start-end-to-video
veo3.1-fast | Text-to-video, image-to-video, reference-to-video, start-end-to-video
veo3.1-pro  | Text-to-video, image-to-video, start-end-to-video

ByteDance

Model            | Public operations
seedance-1.5-pro | Text-to-video, image-to-video

MiniMax

Model               | Public operations
hailuo-2.3-fast     | Image-to-video
hailuo-2.3-pro      | Text-to-video, image-to-video
hailuo-2.3-standard | Text-to-video, image-to-video

Alibaba

Model        | Public operations
wan-2.2-plus | Text-to-video, image-to-video
wan-2.5      | Text-to-video, image-to-video
wan-2.6      | Text-to-video, image-to-video, reference-to-video

Shengshu

Model           | Public operations
viduq2          | Text-to-video, reference-to-video
viduq2-pro      | Image-to-video, reference-to-video, start-end-to-video
viduq2-pro-fast | Image-to-video, start-end-to-video
viduq2-turbo    | Image-to-video, start-end-to-video
viduq3-pro      | Text-to-video, image-to-video, start-end-to-video
viduq3-turbo    | Text-to-video, image-to-video, start-end-to-video

xAI

Model                       | Public operations
grok-imagine-image-to-video | Image-to-video
grok-imagine-text-to-video  | Text-to-video
grok-imagine-upscale        | Video-to-video

Other

Model               | Public operations
topaz-video-upscale | Video-to-video

Usage Examples

Text-to-Video

import requests

BASE = "https://api.example.com/v1"  # placeholder; use your LemonData base URL
headers = {"Authorization": "Bearer YOUR_API_KEY"}  # placeholder API key

response = requests.post(f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "sora-2",
        "prompt": "A calm cinematic shot of a cat walking through a sunlit garden.",
        "operation": "text-to-video",
        "duration": 4,
        "aspect_ratio": "16:9"
    }
)

Image-to-Video

response = requests.post(f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "hailuo-2.3-standard",
        "prompt": "The scene begins from the provided image and adds gentle natural motion.",
        "operation": "image-to-video",
        "image_url": "https://example.com/portrait.jpg",
        "duration": 6,
        "aspect_ratio": "16:9"
    }
)

Reference-to-Video

response = requests.post(f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "veo3.1",
        "prompt": "Keep the same subject identity and palette while adding subtle motion.",
        "operation": "reference-to-video",
        "reference_images": [
            "https://example.com/ref-a.jpg",
            "https://example.com/ref-b.jpg"
        ],
        "duration": 8,
        "resolution": "720p",
        "aspect_ratio": "9:16"
    }
)

Start-End-to-Video

response = requests.post(f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "viduq2-pro",
        "prompt": "Smooth transition from day to night.",
        "operation": "start-end-to-video",
        "start_image": "https://example.com/city-day.jpg",
        "end_image": "https://example.com/city-night.jpg",
        "duration": 5,
        "resolution": "720p",
        "aspect_ratio": "16:9"
    }
)

Video-to-Video

response = requests.post(f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "topaz-video-upscale",
        "operation": "video-to-video",
        "video_url": "https://example.com/source.mp4",
        "prompt": "Upscale this clip while preserving the original motion."
    }
)

Motion Control

response = requests.post(f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "kling-3.0-motion-control",
        "operation": "motion-control",
        "prompt": "Keep the subject stable while following the motion reference.",
        "image_url": "https://example.com/subject.png",
        "video_url": "https://example.com/motion.mp4",
        "resolution": "720p"
    }
)

Parameters Reference

Parameter            | Type     | Notes
operation            | string   | Explicit operation is recommended in production.
image_url            | string   | Preferred image input form for broad cross-model compatibility.
image                | string   | Inline data URL; useful for debugging and small local integrations.
reference_images     | string[] | Canonical public field for reference-image conditioning.
reference_image_type | string   | Optional asset / style selector when supported.
video_url            | string   | Required for current public video-to-video and motion-control models.
audio_url            | string   | Used by model-specific audio-conditioned flows when available.
output_audio         | boolean  | Veo 3 family defaults to true when omitted.

Model Selection Guide

Best Quality

veo3.1-pro, kling-video-o1-pro, and viduq3-pro are strong choices when fidelity matters more than speed.

Fastest Public Options

veo3.1-fast, hailuo-2.3-fast, and viduq3-turbo are good starting points for faster iteration.

Reference-Heavy Flows

Use veo3.1, veo3.1-fast, wan-2.6, or kling-video-o1-pro / std when you need dedicated reference-image conditioning.

Video-to-Video

topaz-video-upscale, grok-imagine-upscale, and kling-video-o1-pro / std cover the currently enabled public video-to-video paths.

Billing

Billing is model-dependent. Some public video models are effectively priced per request, while others are priced per second. Check the Models page or the Pricing API for the current public price surface.