Overview
For coding agents, discover the current recommended video shortlist first with `GET /v1/models?recommended_for=video`, then send the selected model explicitly to this endpoint, follow the returned `poll_url`, and poll for the result.

For the most reliable polling behavior, follow the exact `poll_url` returned by the create request. If a create response returns `poll_url`, call that exact URL. When it points to `/v1/tasks/{id}`, treat that as the canonical fixed status endpoint.

The async task identifier may surface as `id` or `task_id` depending on the adapter. Treat them as the same task identity.

For production integrations, prefer publicly reachable `https` URLs for images, videos, and audio. Inline `data:` URLs remain supported for compatible models, but large base64 payloads are harder to retry, inspect, and debug.

Request Body
- Video model ID. The API default is `sora-2`. See the Video Generation Guide for the current public model matrix and supported capabilities.
- Text description of the video to generate. Required for most public video models.
- Video operation to run. Supported contract values include `text-to-video`, `image-to-video`, `reference-to-video`, `start-end-to-video`, `video-to-video`, `video-extension`, `audio-to-video`, and `motion-control`. LemonData can infer the operation from the supplied inputs, but an explicit `operation` is recommended for production reliability.
- Publicly accessible URL of the starting image for image-to-video generation. For best cross-model compatibility, prefer `image_url`.
- Inline image as a data URL (for example, `data:image/jpeg;base64,...`). Supported by compatible models, but `image_url` provides the broadest compatibility across public video models.
- Reference image inputs for models that support dedicated reference conditioning. Provide up to 3 items. Public `https` URLs are recommended; compatible models also accept inline `data:` URLs.
- Optional reference role for models that distinguish between `asset` and `style` references.
- Publicly accessible URL of the source video. Required for `video-to-video` style flows and for motion-control models that combine a subject image with a motion reference video.
- Publicly accessible audio URL for models that support `audio-to-video`.
- Provider-specific task identifier used by some continuation, extension, or derivative flows.
- Model-specific extension start offset used by some `video-extension` flows.
- Model-specific extension multiplier or repeat count used by some `video-extension` flows.
- Video duration in seconds (model-dependent).
- Aspect ratio (for example, `16:9`, `9:16`, or `1:1`).
- Model-dependent output resolution (for example, `720p`, `1080p`, or `4k`).
- Model-dependent audio output toggle. In LemonData, Veo 3 family requests default to `true` when this field is omitted. Other public video models follow their governed default behavior. The camelCase alias `outputAudio` is accepted for compatibility.
- Frames per second (1-120) for models that expose FPS control.
- What to avoid in the generated video.
- Random seed for reproducible generation.
- Prompt adherence strength (0-20) for models that expose CFG-style control.
- Motion intensity (0-1) for models that expose it.
- URL or compatible image input for the first frame in `start-end-to-video`.
- URL or compatible image input for the last frame in `start-end-to-video`.
- Model-specific size tier used by some OpenAI-compatible video models.
- Optional watermark toggle for models that expose it.
- Model-specific effect selector for specialized editing flows.
- A unique identifier for the end-user.
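The fields above can be combined into a small client-side request builder. This is an illustrative sketch: `operation` and its supported values come from the contract above, while field names such as `model`, `prompt`, and `aspect_ratio` are assumed for the example.

```python
# Sketch of a minimal create-request body for this endpoint.
# `operation` values are from the documented contract; `model`,
# `prompt`, and `aspect_ratio` are assumed field names.

SUPPORTED_OPERATIONS = {
    "text-to-video", "image-to-video", "reference-to-video",
    "start-end-to-video", "video-to-video", "video-extension",
    "audio-to-video", "motion-control",
}

def build_video_request(prompt: str, model: str = "sora-2",
                        operation: str = "text-to-video", **extra) -> dict:
    """Build a request body, rejecting unknown operation values early."""
    if operation not in SUPPORTED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    body = {"model": model, "prompt": prompt, "operation": operation}
    body.update(extra)
    return body

body = build_video_request("A red fox running through snow",
                           aspect_ratio="16:9")
```

Passing the operation explicitly, as recommended above, avoids relying on server-side inference for production traffic.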
Compatibility Notes
- Canonical public fields are snake_case: `reference_images`, `reference_image_type`, and `output_audio`.
- For compatibility, LemonData also accepts the camelCase aliases `referenceImages`, `referenceImageType`, and `outputAudio`.
- If `operation` is omitted, LemonData infers it from the supplied inputs. For production traffic, an explicit `operation` is still recommended.
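The alias rules above can be applied client-side before sending a request, so outgoing bodies always use the canonical snake_case spellings. A minimal sketch, covering only the three documented aliases:

```python
# Normalize the documented camelCase aliases to the canonical
# snake_case fields. Anything else passes through unchanged.

ALIASES = {
    "referenceImages": "reference_images",
    "referenceImageType": "reference_image_type",
    "outputAudio": "output_audio",
}

def normalize_fields(body: dict) -> dict:
    out = {}
    for key, value in body.items():
        canonical = ALIASES.get(key, key)
        if canonical in out and canonical != key:
            # Both spellings present: keep the snake_case value.
            continue
        out[canonical] = value
    return out
```

If both spellings appear in one request, this sketch prefers the canonical snake_case value.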
Media Input Best Practices
- Prefer publicly reachable `https` URLs over inline base64 for `image_url`, `reference_images`, `video_url`, and `audio_url`.
- Avoid mixing inline base64 and remote URLs in the same request when possible; one representation per request is easier to reason about and debug.
- If you use signed URLs, keep them valid long enough to cover retries and asynchronous task creation.
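These checks are easy to automate before submitting a request. The following sketch flags fragile media inputs; the audit itself is an illustration, not part of the API, though the field names follow the request contract above.

```python
# Flag media inputs that may be fragile in production: inline base64
# payloads and non-https URLs for the documented media fields.

MEDIA_FIELDS = ("image_url", "reference_images", "video_url", "audio_url")

def audit_media_inputs(body: dict) -> list:
    warnings = []
    for field in MEDIA_FIELDS:
        values = body.get(field)
        if values is None:
            continue
        if isinstance(values, str):
            values = [values]
        for value in values:
            if value.startswith("data:"):
                warnings.append(f"{field}: inline base64 is harder to retry and debug")
            elif not value.startswith("https://"):
                warnings.append(f"{field}: prefer a publicly reachable https URL")
    return warnings
```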
Response
- Canonical async task identifier. Treat this as the same identity as `task_id` when both are present.
- Canonical async task identifier for polling. This is the same task identity used by async status endpoints.
- Preferred polling URL for this task. Use this exact path when checking status.
- Initial status: `pending`.
- Unix timestamp when the task was created.
- Model used.
- Direct video URL when the result is already available.
- Single video payload with `url`, `duration`, `width`, and `height` when available.
- Multiple video payloads when the provider returns more than one output.
- Error message or structured error object when the task fails.
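The documented polling pattern (follow the exact `poll_url`, treat `id` and `task_id` as one identity) can be sketched as follows. The terminal status values `succeeded` and `failed` are assumptions for illustration; the docs above only specify the initial `pending` status. `fetch` is injected so the loop stays transport-agnostic; in production it would perform an authenticated GET against the returned URL.

```python
import time

def task_identity(task: dict) -> str:
    # `id` and `task_id` name the same task identity.
    return task.get("id") or task.get("task_id")

def poll_task(create_response: dict, fetch, interval: float = 2.0,
              max_attempts: int = 60) -> dict:
    """Poll the exact poll_url until a terminal status or timeout.

    Terminal statuses "succeeded"/"failed" are assumed for this sketch.
    """
    poll_url = create_response["poll_url"]  # always use the exact URL
    for _ in range(max_attempts):
        status = fetch(poll_url)
        if status.get("status") in ("succeeded", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_identity(create_response)} did not finish")
```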
Image to Video
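A minimal request body for this flow might look like the following sketch. `operation` and `image_url` follow the contract above; the `model` and `prompt` field names and the example URL are assumptions.

```python
# Hypothetical image-to-video request body. kling-v2.6-pro supports
# image-to-video per the model tables below; the image URL is a placeholder.
body = {
    "model": "kling-v2.6-pro",
    "operation": "image-to-video",
    "prompt": "The subject slowly turns toward the camera",
    "image_url": "https://example.com/subject.jpg",
}
```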
Reference to Video
Use `operation=reference-to-video` when the model supports dedicated reference-image conditioning. For LemonData's public contract, pass reference assets through `reference_images`.
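A sketch of such a request follows. `operation`, `reference_images`, and `reference_image_type` come from the contract above; the `model` and `prompt` field names and the URLs are assumptions.

```python
# Hypothetical reference-to-video request body. viduq2 advertises
# reference-to-video in the tables below; URLs are placeholders.
body = {
    "model": "viduq2",
    "operation": "reference-to-video",
    "prompt": "The character walks through a neon-lit street",
    "reference_images": [            # up to 3 items, https preferred
        "https://example.com/character.png",
        "https://example.com/outfit.png",
    ],
    "reference_image_type": "asset", # or "style", where supported
}
```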
Keyframe Control
Use `start_image` and `end_image` to control the first and last frames.
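For example, a start-end-to-video body could be sketched as below. `start_image`, `end_image`, and `operation` are contract fields; the `model` and `prompt` field names and the URLs are assumptions.

```python
# Hypothetical keyframe-control request body. kling-v3.0-pro lists
# start-end-to-video support in the tables below; URLs are placeholders.
body = {
    "model": "kling-v3.0-pro",
    "operation": "start-end-to-video",
    "prompt": "Smooth transition between the two frames",
    "start_image": "https://example.com/first-frame.jpg",
    "end_image": "https://example.com/last-frame.jpg",
}
```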
Video to Video
Use `operation=video-to-video` when the model accepts an existing video as the primary input.
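A sketch of the request shape, with `operation` and `video_url` from the contract above; the `model` and `prompt` field names and the URL are assumptions.

```python
# Hypothetical video-to-video request body. kling-video-o1-pro lists
# video-to-video support in the tables below; the URL is a placeholder.
body = {
    "model": "kling-video-o1-pro",
    "operation": "video-to-video",
    "prompt": "Restyle the footage as watercolor animation",
    "video_url": "https://example.com/source.mp4",
}
```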
Motion Control
Use `operation=motion-control` when the model expects both a subject image and a motion reference video. LemonData maps the public `image_url` + `video_url` request shape to the upstream motion-control contract.
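The combined shape can be sketched as below. `operation`, `image_url`, and `video_url` come from the contract above; the `model` field name and the URLs are assumptions.

```python
# Hypothetical motion-control request body. kling-3.0-motion-control is
# the motion-control model in the tables below; URLs are placeholders.
body = {
    "model": "kling-3.0-motion-control",
    "operation": "motion-control",
    "image_url": "https://example.com/subject.jpg",     # subject image
    "video_url": "https://example.com/motion-ref.mp4",  # motion reference
}
```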
Audio-to-Video and Video Extension Availability
LemonData's public contract accepts `audio-to-video` and `video-extension` for model-specific flows, but the list of generally enabled public video models in this docs build does not include a broad public model that advertises either capability. Use the Models API or the Models page to confirm current availability before integrating those operations.
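That confirmation step can be automated against a model listing. The response shape below (a list of entries with `id` and `operations`) is an assumption for illustration; adapt it to whatever the Models API actually returns.

```python
# Sketch: filter a Models API style listing for a given operation
# before integrating audio-to-video or video-extension flows.
# The {"id": ..., "operations": [...]} shape is assumed.

def models_supporting(models: list, operation: str) -> list:
    return [m["id"] for m in models
            if operation in m.get("operations", [])]

catalog = [
    {"id": "sora-2", "operations": ["text-to-video", "image-to-video"]},
    {"id": "viduq2", "operations": ["text-to-video", "reference-to-video"]},
]
```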
Currently Enabled Public Video Models
This list is aligned with the current enabled public video model inventory in this docs build. For the freshest state, query the Models API.
OpenAI
| Model | Public operations |
|---|---|
| sora-2 | Text-to-video, image-to-video |
| sora-2-pro | Text-to-video, image-to-video |
| sora-2-pro-storyboard | Image-to-video |
Kuaishou
| Model | Public operations |
|---|---|
| kling-3.0-motion-control | Motion control |
| kling-3.0-video | Text-to-video, image-to-video, start-end-to-video |
| kling-v2.5-turbo-pro | Text-to-video, image-to-video, start-end-to-video |
| kling-v2.5-turbo-std | Text-to-video, image-to-video |
| kling-v2.6-pro | Text-to-video, image-to-video, start-end-to-video |
| kling-v2.6-std | Text-to-video, image-to-video |
| kling-v3.0-pro | Text-to-video, image-to-video, start-end-to-video |
| kling-v3.0-std | Text-to-video, image-to-video, start-end-to-video |
| kling-video-o1-pro | Text-to-video, image-to-video, reference-to-video, start-end-to-video, video-to-video |
| kling-video-o1-std | Text-to-video, image-to-video, reference-to-video, start-end-to-video, video-to-video |
Google
| Model | Public operations |
|---|---|
| veo3 | Text-to-video, image-to-video |
| veo3-fast | Text-to-video, image-to-video |
| veo3-pro | Text-to-video, image-to-video |
| veo3.1 | Text-to-video, image-to-video, reference-to-video, start-end-to-video |
| veo3.1-fast | Text-to-video, image-to-video, reference-to-video, start-end-to-video |
| veo3.1-pro | Text-to-video, image-to-video, start-end-to-video |
ByteDance
| Model | Public operations |
|---|---|
| seedance-1.5-pro | Text-to-video, image-to-video |
MiniMax
| Model | Public operations |
|---|---|
| hailuo-2.3-fast | Image-to-video |
| hailuo-2.3-pro | Text-to-video, image-to-video |
| hailuo-2.3-standard | Text-to-video, image-to-video |
Alibaba
| Model | Public operations |
|---|---|
| wan-2.2-plus | Text-to-video, image-to-video |
| wan-2.5 | Text-to-video, image-to-video |
| wan-2.6 | Text-to-video, image-to-video, reference-to-video |
Shengshu
| Model | Public operations |
|---|---|
| viduq2 | Text-to-video, reference-to-video |
| viduq2-pro | Image-to-video, reference-to-video, start-end-to-video |
| viduq2-pro-fast | Image-to-video, start-end-to-video |
| viduq2-turbo | Image-to-video, start-end-to-video |
| viduq3-pro | Text-to-video, image-to-video, start-end-to-video |
| viduq3-turbo | Text-to-video, image-to-video, start-end-to-video |
xAI
| Model | Public operations |
|---|---|
| grok-imagine-image-to-video | Image-to-video |
| grok-imagine-text-to-video | Text-to-video |
| grok-imagine-upscale | Video-to-video |
Other
| Model | Public operations |
|---|---|
| topaz-video-upscale | Video-to-video |