Overview
LemonData provides access to video generation models through a single unified API. Video generation is asynchronous: submit a request, receive a task ID and poll_url, then poll for the final result.
The model inventory changes over time. For the latest public availability, use the Models API or visit the Models page.
If a create response returns poll_url, call that exact URL. When it points to /v1/tasks/{id}, treat that as the canonical fixed status endpoint.
Audio behavior is model-dependent. In LemonData, Veo 3 family requests default to audio-on when output_audio is omitted. Some public models are silent-only or do not expose a stable toggle.
For production integrations, prefer publicly reachable https URLs over inline base64 for images, videos, and audio. Inline data: URLs are still supported by compatible models, but URLs are easier to retry, inspect, and debug.
Async Workflow
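The create → task ID + poll_url → poll loop described above can be sketched as a generic polling helper. This is a minimal sketch: the `status` field name and the terminal values `"succeeded"` and `"failed"` are assumptions, so check them against your actual task payloads.

```python
import time


def poll_until_done(fetch_status, interval=5.0, timeout=600.0):
    """Poll a task until it reaches a terminal state.

    fetch_status: zero-argument callable returning the task payload as a dict,
    e.g. lambda: requests.get(poll_url, headers=headers).json().
    The "status" field and "succeeded"/"failed" values are assumed names.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        task = fetch_status()
        if task.get("status") in ("succeeded", "failed"):
            return task
        time.sleep(interval)
    raise TimeoutError("task did not finish within the timeout")
```

Injecting the fetch callable keeps the retry logic separate from HTTP details, which also makes the loop easy to test without network access.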
Public Operations
LemonData’s current public video contract centers on these operations:
text-to-video
image-to-video
reference-to-video
start-end-to-video
video-to-video
motion-control
The request contract also accepts audio-to-video and video-extension for model-specific flows, but no generally enabled public model in this docs build currently advertises either capability.
Capability Matrix
Legend: ✅ Supported by at least one currently enabled public model in that provider family | ❌ Not currently represented by an enabled public model
| Series | T2V | I2V | Reference | Start-End | V2V | Motion |
|---|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Kuaishou | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Google | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| ByteDance | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| MiniMax | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Alibaba | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Shengshu | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| xAI | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ |
| Other | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ |
Capability Definitions
T2V (Text-to-Video): Generate video from a text prompt.
I2V (Image-to-Video): Animate a starting image. For the broadest compatibility, provide image_url.
Reference: Condition generation on one or more reference images via reference_images.
Start-End: Control the first and last frames with start_image and end_image.
V2V (Video-to-Video): Use an existing video as the primary source input.
Motion: Combine a subject image with a motion reference video.
Current Public Model Inventory
OpenAI
| Model | Public operations |
|---|---|
| sora-2 | Text-to-video, image-to-video |
| sora-2-pro | Text-to-video, image-to-video |
| sora-2-pro-storyboard | Image-to-video |
Kuaishou
| Model | Public operations |
|---|---|
| kling-3.0-motion-control | Motion control |
| kling-3.0-video | Text-to-video, image-to-video, start-end-to-video |
| kling-v2.5-turbo-pro | Text-to-video, image-to-video, start-end-to-video |
| kling-v2.5-turbo-std | Text-to-video, image-to-video |
| kling-v2.6-pro | Text-to-video, image-to-video, start-end-to-video |
| kling-v2.6-std | Text-to-video, image-to-video |
| kling-v3.0-pro | Text-to-video, image-to-video, start-end-to-video |
| kling-v3.0-std | Text-to-video, image-to-video, start-end-to-video |
| kling-video-o1-pro | Text-to-video, image-to-video, reference-to-video, start-end-to-video, video-to-video |
| kling-video-o1-std | Text-to-video, image-to-video, reference-to-video, start-end-to-video, video-to-video |
Google
| Model | Public operations |
|---|---|
| veo3 | Text-to-video, image-to-video |
| veo3-fast | Text-to-video, image-to-video |
| veo3-pro | Text-to-video, image-to-video |
| veo3.1 | Text-to-video, image-to-video, reference-to-video, start-end-to-video |
| veo3.1-fast | Text-to-video, image-to-video, reference-to-video, start-end-to-video |
| veo3.1-pro | Text-to-video, image-to-video, start-end-to-video |
ByteDance
| Model | Public operations |
|---|---|
| seedance-1.5-pro | Text-to-video, image-to-video |
MiniMax
| Model | Public operations |
|---|---|
| hailuo-2.3-fast | Image-to-video |
| hailuo-2.3-pro | Text-to-video, image-to-video |
| hailuo-2.3-standard | Text-to-video, image-to-video |
Alibaba
| Model | Public operations |
|---|---|
| wan-2.2-plus | Text-to-video, image-to-video |
| wan-2.5 | Text-to-video, image-to-video |
| wan-2.6 | Text-to-video, image-to-video, reference-to-video |
Shengshu
| Model | Public operations |
|---|---|
| viduq2 | Text-to-video, reference-to-video |
| viduq2-pro | Image-to-video, reference-to-video, start-end-to-video |
| viduq2-pro-fast | Image-to-video, start-end-to-video |
| viduq2-turbo | Image-to-video, start-end-to-video |
| viduq3-pro | Text-to-video, image-to-video, start-end-to-video |
| viduq3-turbo | Text-to-video, image-to-video, start-end-to-video |
xAI
| Model | Public operations |
|---|---|
| grok-imagine-image-to-video | Image-to-video |
| grok-imagine-text-to-video | Text-to-video |
| grok-imagine-upscale | Video-to-video |
Other
| Model | Public operations |
|---|---|
| topaz-video-upscale | Video-to-video |
Usage Examples
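The examples below assume `requests` is installed and that `BASE` and `headers` are already defined. A minimal setup sketch; the environment-variable name and the base URL here are placeholders, not official values:

```python
import os

# The usage examples below also rely on the third-party "requests" package:
#   pip install requests
# then: import requests

API_KEY = os.environ.get("LEMONDATA_API_KEY", "YOUR_API_KEY")  # assumed variable name
BASE = "https://api.example.com/v1"  # placeholder base URL; use your real endpoint

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```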
Text-to-Video
response = requests.post(
    f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "sora-2",
        "prompt": "A calm cinematic shot of a cat walking through a sunlit garden.",
        "operation": "text-to-video",
        "duration": 4,
        "aspect_ratio": "16:9",
    },
)
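Per the Overview, the create response carries a task ID and a poll_url, and the exact poll_url should be called when present. A hedged sketch of picking the poll target (the `id` and `poll_url` field names come from the Overview; the fallback path shape mirrors the documented /v1/tasks/{id} endpoint):

```python
def extract_poll_target(create_response: dict) -> str:
    """Return the URL to poll for task status.

    Prefer the exact poll_url from the create response; fall back to the
    /v1/tasks/{id} style status path when poll_url is absent.
    """
    poll_url = create_response.get("poll_url")
    if poll_url:
        return poll_url
    task_id = create_response["id"]
    return f"/v1/tasks/{task_id}"
```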
Image-to-Video
response = requests.post(
    f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "hailuo-2.3-standard",
        "prompt": "The scene begins from the provided image and adds gentle natural motion.",
        "operation": "image-to-video",
        "image_url": "https://example.com/portrait.jpg",
        "duration": 6,
        "aspect_ratio": "16:9",
    },
)
Reference-to-Video
response = requests.post(
    f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "veo3.1",
        "prompt": "Keep the same subject identity and palette while adding subtle motion.",
        "operation": "reference-to-video",
        "reference_images": [
            "https://example.com/ref-a.jpg",
            "https://example.com/ref-b.jpg",
        ],
        "duration": 8,
        "resolution": "720p",
        "aspect_ratio": "9:16",
    },
)
Start-End-to-Video
response = requests.post(
    f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "viduq2-pro",
        "prompt": "Smooth transition from day to night.",
        "operation": "start-end-to-video",
        "start_image": "https://example.com/city-day.jpg",
        "end_image": "https://example.com/city-night.jpg",
        "duration": 5,
        "resolution": "720p",
        "aspect_ratio": "16:9",
    },
)
Video-to-Video
response = requests.post(
    f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "topaz-video-upscale",
        "operation": "video-to-video",
        "video_url": "https://example.com/source.mp4",
        "prompt": "Upscale this clip while preserving the original motion.",
    },
)
Motion Control
response = requests.post(
    f"{BASE}/videos/generations",
    headers=headers,
    json={
        "model": "kling-3.0-motion-control",
        "operation": "motion-control",
        "prompt": "Keep the subject stable while following the motion reference.",
        "image_url": "https://example.com/subject.png",
        "video_url": "https://example.com/motion.mp4",
        "resolution": "720p",
    },
)
Parameters Reference
| Parameter | Type | Notes |
|---|---|---|
| operation | string | Explicit operation is recommended in production. |
| image_url | string | Preferred image input form for broad cross-model compatibility. |
| image | string | Inline data URL; useful for debugging and small local integrations. |
| reference_images | string[] | Canonical public field for reference-image conditioning. |
| reference_image_type | string | Optional asset/style selector when supported. |
| video_url | string | Required for current public video-to-video and motion-control models. |
| audio_url | string | Used by model-specific audio-conditioned flows when available. |
| output_audio | boolean | Veo 3 family defaults to true when omitted. |
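For the inline image field, the value is a data: URL built from the raw image bytes. A small helper sketch (the default MIME type here is an assumption; match it to your actual file format):

```python
import base64


def to_data_url(raw: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as an inline data: URL for the image field."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

As noted above, prefer publicly reachable URLs in production; inline data URLs are best kept for debugging and small local integrations.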
Model Selection Guide
Best Quality
veo3.1-pro, kling-video-o1-pro, and viduq3-pro are strong choices when fidelity matters more than speed.
Fastest Public Options
veo3.1-fast, hailuo-2.3-fast, and viduq3-turbo are good starting points for faster iteration.
Reference-Heavy Flows
Use veo3.1, veo3.1-fast, wan-2.6, or kling-video-o1-pro/std when you need dedicated reference-image conditioning.
Video-to-Video
topaz-video-upscale, grok-imagine-upscale, and kling-video-o1-pro/std cover the current generally enabled public video-to-video paths.
Billing
Billing is model-dependent. Some public video models are effectively priced per request, while others are priced per second. Check the Models page or the Pricing API for the current public price surface.