Overview
Translates audio in any supported language into English text. Unlike transcription, this endpoint always outputs English text regardless of the input language.
This page documents audio translation (POST /v1/audio/translations). For text translation, use POST /v1/translations.
Do not use recommended_for=translation for this endpoint. That recommendation scene is reserved for text translation models on POST /v1/translations.
Request Body
Synchronous request timeout: This non-chat endpoint waits for the routed model to finish. Large inputs, long audio, or large batches can exceed common 30s client defaults, so set your HTTP client timeout to at least 120s.
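
A minimal sketch of enforcing that timeout floor with Python's standard library. The helper name `send_with_timeout` and its injectable `opener` parameter are illustrative and not part of the API; the only value taken from this page is the 120-second minimum.

```python
import urllib.request

MIN_TIMEOUT_SECONDS = 120.0  # minimum client timeout recommended on this page


def send_with_timeout(request, timeout=MIN_TIMEOUT_SECONDS, opener=urllib.request.urlopen):
    """Send a prepared request, never waiting less than the recommended floor.

    `opener` defaults to urllib.request.urlopen and is injectable so the
    helper can be exercised without network access.
    """
    # Clamp short timeouts (such as a 30s default) up to the floor.
    effective = max(float(timeout), MIN_TIMEOUT_SECONDS)
    with opener(request, timeout=effective) as response:
        return response.read()
```

Passing a larger value, for example `timeout=300`, is left untouched; only values below the floor are raised.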
The audio file to translate. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Maximum file size is 25 MB.
The model to use. Currently only whisper-1 is supported.
An optional text to guide the model’s style or continue a previous segment. Should be in English.
The format of the output. Options: json, text, srt, verbose_json, vtt.
The sampling temperature, between 0 and 1. Higher values like 0.8 produce more random output, while lower values like 0.2 make output more focused and deterministic.
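
The limits above can be checked on the client before uploading. The sketch below is one possible approach, not an official client. The form field names used in the usage note (`file`, `model`, `response_format`) are assumptions based on the common OpenAI-compatible convention, and the helper names are hypothetical. It also assumes "25 MB" means 25 × 1024 × 1024 bytes.

```python
import os
import uuid

SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" interpreted as 25 MiB
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def validate_request(filename, size_bytes, response_format="json", temperature=0.0):
    """Raise ValueError if the inputs violate the limits listed on this page."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("audio file exceeds 25 MB")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response format: {response_format}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")


def build_multipart(fields, filename, audio_bytes):
    """Encode text fields plus one file part as multipart/form-data.

    Returns (body_bytes, content_type_header_value). The file part is sent
    under the assumed field name "file".
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'.encode()
    )
    parts.append(audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())  # closing boundary
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

For example, `build_multipart({"model": "whisper-1", "response_format": "json"}, "clip.mp3", data)` returns a body and a Content-Type value that can be sent with any HTTP client in a POST to /v1/audio/translations.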
Response
The translated text in English.
With the verbose_json format, the response also includes:
The detected language of the input audio.
The duration of the input audio in seconds.
Segments of the translated text with timestamps.
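
A sketch of reading the response in client code. The JSON key names (`text`, `language`, `duration`, `segments`) and the per-segment keys (`start`, `end`, `text`) are assumptions based on the common OpenAI-compatible shape; this page describes the fields but the names are not confirmed here. The sketch also assumes that the text, srt, and vtt formats come back as plain text rather than JSON. The function name is hypothetical.

```python
import json


def summarize_translation(raw_body, response_format="json"):
    """Return a dict with the translated text and any verbose metadata.

    All key names used here are assumptions, not confirmed by this page.
    """
    if response_format in ("text", "srt", "vtt"):
        # Assumed to be plain-text bodies, so pass them through unchanged.
        return {"text": raw_body}
    payload = json.loads(raw_body)
    summary = {"text": payload.get("text", "")}
    if response_format == "verbose_json":
        summary["language"] = payload.get("language")
        summary["duration"] = payload.get("duration")
        # Reduce each segment to a (start, end, text) tuple.
        summary["segments"] = [
            (seg.get("start"), seg.get("end"), seg.get("text", ""))
            for seg in payload.get("segments", [])
        ]
    return summary
```

Using `.get` throughout keeps the helper tolerant of a response that omits one of the assumed keys.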
Translation vs Transcription
| Feature | Translation | Transcription |
|---|---|---|
| Output language | Always English | Same as input |
| Use case | Convert foreign audio to English | Preserve original language |
| Language parameter | Not applicable | Optional hint |
The translation endpoint automatically detects the source language and translates to English. The language parameter from transcription is ignored.
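
Code shared between transcription and translation calls may carry a language hint by habit. Since the translation endpoint ignores it, one option is to drop it when building the form fields. The helper name and the field names other than `language` are hypothetical.

```python
def translation_fields(model="whisper-1", **options):
    """Build form fields for a translation request, dropping any language hint."""
    fields = {"model": model}
    for name, value in options.items():
        if name == "language":
            continue  # ignored by the translation endpoint, so do not send it
        if value is not None:
            fields[name] = str(value)  # form fields are sent as strings
    return fields
```

Calling `translation_fields(language="de", response_format="srt")` yields only the model and response format fields.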