Create Translation

curl -X POST "https://api.lemondata.cc/v1/audio/translations" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "file=@german_audio.mp3" \
  -F "model=whisper-1"

{
  "text": "Hello, my name is Wolfgang and I come from Germany. Where are you from?"
}

Overview

Translates audio in any supported language into English text. Unlike transcription, this endpoint always outputs English text regardless of the input language.

This page documents audio translation (POST /v1/audio/translations). For text translation, use POST /v1/translations.

Do not use recommended_for=translation for this endpoint. That recommended_for value is reserved for text translation models on POST /v1/translations.

Request Body

Synchronous request timeout: This non-chat endpoint waits for the routed model to finish. Large inputs, long audio, or large batches can exceed common 30s client defaults, so set your HTTP client timeout to at least 120s.
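As a rough illustration of the timeout advice above, the following Python sketch uploads a file with an explicit 120-second timeout using only the standard library. The helper names build_multipart and translate_audio are hypothetical and not part of any LemonData SDK; the URL, header, and form fields mirror the curl example on this page.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://api.lemondata.cc/v1/audio/translations"
TIMEOUT_SECONDS = 120  # the minimum suggested above; raise it for long audio


def build_multipart(fields, file_path):
    """Encode form fields plus one file as multipart/form-data.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = uuid.uuid4().hex
    path = Path(file_path)
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode() + str(value).encode() + b"\r\n")
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f'Content-Type: {mime}\r\n\r\n'
    )
    parts.append(file_header.encode() + path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def translate_audio(file_path, api_key, timeout=TIMEOUT_SECONDS):
    """POST the file and return the translated text (default json format)."""
    body, content_type = build_multipart({"model": "whisper-1"}, file_path)
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
    # The timeout is passed explicitly so the call does not inherit a short default.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

A call such as translate_audio("german_audio.mp3", "sk-your-api-key") would then block for up to 120 seconds before raising a timeout error.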

file

required

The audio file to translate. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. Maximum file size is 25 MB.

model

string

default:"whisper-1"

The model to use. Currently only whisper-1 is supported.

prompt

string

Optional text to guide the model’s style or to continue a previous audio segment. The prompt should be in English.

response_format

string

default:"json"

The format of the output. Options: json, text, srt, verbose_json, vtt.

temperature

number

The sampling temperature, between 0 and 1. Higher values like 0.8 produce more random output, while lower values like 0.2 make output more focused and deterministic.
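The constraints listed above can be checked on the client before any upload. The sketch below is one possible pre-flight check, not an official client: build_form_fields is a hypothetical helper, and reading "25 MB" as 25 * 1024 * 1024 bytes is an assumption, since the page does not define the unit.

```python
from pathlib import Path

# Values copied from the parameter descriptions above.
SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 25 * 1024 * 1024


def build_form_fields(file_path, model="whisper-1", prompt=None,
                      response_format="json", temperature=None):
    """Validate the arguments locally and return the non-file form fields."""
    path = Path(file_path)
    extension = path.suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {extension!r}")
    if path.stat().st_size > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    fields = {"model": model, "response_format": response_format}
    # Optional parameters are only sent when the caller sets them.
    if prompt is not None:
        fields["prompt"] = prompt
    if temperature is not None:
        fields["temperature"] = str(temperature)
    return fields
```

Failing fast on the client avoids spending a long synchronous request on input the endpoint would reject anyway.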

Response

text

string

The translated text in English.

For verbose_json format, the response also includes:

language

string

The detected language of the input audio.

duration

number

The duration of the input audio in seconds.

segments

array

Segments of the translated text with timestamps.
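To show how the verbose_json fields might be consumed, here is a small Python sketch that pulls them out of a response body. The function name summarize_translation is hypothetical, and the start, end, and text keys inside each segment are an assumption borrowed from common Whisper-style responses; this page does not document the segment schema, so the code reads those keys defensively.

```python
import json


def summarize_translation(raw_body):
    """Extract the documented verbose_json fields from a response body."""
    payload = json.loads(raw_body)
    summary = {
        "text": payload["text"],              # always present
        "language": payload.get("language"),  # verbose_json only
        "duration": payload.get("duration"),  # verbose_json only, in seconds
        "lines": [],
    }
    for segment in payload.get("segments", []):
        # Assumed keys: "start", "end", "text". Missing keys yield None or "".
        summary["lines"].append(
            (segment.get("start"), segment.get("end"), segment.get("text", "").strip())
        )
    return summary
```

Because the optional fields are read with get, the same function also works on a plain json response, where only text is present.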

Translation vs Transcription

Feature	Translation	Transcription
Output language	Always English	Same as input
Use case	Convert foreign audio to English	Preserve original language
Language parameter	Not applicable	Optional hint

The translation endpoint automatically detects the source language and translates to English. The language parameter from transcription is ignored.
