Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
audio_file | string | | Audio file (Input option #1) |
audio_url | string | | Direct audio url. (Input option #2) |
file_extension | string | .wav | Extension of the audio file (if audio_url is used) |
batch_size | integer | 32 | Parallelization of input audio transcription |
task | string | transcribe | Task: transcribe or translate |
language | string | | Original language of the audio (reduces hallucinations). Leave empty to detect automatically |
only_text | boolean | False | Set if you only want to return text; otherwise, segment metadata will be returned as well. |
align_output | boolean | False | Use if you need word-level timing and not just batched transcription |
diarize | boolean | False | Diarize the result |
debug | boolean | False | Debugging purposes |
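As an illustration of how the fields and defaults above combine, here is a minimal Python sketch that assembles an input payload. The helper name `build_input` and the `DEFAULTS` dictionary are hypothetical and not part of the model's API. The rule that at least one of `audio_file` or `audio_url` must be supplied is an assumption inferred from the "Input option #1/#2" descriptions. No network call is made.

```python
from typing import Any, Dict, Optional

# Default values copied from the input schema table.
# "language" has no default: leaving it out requests automatic detection.
DEFAULTS: Dict[str, Any] = {
    "file_extension": ".wav",
    "batch_size": 32,
    "task": "transcribe",
    "only_text": False,
    "align_output": False,
    "diarize": False,
    "debug": False,
}

# Fields that may be overridden by the caller (hypothetical helper logic).
ALLOWED_OVERRIDES = set(DEFAULTS) | {"language"}


def build_input(
    audio_file: Optional[str] = None,
    audio_url: Optional[str] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Build an input payload, filling in schema defaults for omitted fields."""
    # Assumption: the model needs at least one of the two audio inputs.
    if not audio_file and not audio_url:
        raise ValueError("provide audio_file or audio_url")

    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = {**DEFAULTS, **overrides}

    # The schema lists exactly two tasks.
    if payload["task"] not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")

    if audio_file:
        payload["audio_file"] = audio_file
    if audio_url:
        payload["audio_url"] = audio_url
    return payload


if __name__ == "__main__":
    # example.com is a placeholder URL, not a real audio file.
    print(build_input(audio_url="https://example.com/talk.mp3",
                      file_extension=".mp3", language="en"))
```

The resulting dictionary can then be passed as the input to whichever API client you use to run the model. Sending the defaults explicitly is optional, since omitted fields fall back to their default values anyway.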
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'title': 'Output', 'type': 'string'}
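
The schema declares the output as a plain string, while the `only_text` description says segment metadata is returned when that flag is off. How that metadata is encoded inside the string is not stated here. The sketch below assumes it may arrive as JSON or as a Python-literal string and falls back to returning the raw text otherwise. `parse_output` is a hypothetical helper, not part of the model's API.

```python
import ast
import json
from typing import Any


def parse_output(raw: str) -> Any:
    """Return a dict or list if the output string encodes one, else the string."""
    # Assumption: structured output, if any, is JSON or a Python literal.
    for loader in (json.loads, ast.literal_eval):
        try:
            value = loader(raw)
        except (ValueError, SyntaxError):
            continue
        # Only accept containers, so a transcript like "123" stays text.
        if isinstance(value, (dict, list)):
            return value
    return raw


if __name__ == "__main__":
    print(parse_output('{"segments": []}'))
    print(parse_output(" Hello there."))
```

With `only_text` set to true the fallback branch is the one expected to apply, and the function simply hands back the transcript.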