turian/insanely-fast-whisper-with-video:3f035e6e


The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.

Field	Type	Default value	Description
audio	string		Audio file. Either this or url must be provided.
url	string		Video URL for yt-dlp to download the audio from. Either this or audio must be provided.
task	None	transcribe	Task to perform: transcribe, or translate to another language.
language	string		Optional. Language spoken in the audio; specify None to perform language detection.
batch_size	integer	24	Number of parallel batches to compute. Reduce this if you run out of memory (OOM).
timestamp	None	chunk	Timestamp granularity. Whisper supports both chunk-level and word-level timestamps.
diarise_audio	boolean	False	Use Pyannote.audio to diarise the audio clips. You will need to provide hf_token below too.
hf_token	string		Hugging Face access token (create one at hf.co/settings/token) that Pyannote.audio uses to diarise the audio clips. You must first accept the terms at 'https://huggingface.co/pyannote/speaker-diarization-3.1' and 'https://huggingface.co/pyannote/segmentation-3.0'.
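
The constraints in the table above (one of audio or url is required, and diarise_audio needs hf_token) can be checked before a request is sent. The following is a minimal sketch, not part of the model's own code: build_input is a hypothetical helper name, and the checks are only an interpretation of the field descriptions. Fields left as None are dropped so that the API falls back to its own defaults.

```python
from typing import Any, Optional


def build_input(
    audio: Optional[str] = None,
    url: Optional[str] = None,
    task: str = "transcribe",
    language: Optional[str] = None,
    batch_size: int = 24,
    timestamp: str = "chunk",
    diarise_audio: bool = False,
    hf_token: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble and sanity-check the input fields."""
    # The table says either `audio` or `url` must be provided.
    if not audio and not url:
        raise ValueError("Provide either 'audio' or 'url'.")
    # The table lists two tasks: transcribe or translate.
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'.")
    # Assumption: a batch size below 1 is never meaningful.
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer.")
    # Diarisation needs a Hugging Face token, per the table.
    if diarise_audio and not hf_token:
        raise ValueError("diarise_audio=True requires hf_token.")

    payload = {
        "audio": audio,
        "url": url,
        "task": task,
        "language": language,
        "batch_size": batch_size,
        "timestamp": timestamp,
        "diarise_audio": diarise_audio,
        "hf_token": hf_token,
    }
    # Drop unset fields so the service applies its own defaults.
    return {key: value for key, value in payload.items() if value is not None}


if __name__ == "__main__":
    # Example URL is a placeholder, not a real video.
    print(build_input(url="https://example.com/video", batch_size=8))
```

The resulting dictionary is meant to be passed as the input of an API call, for example as the input argument of replicate.run in the Replicate Python client, together with the full model version identifier (the one in the heading above is abbreviated, so it is not reproduced here).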

The shape of the response you’ll get when you run this model with an API.

Schema

{'title': 'Output'}