You're looking at a specific version of this model.

haoheliu/audio-ldm:b61392ad

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.

| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| text | string | | Text prompt from which to generate audio. |
| duration | string (enum) | 5.0 | Duration of the generated audio, in seconds. Options: 2.5, 5.0, 7.5, 10.0, 12.5, 15.0, 17.5, 20.0. Longer durations may run out of memory (OOM). |
| guidance_scale | number | 2.5 | Guidance scale for the model. A larger scale gives better quality and relevance to the text; a smaller scale gives more diversity. |
| random_seed | integer | | Random seed for the model (optional). |
| n_candidates | integer | 3 | Number of candidate audio clips to generate; the best one is returned. |
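
The fields above can be assembled into an input payload before calling the API. The following is a minimal sketch, not the model's own client code: `build_input` and `ALLOWED_DURATIONS` are hypothetical names, and the checks that `text` is non-empty and that `n_candidates` is at least 1 are assumptions, since the schema above does not state them. Note that `duration` is a string enum, so its values are passed as strings.

```python
from typing import Optional

# Allowed values for the `duration` enum, copied from the input schema.
ALLOWED_DURATIONS = ("2.5", "5.0", "7.5", "10.0", "12.5", "15.0", "17.5", "20.0")


def build_input(
    text: str,
    duration: str = "5.0",
    guidance_scale: float = 2.5,
    random_seed: Optional[int] = None,
    n_candidates: int = 3,
) -> dict:
    """Build an input payload using the documented defaults."""
    # Assumption: a prompt is required (the schema lists no default for `text`).
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    # Assumption: at least one candidate must be generated.
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")

    payload = {
        "text": text,
        "duration": duration,
        "guidance_scale": float(guidance_scale),
        "n_candidates": int(n_candidates),
    }
    # random_seed is optional, so it is only sent when provided.
    if random_seed is not None:
        payload["random_seed"] = int(random_seed)
    return payload
```

The resulting dictionary could then be passed as the `input` argument of whichever API client you use, for example the `input=` keyword of `replicate.run` in the Replicate Python client, together with this page's model version reference.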

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"format": "uri", "title": "Output", "type": "string"}
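
Because the output is a single string holding a URI, a client usually needs to check it and then fetch the file it points to. The sketch below shows one way to do that with the Python standard library. It assumes the URI is an HTTP(S) URL; `output_filename`, `save_output`, and the `output.wav` fallback name are hypothetical, and the schema does not specify the audio file format.

```python
import shutil
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse
from urllib.request import urlopen


def output_filename(output_uri: str, fallback: str = "output.wav") -> str:
    """Validate the output URI and derive a local file name from its path."""
    parsed = urlparse(output_uri)
    # Assumption: the model returns an http(s) URL.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid http(s) URI: {output_uri!r}")
    name = PurePosixPath(parsed.path).name
    # Fall back to a default name when the URI path has no file component.
    return name or fallback


def save_output(output_uri: str, dest: Optional[str] = None) -> str:
    """Download the file behind the output URI and return the local path."""
    dest = dest or output_filename(output_uri)
    with urlopen(output_uri) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```

For example, `output_filename("https://example.com/files/out.wav")` returns `"out.wav"`; the URL here is a placeholder, not a real model output.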