zsxkib/dia:46ad4a48 | Run with an API on Replicate

Input schema

The fields you can use to run this model with an API. If you omit a field, its default value is used.

Field	Type	Default value (range)	Description
text	string		Input text for dialogue generation. Use [S1] and [S2] to mark different speakers, and put non-verbal cues in parentheses, e.g., (laughs), (whispers).
audio_prompt	string		Optional audio file (.wav/.mp3/.flac) for voice cloning. The model will attempt to mimic this voice style.
max_new_tokens	integer	3072 (min: 500, max: 4096)	Controls the length of the generated audio. Higher values create longer audio (86 tokens ≈ 1 second of audio).
max_audio_prompt_seconds	integer	10 (min: 1, max: 120)	Maximum duration in seconds of the voice cloning audio prompt. Only used when an audio prompt is provided. Longer voice samples are truncated to this length.
cfg_scale	number	3 (min: 1, max: 5)	Controls how closely the audio follows your text. Higher values (3-5) follow the text more strictly; lower values may sound more natural but deviate more from the text.
temperature	number	1.3 (max: 2)	Controls randomness in generation. Higher values (1.3-2.0) increase variety; lower values make output more consistent. Set to 0 for deterministic (greedy) generation.
top_p	number	0.95 (min: 0.1, max: 1)	Controls diversity of word choice. Higher values include more unusual options. Most users shouldn't need to adjust this parameter.
cfg_filter_top_k	integer	35 (min: 10, max: 100)	Technical parameter for filtering audio generation tokens. Higher values allow more diverse sounds; lower values create more consistent audio.
speed_factor	number	0.94 (min: 0.5, max: 1.5)	Adjusts the playback speed of the generated audio. Values below 1.0 slow down the audio; 1.0 is the original speed.
seed	integer		Random seed for reproducible results. Use the same seed with identical inputs to get the same output. Leave blank for random results each time.
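
The ranges and the token-to-duration rule in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model or of any Replicate client: the helper names `tokens_for_seconds` and `build_input` are hypothetical, the bounds are copied from the table above, and the duration estimate ignores the effect of `speed_factor`.

```python
import math

# (default, minimum, maximum) copied from the input schema table.
# None means the table states no bound for that side.
SCHEMA = {
    "max_new_tokens": (3072, 500, 4096),
    "max_audio_prompt_seconds": (10, 1, 120),
    "cfg_scale": (3, 1, 5),
    "temperature": (1.3, None, 2),
    "top_p": (0.95, 0.1, 1),
    "cfg_filter_top_k": (35, 10, 100),
    "speed_factor": (0.94, 0.5, 1.5),
}

# Approximate figure from the max_new_tokens description.
TOKENS_PER_SECOND = 86


def tokens_for_seconds(seconds):
    """Estimate max_new_tokens for a target duration, clamped to the schema range."""
    _, lo, hi = SCHEMA["max_new_tokens"]
    return max(lo, min(hi, math.ceil(seconds * TOKENS_PER_SECOND)))


def build_input(text, **overrides):
    """Build an input payload; omitted fields are left out so the defaults apply."""
    payload = {"text": text}
    for name, value in overrides.items():
        if name in SCHEMA:
            _, lo, hi = SCHEMA[name]
            if (lo is not None and value < lo) or (hi is not None and value > hi):
                raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
        elif name not in ("audio_prompt", "seed"):
            raise ValueError(f"unknown field: {name}")
        payload[name] = value
    return payload


# Illustrative values only: roughly 20 seconds of two-speaker dialogue.
payload = build_input(
    "[S1] Hello there. [S2] Hi! (laughs)",
    max_new_tokens=tokens_for_seconds(20),
    seed=42,
)
print(payload)
```

The resulting dictionary is meant to be sent as the input of a prediction request. The request itself is left out here because this page shows only a shortened version identifier.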

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{"format": "uri", "title": "Output", "type": "string"}
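
Since the output is a single string in URI format, a client can sanity-check it before trying to fetch the audio. This is a small sketch under two assumptions: that the URI is an absolute URL with a host (for example `https://...`), and that the last path segment is a usable file name. The function name `parse_output` and the example URL are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def parse_output(output):
    """Check that the output looks like an absolute URL; return (scheme, filename)."""
    if not isinstance(output, str):
        raise TypeError("expected a string, per the output schema")
    parts = urlparse(output)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {output!r}")
    # Fall back to a generic name when the path has no final segment.
    filename = PurePosixPath(parts.path).name or "output"
    return parts.scheme, filename


# Hypothetical value; real URLs are issued by the service.
print(parse_output("https://example.com/files/abc123/output.wav"))
```

Downloading the file is left to whichever HTTP client the caller already uses.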