Input schema
The fields you can use to run this model with an API. If you don't provide a value for a field, its default value is used.
Field | Type | Default value | Description |
---|---|---|---|
text | string | | Text to convert to speech. Required in both modes. |
mode | string (enum) | voice_creation | TTS mode. Options: voice_cloning, voice_creation. Voice cloning requires a prompt audio file whose voice is mimicked; voice creation generates speech from the specified gender, pitch, and speed parameters. |
prompt_speech_path | string | | [Voice Cloning] Path to the prompt audio file. Required in voice cloning mode. |
prompt_text | string | | [Voice Cloning] Transcript of the prompt audio. Optional, but providing it improves quality. |
gender | string (enum) | female | [Voice Creation] Voice gender. Options: male, female. Required in voice creation mode. |
pitch | string (enum) | moderate | [Voice Creation] Voice pitch level. Options: very_low, low, moderate, high, very_high. Required in voice creation mode. |
speed | string (enum) | moderate | [Voice Creation] Speaking speed. Options: very_low, low, moderate, high, very_high. Required in voice creation mode. |
temperature | number | 0.8 | Sampling temperature (0.0-1.0). Controls randomness in generation. |
top_k | integer | 50 | Top-k sampling parameter. Limits token selection to the top k options. |
top_p | number | 0.95 | Top-p sampling parameter. Nucleus sampling probability threshold. |
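The mode-dependent requirements in the table can be easier to follow as code. The sketch below is a hypothetical helper, `build_input`, that assembles an input dictionary from the documented fields and defaults and rejects combinations the table marks as invalid. The helper name is not part of the API. Two choices are assumptions: omitting the voice-creation fields in cloning mode, and the exact error messages. The schema does not say what form `prompt_speech_path` must take, so it is passed through as a plain string.

```python
from typing import Any, Dict, Optional

# Enum values copied from the schema table.
MODES = {"voice_cloning", "voice_creation"}
GENDERS = {"male", "female"}
LEVELS = {"very_low", "low", "moderate", "high", "very_high"}


def build_input(
    text: str,
    mode: str = "voice_creation",
    prompt_speech_path: Optional[str] = None,
    prompt_text: Optional[str] = None,
    gender: str = "female",
    pitch: str = "moderate",
    speed: str = "moderate",
    temperature: float = 0.8,
    top_k: int = 50,
    top_p: float = 0.95,
) -> Dict[str, Any]:
    """Hypothetical helper: build an input dict that follows the schema table."""
    if not text:
        raise ValueError("text is required in both modes")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be within 0.0-1.0")

    payload: Dict[str, Any] = {
        "text": text,
        "mode": mode,
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p,
    }

    if mode == "voice_cloning":
        # The prompt audio is required here; the transcript is optional.
        if not prompt_speech_path:
            raise ValueError("prompt_speech_path is required in voice_cloning mode")
        payload["prompt_speech_path"] = prompt_speech_path
        if prompt_text:
            payload["prompt_text"] = prompt_text
        # Assumption: gender/pitch/speed are left out in cloning mode.
    else:
        if gender not in GENDERS:
            raise ValueError(f"gender must be one of {sorted(GENDERS)}")
        if pitch not in LEVELS or speed not in LEVELS:
            raise ValueError(f"pitch and speed must be one of {sorted(LEVELS)}")
        payload.update(gender=gender, pitch=pitch, speed=speed)

    return payload
```

Calling `build_input("Hello")` yields a voice-creation payload with the table's defaults, while `build_input("Hello", mode="voice_cloning", prompt_speech_path="ref.wav")` yields a cloning payload; `ref.wav` is a placeholder value.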
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{"format": "uri", "title": "Output", "type": "string"}
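Since the output is a single string in URI format, a client will usually want to check it and pick a local filename before downloading. The following is a minimal sketch using only the standard library. The helper name `output_filename`, the restriction to `http`/`https` schemes, and the fallback name `output.wav` are all assumptions; the schema does not state the audio format or the URI scheme.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(output: str, default: str = "output.wav") -> str:
    """Hypothetical helper: validate the output URI and derive a local filename."""
    parsed = urlparse(output)
    # Assumption: the returned URI is an http(s) link to the generated audio.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an http(s) URI, got: {output!r}")
    # Use the last path segment as the filename, or the fallback if there is none.
    name = PurePosixPath(parsed.path).name
    return name or default
```

With a validated URI, the file could then be fetched with, for example, `urllib.request.urlretrieve(output, output_filename(output))`.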