zsxkib/realistic-voice-cloning:ab6f63c8
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
Field | Type | Default value | Description |
---|---|---|---|
song_input | string | | Link to a song on YouTube or path to a local audio file. Should be enclosed in double quotes on Windows and single quotes on Unix-like systems. |
model_dir_name | string | | Name of the folder in the rvc_models directory containing the .pth and .index files for a specific voice. |
pitch_change | number | 0 | Change the pitch of the AI vocals in octaves. Set to 0 for no change. Generally, use 1 for male-to-female conversions and -1 for female-to-male. |
keep_files | boolean | False | Keep all intermediate audio files generated, e.g. isolated AI vocals and instrumentals. Leave out to save space. |
index_rate | number | 0.5 | Control how much of the AI's accent to leave in the vocals. 0 <= INDEX_RATE <= 1. |
filter_radius | integer | 3 | If >= 3, apply median filtering to the harvested pitch results. 0 <= FILTER_RADIUS <= 7. |
rms_mix_rate | number | 0.25 | Control how much to use the original vocal's loudness (0) or a fixed loudness (1). 0 <= RMS_MIX_RATE <= 1. |
pitch_detection_algo | string (enum) | rmvpe (options: rmvpe, mangio-crepe) | Best option is rmvpe (clarity in vocals), then mangio-crepe (smoother vocals). |
crepe_hop_length | integer | 128 | Controls how often pitch changes are checked, in milliseconds, when using the `mangio-crepe` algorithm specifically. Lower values lead to longer conversions and a higher risk of voice cracks, but better pitch accuracy. |
protect | number | 0.33 | Control how much of the original vocals' breath and voiceless consonants to leave in the AI vocals. Set to 0.5 to disable. 0 <= PROTECT <= 0.5. |
main_vocals_volume_change | number | 0 | Control the volume of the main AI vocals. Use -3 to decrease the volume by 3 decibels, or 3 to increase it by 3 decibels. |
backup_vocals_volume_change | number | 0 | Control the volume of the backup AI vocals. |
instrumental_volume_change | number | 0 | Control the volume of the background music/instrumentals. |
pitch_change_all | number | 0 | Change the pitch/key of the background music, backup vocals and AI vocals in semitones. Reduces sound quality slightly. |
reverb_size | number | 0.15 (max: 1) | The larger the room, the longer the reverb time. 0 <= REVERB_SIZE <= 1. |
reverb_wetness | number | 0.2 (max: 1) | Level of AI vocals with reverb. 0 <= REVERB_WETNESS <= 1. |
reverb_dryness | number | 0.8 (max: 1) | Level of AI vocals without reverb. 0 <= REVERB_DRYNESS <= 1. |
reverb_damping | number | 0.7 (max: 1) | Absorption of high frequencies in the reverb. 0 <= REVERB_DAMPING <= 1. |
output_format | string (enum) | mp3 (options: mp3, wav) | wav for best quality and large file size, mp3 for decent quality and small file size. |
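The defaults and limits listed in the input schema can be checked on the client side before a request is sent. The following is a minimal Python sketch of that idea, not part of the model's API: `build_input`, `DEFAULTS`, `RANGES` and `ENUMS` are hypothetical names, the values are copied from the field descriptions, and treating `song_input` and `model_dir_name` as required is an assumption based on their having no default value.

```python
# Defaults copied from the input schema; song_input and model_dir_name have none.
DEFAULTS = {
    "pitch_change": 0,
    "keep_files": False,
    "index_rate": 0.5,
    "filter_radius": 3,
    "rms_mix_rate": 0.25,
    "pitch_detection_algo": "rmvpe",
    "crepe_hop_length": 128,
    "protect": 0.33,
    "main_vocals_volume_change": 0,
    "backup_vocals_volume_change": 0,
    "instrumental_volume_change": 0,
    "pitch_change_all": 0,
    "reverb_size": 0.15,
    "reverb_wetness": 0.2,
    "reverb_dryness": 0.8,
    "reverb_damping": 0.7,
    "output_format": "mp3",
}

# Inclusive (min, max) limits stated in the field descriptions.
RANGES = {
    "index_rate": (0, 1),
    "filter_radius": (0, 7),
    "rms_mix_rate": (0, 1),
    "protect": (0, 0.5),
    "reverb_size": (0, 1),
    "reverb_wetness": (0, 1),
    "reverb_dryness": (0, 1),
    "reverb_damping": (0, 1),
}

# Allowed values for the two enum fields.
ENUMS = {
    "pitch_detection_algo": {"rmvpe", "mangio-crepe"},
    "output_format": {"mp3", "wav"},
}


def build_input(song_input, model_dir_name, **overrides):
    """Hypothetical helper: merge overrides into the defaults and check the limits."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {
        "song_input": song_input,
        "model_dir_name": model_dir_name,
        **DEFAULTS,
        **overrides,
    }
    for name, (low, high) in RANGES.items():
        if not low <= payload[name] <= high:
            raise ValueError(f"{name}={payload[name]} is outside [{low}, {high}]")
    for name, allowed in ENUMS.items():
        if payload[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    return payload


# Placeholder file and folder names, used only for illustration.
payload = build_input("song.mp3", "my_voice", pitch_change=1, output_format="wav")
print(payload["pitch_change"], payload["output_format"])
```

The resulting dictionary is the kind of payload an API client would send as the model input. Validating locally only catches out-of-range values early; the service remains the authority on what it accepts.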
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'format': 'uri', 'title': 'Output', 'type': 'string'}
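Because the output is a single string in URI format, client code can sanity-check it before trying to download the audio. The sketch below uses only the Python standard library; `output_filename` is a hypothetical helper, the example URI is a placeholder rather than a real model output, and it assumes the URI is an absolute one with a host (such as an `https` link).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filename(output):
    """Return the file name at the end of an output URI after basic checks."""
    if not isinstance(output, str):
        raise TypeError("output must be a string")
    parsed = urlparse(output)
    # Assumption: the service returns an absolute URI with a scheme and a host.
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"not an absolute URI: {output!r}")
    return PurePosixPath(parsed.path).name


# Placeholder URI for illustration only.
print(output_filename("https://example.com/outputs/cover.mp3"))
```

The file extension of the returned name should correspond to the `output_format` that was requested, which gives a simple consistency check.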