Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
Field | Type | Default value | Description |
---|---|---|---|
image | string | | Input portrait image (will be cropped if face is detected). |
audio | string | | Input audio file (WAV, MP3, etc.) for the voice. |
dynamic_scale | number | 1 (min: 0.5, max: 2) | Controls movement intensity. Increase/decrease for more/less movement. |
min_resolution | integer | 512 (min: 256, max: 1024) | Minimum image resolution for processing. Lower values use less memory but may reduce quality. |
inference_steps | integer | 25 (min: 5, max: 50) | Number of diffusion steps. Higher values may improve quality but take longer. |
keep_resolution | boolean | False | If true, output video matches the original image resolution. Otherwise uses the min_resolution after cropping. |
seed | integer | | Random seed for reproducible results. Leave blank for a random seed. |
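
The fields above can be assembled into an input payload before calling the API. The following is a minimal sketch, not the service's own client code: `build_input` and `LIMITS` are hypothetical names introduced here, and treating `image` and `audio` as required is an assumption based on the schema listing no default for them. The resulting dictionary could then be passed as the `input` argument of whichever client you use.

```python
from typing import Any, Dict, Optional

# Allowed (min, max) ranges, copied from the input schema table.
LIMITS = {
    "dynamic_scale": (0.5, 2),
    "min_resolution": (256, 1024),
    "inference_steps": (5, 50),
}


def build_input(
    image: str,
    audio: str,
    dynamic_scale: float = 1.0,
    min_resolution: int = 512,
    inference_steps: int = 25,
    keep_resolution: bool = False,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an input payload and check documented ranges."""
    payload: Dict[str, Any] = {
        "image": image,  # assumed required: the schema lists no default
        "audio": audio,  # assumed required: the schema lists no default
        "dynamic_scale": dynamic_scale,
        "min_resolution": min_resolution,
        "inference_steps": inference_steps,
        "keep_resolution": keep_resolution,
    }
    for name, (low, high) in LIMITS.items():
        value = payload[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    # Leave seed out when unset so that a random seed is used.
    if seed is not None:
        payload["seed"] = seed
    return payload


# Placeholder URLs, for illustration only.
example = build_input(
    "https://example.com/portrait.png",
    "https://example.com/voice.wav",
    inference_steps=30,
    seed=42,
)
print(example)
```

Checking ranges locally is a design choice that surfaces a bad value before a request is sent; the service presumably enforces the same limits on its side.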
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{"format": "uri", "title": "Output", "type": "string"}
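
Since the schema describes the output as a single string in `uri` format, a caller may want to confirm that shape before trying to download the result. This is a minimal sketch under stated assumptions: `check_output` is a hypothetical helper, and it assumes the URI is a URL with a scheme and host (for example `https://...`), so other URI forms would be rejected.

```python
from urllib.parse import urlparse


def check_output(value: object) -> str:
    """Hypothetical helper: confirm the output is a string holding an absolute URL."""
    if not isinstance(value, str):
        raise TypeError(f"expected a string, got {type(value).__name__}")
    parts = urlparse(value)
    # Assumption: the output has both a scheme and a host.
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {value!r}")
    return value


# Placeholder URL, for illustration only.
print(check_output("https://example.com/output.mp4"))
```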