zsxkib/step-video-t2v:4acfc436
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
Field | Type | Default value | Description |
---|---|---|---|
prompt | string | A robot dancing in Times Square | Input text prompt describing the video content |
num_frames | integer | 24 (min 24, max 204) | Number of frames in output video (24-204) |
infer_steps | integer | 12 (min 10, max 15) | Number of denoising steps (10-15 for Turbo model) |
cfg_scale | number | 5 (min 3, max 7) | Guidance scale for text conditioning (3.0-7.0) |
time_shift | number | 17 (min 15, max 20) | Temporal shift for motion consistency (15.0-20.0) |
height | integer | 544 (min 256, max 1088) | Vertical resolution (256-1088, multiple of 16) |
width | integer | 992 (min 256, max 1984) | Horizontal resolution (256-1984, multiple of 16) |
fps | integer | 24 (min 12, max 60) | Output video frame rate (12-60) |
pos_magic | number | 1 (min 0.5, max 1.5) | Positive prompt enhancement strength (0.5-1.5) |
neg_magic | number | 1 (min 0.5, max 1.5) | Negative prompt suppression strength (0.5-1.5) |
seed | integer | | Random seed for reproducibility |
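
The defaults and ranges in the input schema can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model's own tooling: `build_input` and `clip_seconds` are hypothetical helper names, and the assumption that clip length equals `num_frames / fps` is ours rather than something the schema states. The values themselves are copied from the table.

```python
from typing import Any

# Defaults copied from the input schema; seed has no default and is omitted.
DEFAULTS: dict[str, Any] = {
    "prompt": "A robot dancing in Times Square",
    "num_frames": 24, "infer_steps": 12, "cfg_scale": 5, "time_shift": 17,
    "height": 544, "width": 992, "fps": 24, "pos_magic": 1, "neg_magic": 1,
}

# Inclusive (min, max) ranges copied from the input schema.
RANGES: dict[str, tuple[float, float]] = {
    "num_frames": (24, 204), "infer_steps": (10, 15), "cfg_scale": (3, 7),
    "time_shift": (15, 20), "height": (256, 1088), "width": (256, 1984),
    "fps": (12, 60), "pos_magic": (0.5, 1.5), "neg_magic": (0.5, 1.5),
}

INTEGER_FIELDS = ("num_frames", "infer_steps", "height", "width", "fps", "seed")


def build_input(**overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: merge overrides into the defaults and validate them."""
    unknown = set(overrides) - set(DEFAULTS) - {"seed"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {**DEFAULTS, **overrides}

    # Integer-typed fields must be real ints (bool is rejected explicitly).
    for name in INTEGER_FIELDS:
        if name in payload:
            value = payload[name]
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer")

    # Every ranged field must sit inside its inclusive bounds.
    for name, (low, high) in RANGES.items():
        if not low <= payload[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")

    # The schema requires height and width to be multiples of 16.
    for name in ("height", "width"):
        if payload[name] % 16 != 0:
            raise ValueError(f"{name} must be a multiple of 16")
    return payload


def clip_seconds(payload: dict[str, Any]) -> float:
    """Assumed clip duration: frame count divided by frame rate."""
    return payload["num_frames"] / payload["fps"]
```

For instance, `build_input(num_frames=48)` keeps every other default, and under the duration assumption above that payload describes a two-second clip at 24 fps.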
Output schema
The shape of the response you’ll get when you run this model with an API.
{"format": "uri", "title": "Output", "type": "string"}
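
The output schema says the response is a single string in URI format. A small sketch of checking a response against that shape follows; `is_valid_output` is a hypothetical helper, and the example URL in the usage note is made up for illustration. The schema does not say which scheme or file type the URI uses, so the check only requires that a scheme is present.

```python
from urllib.parse import urlparse


def is_valid_output(value: object) -> bool:
    """Hypothetical check: value is a string that parses as a URI with a scheme."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    # A usable URI needs a scheme plus either a host or a path.
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)
```

A value such as `"https://example.com/output.mp4"` passes this check, while a non-string value or a bare filename without a scheme does not.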