
zsxkib/step-video-t2v:4acfc436

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

| Field | Type | Default value | Min | Max | Description |
|---|---|---|---|---|---|
| prompt | string | A robot dancing in Times Square | | | Input text prompt describing the video content |
| num_frames | integer | 24 | 24 | 204 | Number of frames in output video (24-204) |
| infer_steps | integer | 12 | 10 | 15 | Number of denoising steps (10-15 for Turbo model) |
| cfg_scale | number | 5 | 3 | 7 | Guidance scale for text conditioning (3.0-7.0) |
| time_shift | number | 17 | 15 | 20 | Temporal shift for motion consistency (15.0-20.0) |
| height | integer | 544 | 256 | 1088 | Vertical resolution (256-1088, multiple of 16) |
| width | integer | 992 | 256 | 1984 | Horizontal resolution (256-1984, multiple of 16) |
| fps | integer | 24 | 12 | 60 | Output video frame rate (12-60) |
| pos_magic | number | 1 | 0.5 | 1.5 | Positive prompt enhancement strength (0.5-1.5) |
| neg_magic | number | 1 | 0.5 | 1.5 | Negative prompt suppression strength (0.5-1.5) |
| seed | integer | | | | Random seed for reproducibility |
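
As an illustration of how the fields above fit together, here is a minimal Python sketch that fills in the documented defaults, checks overrides against the documented bounds before a request is sent, and works out the clip length implied by `num_frames` and `fps`. The names `DEFAULTS`, `RANGES`, `build_input`, and `clip_seconds` are hypothetical helpers, not part of the model or of any client library, and the sketch assumes the multiple-of-16 rule in the `height` and `width` descriptions is a hard requirement.

```python
# Hypothetical client-side helpers; values are copied from the input schema above.
DEFAULTS = {
    "prompt": "A robot dancing in Times Square",
    "num_frames": 24,
    "infer_steps": 12,
    "cfg_scale": 5,
    "time_shift": 17,
    "height": 544,
    "width": 992,
    "fps": 24,
    "pos_magic": 1,
    "neg_magic": 1,
}

# Inclusive (min, max) bounds for each bounded field.
RANGES = {
    "num_frames": (24, 204),
    "infer_steps": (10, 15),
    "cfg_scale": (3, 7),
    "time_shift": (15, 20),
    "height": (256, 1088),
    "width": (256, 1984),
    "fps": (12, 60),
    "pos_magic": (0.5, 1.5),
    "neg_magic": (0.5, 1.5),
}

INTEGER_FIELDS = {"num_frames", "infer_steps", "height", "width", "fps", "seed"}


def build_input(**overrides):
    """Merge overrides into the defaults and validate the result."""
    unknown = set(overrides) - set(DEFAULTS) - {"seed"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = {**DEFAULTS, **overrides}

    if not isinstance(payload["prompt"], str):
        raise ValueError("prompt must be a string")

    for name, value in payload.items():
        if name in INTEGER_FIELDS:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer")
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}-{high}")

    # Assumption: resolutions that are not multiples of 16 are rejected.
    for name in ("height", "width"):
        if payload[name] % 16 != 0:
            raise ValueError(f"{name}={payload[name]} is not a multiple of 16")

    return payload


def clip_seconds(payload):
    """Playback length in seconds: frames divided by frames per second."""
    return payload["num_frames"] / payload["fps"]
```

With the defaults, `clip_seconds(build_input())` gives 1.0 second (24 frames at 24 fps); the maximum of 204 frames at the default frame rate gives 8.5 seconds. The returned dictionary has the shape of the `input` object described by this schema, so it could be handed to whichever API client you use.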

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{'format': 'uri', 'title': 'Output', 'type': 'string'}
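
The schema above says the response is a single string in URI format. A minimal sketch of checking that shape on the client side, assuming you already hold the raw response value as a Python `str`; `check_output` is a hypothetical helper and the example URL is a placeholder, not a real output of this model.

```python
from urllib.parse import urlparse


def check_output(output):
    """Return the parsed URI, or raise if the value does not match the schema."""
    if not isinstance(output, str):
        raise TypeError("output must be a string")
    parsed = urlparse(output)
    # A URI needs at least a scheme (for example https); bare file names have none.
    if not parsed.scheme:
        raise ValueError(f"not a URI: {output!r}")
    return parsed


# Placeholder URL for illustration only.
parsed = check_output("https://example.com/output.mp4")
print(parsed.scheme, parsed.path)
```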