zsxkib / create-video-dataset

Easily create video datasets with auto-captioning for Hunyuan-Video LoRA finetuning


Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Create Video Dataset

A tool to easily prepare video datasets with automatic captioning for AI training. This tool processes videos (from URLs or local files), generates high-quality captions using QWEN-VL, and packages everything into a training-ready format.

Features

  • 🎥 Process YouTube URLs or local video files
  • 🤖 Automatic video captioning using QWEN-VL
  • ✍️ Support for custom captions
  • 🏷️ Configurable trigger words for training
  • 📝 Prefix/suffix support for caption formatting
  • 🗃️ Clean output in zip format

Input Parameters

| Parameter          | Description                                          | Default |
|--------------------|------------------------------------------------------|---------|
| video_url          | YouTube/video URL to process                         | None    |
| video_file         | Local video file to process                          | None    |
| trigger_word       | Training trigger word (e.g., TOK, STYLE3D)           | "TOK"   |
| autocaption        | Use AI to generate captions                          | True    |
| custom_caption     | Your custom caption (required if autocaption=False)  | None    |
| autocaption_prefix | Text to add before captions                          | None    |
| autocaption_suffix | Text to add after captions                           | None    |
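
For reference, the model can be invoked with the Replicate Python client. The sketch below is a minimal example rather than canonical usage: the input URL and caption prefix are placeholders, and the exact return type (a URL string or a file-like object) depends on your client version.

```python
# Minimal sketch of calling zsxkib/create-video-dataset via the Replicate
# Python client. Requires REPLICATE_API_TOKEN in the environment.
# The video URL and caption prefix below are placeholder values.
import replicate

output = replicate.run(
    "zsxkib/create-video-dataset",
    input={
        "video_url": "https://example.com/my-clip.mp4",  # or a YouTube URL
        "trigger_word": "TOK",
        "autocaption": True,
        "autocaption_prefix": "a video of TOK, ",
    },
)

print(output)  # typically a URL pointing to the generated zip archive
```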

Output

The tool produces a zip file containing:

  • Processed video file
  • Caption files (.txt) for each video
  • Proper directory structure for training
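
As a rough sketch of consuming that output, assuming the model returned a URL to the zip (the URL and local paths below are placeholders, not values guaranteed by the tool), the archive can be downloaded and unpacked with the Python standard library:

```python
# Sketch of downloading and unpacking the output archive. The zip URL and
# local directory names are assumptions, not values produced by the tool.
import urllib.request
import zipfile

zip_url = "https://replicate.delivery/.../dataset.zip"  # URL returned by the model
urllib.request.urlretrieve(zip_url, "dataset.zip")

with zipfile.ZipFile("dataset.zip") as zf:
    zf.extractall("training_dataset")
    for name in zf.namelist():
        print(name)  # each video should sit alongside a matching .txt caption
```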