lucataco / musubi-tuner

Finetune HunyuanVideo LoRAs with kohya-ss/musubi-tuner


About

Cog implementation of kohya-ss/musubi-tuner, a fine-tuning script for training HunyuanVideo LoRA models.

Follow me on Twitter/X @lucataco93

How to use

  • This model expects a zip file that contains at least 8 video and caption file pairs. Example file pairing: segment1.mp4 & segment1.txt
  • Videos should be around 544x960 and about 2 seconds in length each. Captions should each be more than 50 words
  • For help with captioning videos see our collection here: video-to-text
  • For help with splitting videos to the desired width, height, and duration, see this model: lucataco/video-split
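Before uploading, it can help to sanity-check the zip locally. The following is a minimal sketch (this helper is not part of musubi-tuner; the function name and checks are my own) that pairs each `.mp4` with its matching `.txt`, enforces the rules above (at least 8 pairs, captions over 50 words), and bundles everything into a training zip:

```python
# Hypothetical helper: bundle segmentN.mp4 / segmentN.txt pairs into a
# training zip, enforcing the dataset rules described above.
import zipfile
from pathlib import Path


def build_training_zip(src_dir: str, out_zip: str, min_pairs: int = 8) -> int:
    """Zip up video/caption pairs from src_dir; return the pair count."""
    src = Path(src_dir)
    pairs = []
    for video in sorted(src.glob("*.mp4")):
        caption = video.with_suffix(".txt")  # e.g. segment1.mp4 -> segment1.txt
        if not caption.exists():
            raise FileNotFoundError(f"missing caption for {video.name}")
        words = caption.read_text().split()
        if len(words) <= 50:
            raise ValueError(f"{caption.name} has {len(words)} words (need > 50)")
        pairs.append((video, caption))
    if len(pairs) < min_pairs:
        raise ValueError(f"found {len(pairs)} pairs, need at least {min_pairs}")
    with zipfile.ZipFile(out_zip, "w") as zf:
        for video, caption in pairs:
            zf.write(video, video.name)   # store files flat at the zip root
            zf.write(caption, caption.name)
    return len(pairs)
```

Resolution and duration still need to be checked separately (for example with ffprobe, or by preparing clips with lucataco/video-split first).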

Run your Trained LoRA

Use the model zsxkib/hunyuan-video-lora to run your HunyuanVideo LoRA.

Convert to use with ComfyUI

After training a LoRA (e.g. lucataco/hunyuan-musubi-rose-6), if you want to use it locally you'll need to convert it to a ComfyUI-compatible format (lucataco/hunyuan-musubi-rose-6-comfyui).

Convert your Musubi LoRA to a ComfyUI-compatible format with the following model: lucataco/musubi-tuner-lora-converter

Train & Create a new Replicate Model

If you’re already familiar with the Replicate Flux trainer ostris/flux-dev-lora-trainer, which creates a new Replicate model from the trained LoRA, you might be interested in the equivalent for HunyuanVideo LoRAs: hunyuan-video-lora