zoharbarzilai/drumtest2 | Readme and Docs

How to Use the Model

The model runs your audio input through an internal four-stage generative recipe. Each stage applies different parameters to create a unique variation, so a single prediction gives you a range of creative options.

Inputs:

prompt_prefix (string, optional): Add descriptive text to guide the style of all four generated outputs. For example, “A heavy metal blast beat” or “A tight, funky hip-hop groove”.

init_audio (file): The initial audio or video file to be transformed. This is the source material for the generation.

pitch_shift_semitones (integer, optional): Pitch-shift the initial audio up or down before it enters the generative process.

normalize (boolean, optional): Apply loudness normalization to the initial audio for a consistent input level.

The model returns four separate audio files, each a different interpretation of the input produced by the internal recipe.
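The inputs listed above can be assembled into a single request payload. The sketch below is an illustration only, not the model's official client code. It assumes the model is hosted on Replicate and called through the `replicate` Python package; `build_inputs`, `run_model`, and `model_ref` are hypothetical names introduced here, and the model reference (including any version identifier) must come from the model page. Optional inputs that are not set are left out of the payload so the model's own defaults apply.

```python
from typing import Any, Dict, Optional


def build_inputs(
    init_audio: Any,
    prompt_prefix: Optional[str] = None,
    pitch_shift_semitones: Optional[int] = None,
    normalize: Optional[bool] = None,
) -> Dict[str, Any]:
    """Assemble the input dict, omitting optional fields that were not set."""
    inputs: Dict[str, Any] = {"init_audio": init_audio}
    if prompt_prefix is not None:
        inputs["prompt_prefix"] = str(prompt_prefix)
    if pitch_shift_semitones is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(pitch_shift_semitones, bool) or not isinstance(
            pitch_shift_semitones, int
        ):
            raise TypeError("pitch_shift_semitones must be an integer")
        inputs["pitch_shift_semitones"] = pitch_shift_semitones
    if normalize is not None:
        inputs["normalize"] = bool(normalize)
    return inputs


def run_model(model_ref: str, audio_path: str, **options: Any) -> list:
    """Hypothetical helper: send the payload through the Replicate client.

    Assumes the `replicate` package is installed and an API token is
    configured. `model_ref` is supplied by the caller.
    """
    import replicate  # third-party; imported lazily so build_inputs works without it

    with open(audio_path, "rb") as audio_file:
        output = replicate.run(
            model_ref, input=build_inputs(audio_file, **options)
        )
    # The documentation above says four audio files are returned.
    return list(output)


# Building a payload needs no network access:
example = build_inputs(
    "loop.wav",
    prompt_prefix="A tight, funky hip-hop groove",
    normalize=True,
)
print(sorted(example))
```

Calling `run_model` with the model reference and a local audio file should then yield the four outputs described above. How each output is represented (a URL or a file-like object) depends on the client version, so check that before saving the results.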