# 🎵 Musicology Test v1 – Prototype Release
## Overview
This is the v1 prototype of Musicology, our custom-trained generative audio model built specifically to simulate the musical characteristics of distinct eras and styles. It is designed for short-form audio generation using original, ethically sourced training data, with an emphasis on genre fidelity, legal safety, and cultural nuance.
## Model Architecture

- **Model Type:** Transformer-based autoregressive decoder for music generation
- **Tokenization:** Multi-band audio codec encoding using a stacked quantizer structure
- **Training Data:** Curated original compositions labeled by genre, rhythm, and era characteristics
- **Input Format:** Text prompt (genre descriptor) → latent code sequence → waveform
- **Output Duration:** Optimized for short sequences (3–6 seconds) to test genre fidelity and temporal coherence
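A "stacked quantizer structure" is commonly realized as residual vector quantization, where each codebook quantizes the residual left over by the previous one. The NumPy sketch below illustrates that general technique only; the codebook count, codebook size, and latent dimension are invented for illustration and are not Musicology's actual codec parameters.

```python
import numpy as np

# Illustrative residual (stacked) vector quantizer.
# All sizes below are assumptions for the sketch, not real model values.
rng = np.random.default_rng(0)
NUM_QUANTIZERS = 4   # number of stacked codebooks (assumption)
CODEBOOK_SIZE = 16   # entries per codebook (assumption)
DIM = 8              # latent frame dimension (assumption)

codebooks = rng.normal(size=(NUM_QUANTIZERS, CODEBOOK_SIZE, DIM))

def encode(frame):
    """Quantize one latent frame into a stack of codebook indices.

    Each quantizer encodes the residual left by the previous one,
    so later codebooks refine the earlier coarse approximation."""
    residual = frame.copy()
    indices = []
    for cb in codebooks:
        dists = np.linalg.norm(cb - residual, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        residual = residual - cb[idx]
    return indices

def decode(indices):
    """Sum the selected codebook entries to reconstruct the frame."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

frame = rng.normal(size=DIM)
codes = encode(frame)   # stack of integer tokens, one per quantizer
recon = decode(codes)   # approximate reconstruction of the frame
```

In a codec-token model like the one described above, sequences of such stacked indices are what the autoregressive decoder predicts; with trained (rather than random) codebooks, each additional quantizer typically reduces reconstruction error.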
The architecture supports text-to-audio conditioning and has been fine-tuned on internal data distributions that reflect stylistic features of ’80s R&B and ’90s Hip-Hop, while maintaining a neutral baseline for future multi-genre expansion.
## How to Use

To generate a sample:

- **Prompt:** Use a genre-specific descriptor (e.g., “80s R&B”, “90s Hip-Hop”)
- **Genre:** Select the corresponding genre from the dropdown
- **Duration:** Select between 3 and 6 seconds for optimal performance in this version

**Best Practice:** Use identical text for both the Prompt and Genre fields to ensure alignment with our token-label mappings during inference.
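The prompt discipline above can be captured in a small validation helper. Everything here is a hypothetical sketch: the field names, the `build_request` helper, and the request format are assumptions for illustration, not the actual Musicology interface (which, per the notes below, should only be exercised via the pre-generated examples).

```python
# Hypothetical request builder enforcing the v1 usage guidance:
# identical Prompt and Genre text, duration within 3-6 seconds.
# Field names and structure are illustrative assumptions.

SUPPORTED_GENRES = {"80s R&B", "90s Hip-Hop"}  # genres in the v1 prototype

def build_request(genre: str, duration_s: float) -> dict:
    if genre not in SUPPORTED_GENRES:
        raise ValueError(f"Unsupported genre for v1: {genre!r}")
    if not 3 <= duration_s <= 6:
        raise ValueError("v1 is optimized for 3-6 second outputs")
    # Identical Prompt and Genre text keeps the request aligned with
    # the model's token-label mappings at inference time.
    return {"prompt": genre, "genre": genre, "duration_s": duration_s}

req = build_request("80s R&B", 5)
```

Keeping the two fields identical is cheap insurance: it removes any ambiguity between the free-text conditioning and the dropdown label.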
## Key Properties

- ✅ Built on original data: no copyrighted or third-party music in training
- ✅ Latent space aligned with musical structure (rhythm, timbre, harmonic signature)
- ✅ Genre-specific conditioning with low-variance prompt compliance
- ✅ Fast inference with streaming-safe output for short-form generation
## Important Notes

⚡ The more you use the model, the better you’ll understand how to craft prompts for optimal outputs. Each generation helps you learn what works best with our training approach.

This release is intended to help evaluate genre responsiveness and inform prompt-engineering best practices for the upcoming full version.
## Coming in v1.0

- Support for longer sequences (up to 30s)
- Expanded artist and genre conditioning (incl. jazz, soul, cinematic, and experimental)
- Better prompt generalization with richer latent embeddings
- Streamlined generation interface for real-time use cases
## Feedback & Usage

This early release is for testing purposes only. Please do not attempt to run the model directly. Instead:

- Read this README for technical context
- Click the “Examples” button at the top of the page to hear pre-generated samples without incurring compute charges
We welcome feedback on both output quality and model responsiveness. Your input will directly inform our next milestone release.
– Musicology Engineering Team