
minimax / speech-02-hd

Text-to-Audio (T2A) that offers voice synthesis, emotional expression, and multilingual capabilities. Optimized for high-fidelity applications like voiceovers and audiobooks.

Speech-02-series

The Speech-02 series is a family of Text-to-Audio (T2A) and voice cloning models that offer voice synthesis, emotional expression, and multilingual capabilities.

Models

  • Speech-02-HD: Optimized for high-fidelity applications like voiceovers and audiobooks
  • Speech-02-Turbo: Designed for real-time applications with low latency
  • Voice-Cloning: Clone voices for use with speech-02-hd and speech-02-turbo

Key Features

Voice Cloning

  • 10-second voice cloning with 99% reported vocal similarity
  • 300+ pre-built voices across different demographics
  • Controls for pitch, speed, and volume
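
As a rough illustration of the pitch, speed, and volume controls, the sketch below builds an input payload and passes it to the Replicate Python client. The model reference `minimax/speech-02-hd` comes from this page; the field names (`text`, `voice_id`, `speed`, `pitch`, `volume`), the numeric ranges, and the voice id `example-voice` are assumptions made for illustration, so check the model's input schema before relying on them.

```python
# Assumed ranges for illustration only; the real limits may differ.
SPEED_RANGE = (0.5, 2.0)
PITCH_RANGE = (-12, 12)
VOLUME_RANGE = (0.0, 10.0)


def clamp(value, bounds):
    """Restrict value to the inclusive (low, high) bounds."""
    low, high = bounds
    return max(low, min(high, value))


def build_input(text, voice_id, speed=1.0, pitch=0, volume=1.0):
    """Build a request payload; all field names here are hypothetical."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "text": text,
        "voice_id": voice_id,
        "speed": clamp(speed, SPEED_RANGE),
        "pitch": clamp(pitch, PITCH_RANGE),
        "volume": clamp(volume, VOLUME_RANGE),
    }


def synthesize(payload):
    """Send the payload to the model via the Replicate client.

    Needs `pip install replicate` and a REPLICATE_API_TOKEN in the
    environment, so the import is deferred until this is called.
    """
    import replicate

    return replicate.run("minimax/speech-02-hd", input=payload)
```

Calling `build_input("Hello there.", "example-voice", speed=1.2)` returns a plain dictionary that can be inspected or logged before `synthesize` is called. Clamping out-of-range values instead of raising is a design choice of this sketch, not documented behavior of the model.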

Emotion Control

  • Auto-detect mode that matches emotional tone to text context
  • Manual customization options for emotional expression
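
The two modes above could be modeled on the client side roughly as follows. This is a minimal sketch: the `"auto"` value and the set of manual labels are hypothetical placeholders, since this page does not list the values the model accepts.

```python
AUTO = "auto"  # assumed name for the auto-detect mode

# Hypothetical manual labels; replace with the values the model documents.
MANUAL_EMOTIONS = {"neutral", "happy", "sad", "angry"}


def emotion_field(emotion=None):
    """Return the emotion value to send with a request.

    With no argument, fall back to auto-detect so the model infers
    tone from the text. Otherwise normalize and validate the label.
    """
    if emotion is None:
        return AUTO
    label = emotion.strip().lower()
    if label == AUTO or label in MANUAL_EMOTIONS:
        return label
    raise ValueError(f"unsupported emotion label: {emotion!r}")
```

Defaulting to auto-detect keeps callers from having to pick a label for every line of text, while an explicit label overrides it for passages where the intended tone is known.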

Language Support

  • 30+ languages with native accents
  • English variants: US, UK, Australian, Indian
  • Asian languages: Mandarin, Cantonese, Japanese, Korean, Vietnamese, Indonesian
  • European languages: French, German, Spanish, Portuguese (Brazilian), Turkish, Russian, Ukrainian
  • Recently added: Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi
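
For applications that need to check a requested language before submitting text, the languages named above can be kept in a small lookup. This is only a sketch: the set contains just the languages listed on this page (the full "30+" list is not given here), and the helper name `is_listed` is hypothetical.

```python
# Languages explicitly named on this page, grouped as in the list above.
LISTED_LANGUAGES = {
    "english",
    "mandarin", "cantonese", "japanese", "korean", "vietnamese", "indonesian",
    "french", "german", "spanish", "portuguese", "turkish", "russian",
    "ukrainian",
    "thai", "polish", "romanian", "greek", "czech", "finnish", "hindi",
}


def is_listed(language):
    """Return True if the language is named on this page.

    A False result does not prove the language is unsupported, because
    the page names only part of the full language list.
    """
    return language.strip().lower() in LISTED_LANGUAGES
```

A check like `is_listed("Thai")` can be used to warn early, with the model's own documentation remaining the authority on what is supported.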

Technical Specifications

Deployment

  • Virtual machine and private cloud deployment options
  • Isolated environment for security and privacy

Privacy Policy

Data submitted to this model is sent from Replicate to MiniMax.

See the MiniMax Privacy Policy for details:

https://intl.minimaxi.com/protocol/privacy-policy

Terms of Service

https://intl.minimaxi.com/protocol/terms-of-service