lightweight text-to-speech (TTS) model, trained on 10.5K hours of audio data.
Want to make some of these yourself?