Description: A conversational text-to-speech model that handles mixed Chinese and English input and supports multiple speakers. It can be configured for six languages, including Chinese, English, and Japanese.
Description: An open-source text-to-speech model that supports speech synthesis in over 7,000 languages, with multi-speaker capabilities and the ability to simulate rhythm, stress, and intonation.
Description: An open-source TTS model supporting Chinese, English, and Japanese, with speech quality close to human level, trained on about 150,000 hours of trilingual data.
Description: An open-source TTS model by Alibaba, designed to facilitate natural interaction between humans and LLMs through voice understanding and generation.
Description: A lightweight text-to-speech model that generates high-quality, natural speech in the style of a given speaker (gender, pitch, speaking style, etc.).
Description: An open-source TTS model from Shanghai Jiao Tong University and Cambridge that offers zero-shot voice cloning, real-time inference, speaking-rate control, and seamless transitions between languages and dialects.
Description: A zero-shot, fully non-autoregressive TTS model supporting cross-lingual dubbing, voice cloning, language conversion, and emotion control.
Description: An open-source TTS model supporting six languages (English, Japanese, Korean, Chinese, French, and German), with punctuation support that improves naturalness and coherence.