Open-Source Text-to-Speech (TTS) Series

Open-Source Text-to-Speech (TTS) Models

1. ChatTTS

  • Description: A conversational text-to-speech model that handles mixed Chinese and English text and supports multiple speakers. It can be configured for six languages, including Chinese, English, and Japanese.
  • Demo: Details and demo

2. ToucanTTS

  • Description: An open-source text-to-speech model that supports speech synthesis in more than 7,000 languages, handles multiple speakers, and can simulate rhythm, stress, and intonation.
  • Demo: Details and demo

3. Fish Speech

  • Description: An open-source TTS model supporting Chinese, English, and Japanese, trained on about 150,000 hours of trilingual data, with speech quality described as close to human level.
  • Demo: Details and demo

4. FunAudioLLM

  • Description: An open-source speech model family from Alibaba, designed to enable natural voice interaction between humans and LLMs through speech understanding and generation.
  • Demo: Details and demo

5. Parler-TTS

  • Description: A lightweight text-to-speech model that generates high-quality, natural-sounding speech in the style of a given speaker, with control over attributes such as gender, pitch, and speaking style.
  • Demo: Details and demo

6. F5-TTS

  • Description: An open-source TTS model from Shanghai Jiao Tong University and the University of Cambridge that offers zero-shot voice cloning, real-time inference, speaking-speed control, and seamless switching between languages and dialects.
  • Demo: Details and demo

7. MaskGCT

  • Description: A zero-shot, fully non-autoregressive TTS model supporting cross-lingual dubbing, voice cloning, language conversion, and emotion control.
  • Demo: Details and demo

8. Smol TTS

  • Description: An open-source TTS model based on the LLaMA architecture that offers zero-shot voice cloning.
  • Demo: Details and demo

9. Kokoro

  • Description: An open-source TTS model with 82 million parameters, trained on fewer than 100 hours of audio, that supports multiple languages.
  • Demo: Details and demo

10. OuteTTS

  • Description: An open-source TTS model supporting six languages: English, Japanese, Korean, Chinese, French, and German. Punctuation support improves the naturalness and coherence of its output.
  • Demo: Details and demo

11. Llasa

  • Description: A zero-shot voice cloning and TTS model that can generate speech from input text alone or in the voice of a given voice prompt.
  • Demo: Details and demo

Publisher

WTAI

2025/02/08
