AI Tools.

Text-to-speech models

13 models · ranked by HuggingFace downloads

Kokoro-82M

Kokoro-82M is a compact, 82-million-parameter text-to-speech model derived from StyleTTS2. It targets natural-sounding English speech synthesis and is small enough to run on a CPU or a modest GPU. Released under Apache 2.0 with a HuggingFace DOI, it gained attention as a high-quality open TTS model that is significantly smaller than most alternatives. It supports multiple English voice styles.

9,521,975 ↓ · 6,102 ♡

XTTS-v2

XTTS-v2 is Coqui's multilingual text-to-speech model, supporting 17 languages with voice cloning from a short audio sample. It uses a GPT-style decoder to generate speech tokens, which enables zero-shot speaker cloning without fine-tuning. The model was released before Coqui's closure and remains available under the Coqui Public Model License, a non-standard license that restricts commercial use.

7,667,209 ↓ · 3,522 ♡

OmniVoice

OmniVoice is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

2,187,329 ↓ · 776 ♡

chatterbox

chatterbox is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

2,169,952 ↓ · 1,573 ♡

Qwen3-TTS-12Hz-1.7B-CustomVoice

Qwen3-TTS-12Hz-1.7B-CustomVoice is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

1,747,153 ↓ · 1,449 ♡

VibeVoice-Realtime-0.5B

VibeVoice-Realtime-0.5B is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

1,114,904 ↓ · 1,218 ♡

indic-parler-tts

indic-parler-tts is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

783,337 ↓ · 222 ♡

Qwen3-TTS-12Hz-0.6B-Base

Qwen3-TTS-12Hz-0.6B-Base is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

662,564 ↓ · 234 ♡

F5-TTS

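F5-TTS is a non-autoregressive text-to-speech model available on HuggingFace. It generates speech with flow matching on a Diffusion Transformer backbone and supports zero-shot voice cloning from a short reference clip. Details are sourced from the public model registry.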

566,271 ↓ · 1,166 ♡

Kokoro-82M-bf16

Kokoro-82M-bf16 is a bfloat16-precision variant of the open-source Kokoro-82M text-to-speech model, available on HuggingFace. Details are sourced from the public model registry.

519,644 ↓ · 50 ♡

Qwen3-TTS-12Hz-1.7B-VoiceDesign

Qwen3-TTS-12Hz-1.7B-VoiceDesign is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

483,739 ↓ · 336 ♡

mms-tts-hat

mms-tts-hat is the Haitian Creole checkpoint from Meta's Massively Multilingual Speech (MMS) project, a VITS-based text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

442,677 ↓ · 4 ♡

Qwen3-TTS-12Hz-0.6B-CustomVoice

Qwen3-TTS-12Hz-0.6B-CustomVoice is an open-source text-to-speech model available on HuggingFace. Details are sourced from the public model registry.

297,476 ↓ · 142 ♡
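
The listing above is ordered by download count, with each entry followed by a stat line in the form "downloads ↓ · likes ♡". As a minimal sketch of how such a ranking could be reproduced from the stat lines, the Python below parses three of the figures shown on this page and sorts them. The names ModelStats, parse_stats, and rank_by_downloads are hypothetical helpers introduced for illustration; they are not part of any HuggingFace API, and the site's actual ranking code is not shown in the source.

```python
import re
from typing import NamedTuple


class ModelStats(NamedTuple):
    """One catalog entry: model name plus its download and like counts."""
    name: str
    downloads: int
    likes: int


# Matches a stat line such as "9,521,975 ↓ · 6,102 ♡".
STAT_LINE = re.compile(r"^([\d,]+) ↓ · ([\d,]+) ♡$")


def parse_stats(name: str, stat_line: str) -> ModelStats:
    """Turn a stat line from the listing into integer counts."""
    match = STAT_LINE.match(stat_line.strip())
    if match is None:
        raise ValueError(f"unrecognized stat line: {stat_line!r}")
    # Strip the thousands separators before converting to int.
    downloads, likes = (int(group.replace(",", "")) for group in match.groups())
    return ModelStats(name, downloads, likes)


def rank_by_downloads(models):
    """Return the entries sorted from most to fewest downloads."""
    return sorted(models, key=lambda model: model.downloads, reverse=True)


# Three entries copied from the listing, deliberately out of order.
entries = [
    parse_stats("F5-TTS", "566,271 ↓ · 1,166 ♡"),
    parse_stats("Kokoro-82M", "9,521,975 ↓ · 6,102 ♡"),
    parse_stats("XTTS-v2", "7,667,209 ↓ · 3,522 ♡"),
]

ranked = rank_by_downloads(entries)
for position, model in enumerate(ranked, start=1):
    print(f"{position}. {model.name}: {model.downloads:,} downloads, {model.likes:,} likes")
```

Sorting on downloads alone matches the order used on this page; note that likes do not follow the same order (F5-TTS has fewer downloads than Qwen3-TTS-12Hz-0.6B-Base but more likes), so a ranking by likes would differ.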