title: Ultimate TTS Studio - 900+ Premium Voices
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: static
pinned: true
license: apache-2.0
Ultimate TTS Studio
900+ Premium Voices from 3 World-Class TTS Engines - All Running in Your Browser!
Features
3 Premium TTS Engines
Piper TTS - 904 voices across 50+ languages
- High-quality multilingual support
- Multiple quality levels (High/Medium/Low)
- Generates speech 3-5x faster than real time
Kokoro TTS - 21 expressive voices (Highest Quality)
- 24kHz studio-quality audio
- American & British accents
- Most natural & expressive
Kitten TTS - 8 voices (Fastest)
- Only 24MB model size
- Lightning-fast generation
- Perfect for quick tasks
Key Capabilities
- 900+ Professional Voices - Choose from a wide variety
- 50+ Languages - Multilingual synthesis with Piper
- Unlimited Text Length - Automatic smart chunking
- WebGPU Acceleration - Hardware-accelerated when available
- Zero Server Cost - 100% client-side processing
- Offline Capable - Works offline once models are cached
- Privacy First - No data leaves your browser
- Professional Quality - Up to 24kHz audio output
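The "automatic smart chunking" behind unlimited text length is not documented here, so the following is only a minimal sketch of one way it could work: split on sentence boundaries, pack sentences into chunks under a character limit, and fall back to word boundaries for overlong sentences. The names `chunkText` and `splitByWords` and the 200-character default are assumptions made for illustration, not part of this Space's code.

```javascript
// Hypothetical helper: break one overlong sentence on word boundaries.
// A single word longer than maxChars is kept whole rather than cut.
function splitByWords(sentence, maxChars) {
  const parts = [];
  let line = "";
  for (const word of sentence.split(/\s+/)) {
    if (line && line.length + 1 + word.length > maxChars) {
      parts.push(line);
      line = "";
    }
    line = line ? `${line} ${word}` : word;
  }
  if (line) parts.push(line);
  return parts;
}

// Hypothetical helper: split text into chunks of at most maxChars,
// preferring sentence boundaries (., !, ? followed by whitespace or end).
function chunkText(text, maxChars = 200) {
  const sentences = (text.match(/\S[\s\S]*?(?:[.!?]+(?=\s|$)|$)/g) || [])
    .map((s) => s.trim())
    .filter(Boolean);

  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    const pieces =
      sentence.length > maxChars ? splitByWords(sentence, maxChars) : [sentence];
    for (const piece of pieces) {
      // Start a new chunk when adding this piece would exceed the limit.
      if (current && current.length + 1 + piece.length > maxChars) {
        chunks.push(current);
        current = "";
      }
      current = current ? `${current} ${piece}` : piece;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk could then be synthesized in turn and the resulting audio buffers concatenated, which keeps memory use per generation call bounded regardless of input length.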
How to Use
1. Select Your Engine
For Maximum Variety: Choose Piper TTS
- 904 voices across 50+ languages
- Select quality level (High/Medium/Low)
- Pick language and accent
For Best Quality: Choose Kokoro TTS
- 21 expressive voices
- Studio-quality 24kHz audio
- Perfect for audiobooks & narration
For Speed: Choose Kitten TTS
- 8 fast voices
- Lightweight model (24MB)
- Quick generation
2. Configure Voice
Piper Options:
- Quality: High (22kHz) / Medium (16kHz) / Low (Fast)
- Languages: English (US/GB), Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, + 40 more!
- Top Voices: Lessac, Ryan (US) | Cori, Alan (GB)
Kokoro Options:
- American: Bella, Nicole, Sarah, Sky, Adam, Michael
- British: Emma, Isabella, George, Lewis
Kitten Options:
- 8 voices (Voice 0-7) with different characteristics
3. Enter Text & Generate
- Type or paste your text (unlimited length)
- Adjust speed if needed (0.5x - 2.0x)
- Click "Generate Speech"
- Wait for generation (watch progress bar)
- Play audio or download as WAV
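For readers curious about the WAV download step, here is a minimal sketch of how mono floating-point samples can be packed into a 16-bit PCM WAV file in plain JavaScript. The function name `encodeWav` is hypothetical, and the assumption that the engines return mono `Float32Array` samples in the range -1 to 1 is mine; the Space may produce its files differently.

```javascript
// Hypothetical helper: encode mono float samples (-1..1) as 16-bit PCM WAV.
function encodeWav(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte header + audio data
  const view = new DataView(buffer);

  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);                 // file size minus 8
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);                           // fmt chunk size
  view.setUint16(20, 1, true);                            // format 1 = PCM
  view.setUint16(22, 1, true);                            // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true);  // byte rate
  view.setUint16(32, bytesPerSample, true);               // block align
  view.setUint16(34, 16, true);                           // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to the valid range, then scale to signed 16-bit.
    const s = Math.max(-1, Math.min(1, samples[i]));
    const value = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
    view.setInt16(44 + i * 2, value, true);
  }
  return buffer;
}
```

In a browser, the returned buffer can be wrapped as `new Blob([buffer], { type: "audio/wav" })` and offered for download through `URL.createObjectURL`.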
Supported Languages
Piper TTS - 50+ Languages:
Major Languages:
- 🇺🇸 English (US) - 20+ voices
- 🇬🇧 English (UK) - 15+ voices
- 🇪🇸 Spanish - 30+ voices
- 🇫🇷 French - 25+ voices
- 🇩🇪 German - 20+ voices
- 🇮🇹 Italian - 15+ voices
- 🇵🇹 Portuguese - 10+ voices
- 🇨🇳 Chinese - 10+ voices
- 🇯🇵 Japanese - 5+ voices
- 🇰🇷 Korean - 5+ voices
Plus: Dutch, Russian, Polish, Turkish, Arabic, Hindi, Vietnamese, Thai, and many more!
Engine Comparison
| Feature | Piper | Kokoro | Kitten |
|---|---|---|---|
| Voices | 904 | 21 | 8 |
| Quality | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| Speed | Medium | Medium | Fast |
| Model Size | ~50MB | ~80MB | ~24MB |
| Languages | 50+ | English | English |
| Sample Rate | 16-22kHz | 24kHz | 16kHz |
| Best For | Variety | Quality | Speed |
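The "Languages" and "Best For" rows of the table can be turned into a small selection rule. The sketch below is illustrative only: `ENGINES` and `pickEngine` are hypothetical names, and the rule (Piper for any non-English language, otherwise match the stated priority) is my reading of the table rather than logic taken from the app.

```javascript
// Data copied from the comparison table above.
const ENGINES = [
  { name: "Piper", multilingual: true, bestFor: "variety" },
  { name: "Kokoro", multilingual: false, bestFor: "quality" },
  { name: "Kitten", multilingual: false, bestFor: "speed" },
];

// Hypothetical helper: choose an engine from a language code and a priority
// ("variety", "quality", or "speed").
function pickEngine({ language = "en", priority = "variety" } = {}) {
  const isEnglish = language.toLowerCase().startsWith("en");
  // Only Piper is listed as supporting languages other than English.
  const candidates = ENGINES.filter((e) => isEnglish || e.multilingual);
  const match = candidates.find((e) => e.bestFor === priority);
  // Piper is always a candidate, so it serves as the fallback.
  return (match || candidates[0]).name;
}
```

For example, `pickEngine({ language: "en_US", priority: "speed" })` selects Kitten, while any Spanish request falls back to Piper regardless of priority.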
Use Cases
Content Creation
- Video voiceovers & narration
- Audiobook production
- Podcast intros/outros
- YouTube tutorials
Accessibility
- Screen reader alternatives
- Reading assistance
- Language learning
- Audio content for visually impaired users
Development
- Voice UI prototyping
- Game character voices
- IVR system testing
- Chatbot voice responses
Technical Details
Technology Stack
- Frontend: Pure HTML5 + JavaScript (ES6+)
- TTS Library: onnx-tts-web
- Runtime: ONNX Runtime Web
- Acceleration: WebGPU / WebAssembly
- Audio: Web Audio API
Model Sources
- Piper: rhasspy/piper-voices
- Kokoro: therealtimex/kokoro-tts-web
- Kitten: therealtimex/kitten-tts-web
Browser Requirements
- Minimum: Chrome 90+ / Firefox 88+ / Safari 14+ / Edge 90+
- Recommended: Latest Chrome/Edge with WebGPU enabled
- Features Required: WebAssembly, Web Audio API
- Optional: WebGPU for acceleration
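The requirements above amount to a simple fallback chain: use WebGPU when the browser exposes it and an adapter is available, otherwise WebAssembly. The sketch below shows one way to check this with standard browser APIs (`navigator.gpu.requestAdapter()` and the global `WebAssembly` object). The function names and the returned labels are hypothetical, and the app's own detection logic may differ.

```javascript
// True when the WebAssembly API is present on the given global object.
function supportsWasm(globalObj = globalThis) {
  return (
    typeof globalObj.WebAssembly === "object" &&
    typeof globalObj.WebAssembly.instantiate === "function"
  );
}

// True when a navigator-like object exposes the WebGPU entry point.
function hasWebGpuApi(nav) {
  return Boolean(nav && "gpu" in nav && nav.gpu);
}

// Hypothetical helper: resolve to "WebGPU", "WASM", or "unsupported".
async function detectBackend(nav = globalThis.navigator) {
  if (hasWebGpuApi(nav)) {
    try {
      // requestAdapter() resolves to null when no suitable GPU is found.
      const adapter = await nav.gpu.requestAdapter();
      if (adapter) return "WebGPU";
    } catch {
      // Fall through to WebAssembly on any WebGPU error.
    }
  }
  return supportsWasm() ? "WASM" : "unsupported";
}
```

Calling `detectBackend().then(console.log)` in the browser console gives a quick indication of which path a given machine would take under these assumptions.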
Performance
- Model Loading: 5-15 seconds (first time only, then cached)
- Generation Speed: 2-5 seconds per 200 characters
- Real-time Factor: 3-10x (depending on hardware & engine)
- Memory Usage: ~200-500MB (with models loaded)
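To check these figures on your own hardware, the speed multiple can be computed from the number of generated samples, the sample rate, and the elapsed wall-clock time. This sketch assumes the "3-10x" figure means seconds of audio produced per second of processing (some literature defines real-time factor as the inverse); `realtimeSpeed` is a hypothetical name.

```javascript
// Hypothetical helper: seconds of audio produced per second of processing.
function realtimeSpeed(numSamples, sampleRate, elapsedMs) {
  if (sampleRate <= 0 || elapsedMs <= 0) {
    throw new RangeError("sampleRate and elapsedMs must be positive");
  }
  const audioSeconds = numSamples / sampleRate;
  return audioSeconds / (elapsedMs / 1000);
}

// Example: 10 seconds of 24kHz audio generated in 2 seconds of wall time.
const speed = realtimeSpeed(240000, 24000, 2000); // 5
```

In the browser, `elapsedMs` can be measured by taking `performance.now()` before and after the generation call.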
Performance Tips
For Best Quality:
- Use Kokoro TTS for English content
- Select High Quality in Piper settings
- Use well-punctuated text
- Keep sentences moderate length
For Best Speed:
- Use Kitten TTS for quick tasks
- Select Low Quality in Piper
- Enable WebGPU in browser settings
- Use shorter text inputs
For Most Options:
- Use Piper TTS for language variety
- Explore different accents/regions
- Compare quality levels
- Try multiple voices for same language
Quick Start Examples
Example 1: Professional Audiobook
Engine: Kokoro TTS
Voice: Bella (American Female)
Speed: 0.95x
Quality: 24kHz
Text: Your book chapter...
Example 2: Tutorial Narration
Engine: Piper TTS
Voice: Lessac (US, High Quality)
Speed: 1.0x
Quality: 22kHz
Text: Your tutorial script...
Example 3: Quick Announcement
Engine: Kitten TTS
Voice: Voice 4 (Clear)
Speed: 1.1x
Text: Your announcement...
Example 4: Spanish Content
Engine: Piper TTS
Voice: es_ES (Spain Spanish)
Speed: 1.0x
Quality: High
Text: Su texto en español...
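If you script against these settings, the four examples can be expressed as reusable presets. The sketch below is an illustration under assumptions: the object shape, `PRESETS`, `clampSpeed`, and `buildRequest` are hypothetical names, not an API exposed by this Space. The speed limits come from the 0.5x - 2.0x range given in the usage steps.

```javascript
const SPEED_MIN = 0.5;
const SPEED_MAX = 2.0;

// Values copied from the four quick start examples.
const PRESETS = {
  audiobook: { engine: "Kokoro", voice: "Bella", speed: 0.95 },
  tutorial: { engine: "Piper", voice: "Lessac", speed: 1.0 },
  announcement: { engine: "Kitten", voice: "Voice 4", speed: 1.1 },
  spanish: { engine: "Piper", voice: "es_ES", speed: 1.0 },
};

// Keep speed inside the supported range; non-numeric input falls back to 1.0.
function clampSpeed(speed) {
  if (!Number.isFinite(speed)) return 1.0;
  return Math.min(SPEED_MAX, Math.max(SPEED_MIN, speed));
}

// Hypothetical helper: merge a preset, optional overrides, and the text.
function buildRequest(presetName, text, overrides = {}) {
  const preset = PRESETS[presetName];
  if (!preset) throw new Error(`Unknown preset: ${presetName}`);
  const request = { ...preset, ...overrides, text };
  request.speed = clampSpeed(request.speed);
  return request;
}
```

For instance, `buildRequest("tutorial", "Your tutorial script...", { speed: 3 })` yields a Piper request whose speed is capped at 2.0.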
Troubleshooting
Model Loading Issues
Problem: "ERROR initializing" message
Solutions:
- Check internet connection
- Wait for download to complete
- Try different quality level
- Clear browser cache
- Refresh page
No Audio Output
Problem: Player appears but no sound
Solutions:
- Check browser audio permissions
- Verify volume settings
- Try different voice/engine
- Check browser console (F12)
- Test with different browser
Slow Performance
Problem: Generation takes too long
Solutions:
- Switch to Kitten TTS for speed
- Lower quality in Piper settings
- Enable WebGPU (chrome://flags)
- Update browser to latest version
- Close other tabs/applications
WebGPU Not Available
Problem: Shows "WASM" instead of "WebGPU"
Solutions:
- Update browser to latest version
- Enable "WebGPU" in chrome://flags
- Check GPU driver updates
- WebGPU is optional; WASM works fine
Voice Recommendations
English (US) - Natural:
- Lessac (Piper) - Professional, clear
- Ryan (Piper) - Authoritative, deep
- Bella (Kokoro) - Elegant, sophisticated
English (GB) - British:
- Cori (Piper) - Refined, professional
- Emma (Kokoro) - Elegant, polished
- George (Kokoro) - Commanding, distinguished
Spanish:
- es_ES (Piper) - Spain Spanish, multiple voices
- es_MX (Piper) - Mexican Spanish
French:
- fr_FR (Piper) - France French, multiple voices
German:
- de_DE (Piper) - German, multiple voices
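Voices in the rhasspy/piper-voices collection are generally named in the form `locale-name-quality`, for example `en_US-lessac-medium`. Assuming that convention, the sketch below splits such a key into its parts so voices can be grouped by locale or quality. The function name `parsePiperVoiceKey` is hypothetical, and whether this Space exposes voice keys in exactly this form is not confirmed by this README.

```javascript
// Hypothetical helper: parse a Piper-style voice key such as
// "en_US-lessac-medium" into its components, or return null if it
// does not follow the assumed locale-name-quality pattern.
function parsePiperVoiceKey(key) {
  const pattern = /^([a-z]{2,3})_([A-Z]{2})-([A-Za-z0-9_]+)-(x_low|low|medium|high)$/;
  const match = pattern.exec(key);
  if (!match) return null;
  const [, language, region, name, quality] = match;
  return { language, region, locale: `${language}_${region}`, name, quality };
}
```

With this helper, `parsePiperVoiceKey("es_MX-...")`-style keys from a voice list could be filtered by `locale` to build the per-language menus described above; the elided voice name here is left unspecified on purpose.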
Privacy & Security
- 100% Client-Side - All processing in your browser
- No Server Upload - Text never leaves your device
- No Data Collection - Zero analytics or tracking
- No Account Required - Use instantly, no signup
- Offline Capable - Works without internet (after cache)
License & Credits
License
This project is released under the Apache 2.0 License.
Credits & Acknowledgments
Libraries & Tools:
- onnx-tts-web by @therealtimex
- Piper TTS by Rhasspy
- ONNX Runtime by Microsoft
Models:
- Piper TTS models by Rhasspy team
- Kokoro TTS by community contributors
- Kitten TTS by community contributors
Inspiration:
- TTS Studio by @clowerweb
Future Enhancements
Planned features:
- More TTS engines (Coqui, VITS)
- Voice cloning with SpeechT5
- SSML markup support
- Batch processing
- MP3/OGG export
- Voice mixing/blending
- Real-time streaming
- Pronunciation dictionary
Contributing
Found a bug or have a suggestion? Please open an issue or submit a pull request!
Star This Space!
If you find this useful, please give it a ⭐ star on Hugging Face!
Made with ❤️ for the open-source community
Enjoy creating amazing voice content!