Skip to content

TTS Providers

Kiwi supports five text-to-speech providers. Switch between them in config.yaml or via environment variable.

Comparison

Provider Quality Latency Cost Local GPU Languages
ElevenLabs Excellent ~0.3s ~$0.30/1K chars No 29
Qwen3-TTS (local) Excellent ~1–3s Free Yes (CUDA) Many
Qwen3-TTS (RunPod) Excellent ~2–5s ~$0.0003/sec No Many
Kokoro ONNX High <0.5s Free No 8
Piper Good <0.5s Free No 30+

Kokoro ONNX

FREE RECOMMENDED

Fully local TTS with 14 voices at 24kHz. Models auto-download on first use (~340MB). No GPU needed.

tts:
  provider: "kokoro"
  kokoro:
    voice: "af_heart"
    speed: 1.0

Supports English, Japanese, Chinese, Korean, and several European languages. Russian is not yet supported — use Piper or ElevenLabs for Russian.

Piper

FREE

Fast local TTS using ONNX models. Wide language support including Russian.

tts:
  provider: "piper"
  piper:
    model: "en_US-lessac-medium"

Models are downloaded automatically. See the Piper voices list for available models.

ElevenLabs

PAID

Cloud-based TTS with the lowest latency and highest voice quality. Requires an API key.

tts:
  provider: "elevenlabs"
  elevenlabs:
    voice_id: "aEO01A4wXwd1O8GPgGlF"
    model_id: "eleven_multilingual_v2"
    stability: 0.45
    similarity_boost: 0.75
    speed: 1.0
# .env
KIWI_ELEVENLABS_API_KEY=sk-...

Qwen3-TTS (Local)

FREE GPU REQUIRED

High-quality local TTS using the Qwen3-TTS model. Requires a CUDA GPU with sufficient VRAM.

tts:
  provider: "qwen3"
  qwen3:
    backend: "local"

Qwen3-TTS (RunPod)

PAY-PER-USE

Same Qwen3-TTS quality, but running on RunPod serverless GPUs. No local GPU needed.

tts:
  provider: "qwen3"
  qwen3:
    backend: "runpod"
# .env
RUNPOD_API_KEY=...
RUNPOD_TTS_ENDPOINT_ID=...

Switching Providers

Via config.yaml

tts:
  provider: "kokoro"   # kokoro, piper, qwen3, elevenlabs

Via environment variable

KIWI_TTS_PROVIDER=kokoro python -m kiwi

Via REST API

Test TTS without changing config:

curl -X POST http://localhost:7789/api/tts/test \
  -d '{"text": "Hello, I am Kiwi!"}'