Installation

Requirements

Python 3.10+
FFmpeg — for audio processing
OpenClaw running locally
GPU with CUDA recommended for STT and local TTS, but not required
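
The first two requirements can be checked with a few lines of Python before installing anything. This is a minimal sketch, not part of Kiwi itself: the function name `check_requirements` is hypothetical, and it only covers the Python version and the presence of `ffmpeg` on PATH. It does not probe OpenClaw or CUDA, since this page does not say how those are detected.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # taken from the requirements list above


def check_requirements(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration only, not part of Kiwi.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} found, 3.10+ required"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    for issue in issues:
        print("MISSING:", issue)
    print("OK" if not issues else f"{len(issues)} problem(s) found")
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing the real interpreter or PATH.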

Clone & Install

Linux / macOS:

git clone https://github.com/ekleziast/kiwi-voice.git
cd kiwi-voice

python -m venv venv
source venv/bin/activate

pip install -r requirements.txt

Windows:

git clone https://github.com/ekleziast/kiwi-voice.git
cd kiwi-voice

python -m venv venv
venv\Scripts\activate

pip install -r requirements.txt
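
On either platform, it can help to confirm that the virtual environment is really active, so that `pip` installs into `venv` rather than the system interpreter. A minimal sketch, relying on the standard behaviour of the `venv` module (inside a venv, `sys.prefix` differs from `sys.base_prefix`); the helper name `in_virtualenv` is hypothetical:

```python
import sys


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True when the running interpreter belongs to a venv.

    Inside an environment created by `python -m venv`, sys.prefix points
    at the environment while sys.base_prefix points at the base install.
    """
    return prefix != base_prefix


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "NOT active"
    print(f"virtual environment is {state}: {sys.prefix}")
```

Run it with the same `python` command used for the install; if it reports "NOT active", repeat the activation step before running `pip install`.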

FFmpeg

Linux:

sudo apt install ffmpeg

macOS:

brew install ffmpeg

Windows:

Download from ffmpeg.org and add to PATH, or set KIWI_FFMPEG_PATH in .env.
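
The two options for locating FFmpeg (PATH or `KIWI_FFMPEG_PATH`) can be illustrated with a small lookup function. This is a sketch under assumptions, not Kiwi's actual resolution logic: it assumes an explicit `KIWI_FFMPEG_PATH` takes priority over PATH and that the variable has already been loaded into the process environment. The name `resolve_ffmpeg` is hypothetical.

```python
import os
import shutil


def resolve_ffmpeg(env=None, which=shutil.which):
    """Return a path to ffmpeg, or None if it cannot be located.

    Assumption (not confirmed by the Kiwi docs): a non-empty
    KIWI_FFMPEG_PATH wins over the PATH lookup.
    """
    env = os.environ if env is None else env
    override = env.get("KIWI_FFMPEG_PATH", "").strip()
    if override:
        return override
    # Fall back to searching PATH for an executable named "ffmpeg".
    return which("ffmpeg")


if __name__ == "__main__":
    print(resolve_ffmpeg() or "ffmpeg not found; install it or set KIWI_FFMPEG_PATH")
```

If this prints a path, running that binary with `-version` is a quick way to confirm the install works.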

Environment File

cp .env.example .env

Edit .env with your API keys. All keys are optional — Kiwi works with free local providers out of the box:

# Optional: ElevenLabs for cloud TTS
KIWI_ELEVENLABS_API_KEY=sk-...

# Optional: Telegram bot for voice security approvals
KIWI_TELEGRAM_BOT_TOKEN=123456:ABC...
KIWI_TELEGRAM_CHAT_ID=123456789

# Optional: RunPod for Qwen3-TTS serverless
RUNPOD_API_KEY=...
RUNPOD_TTS_ENDPOINT_ID=...

Zero-cost setup

Kiwi runs fully locally and for free with Kokoro ONNX (TTS), Faster Whisper (STT), and the text wake word engine. No API keys are needed.
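
To see which optional integrations a given `.env` file would enable, the file can be read with a few lines of Python. This is an illustrative sketch only: the key names are copied from the example above, but the grouping labels, the helper names `parse_env` and `enabled_integrations`, and the simple `KEY=VALUE` parsing rules are assumptions rather than Kiwi's own loader.

```python
from pathlib import Path

# Key names come from the .env example; the labels are illustrative.
OPTIONAL_KEYS = {
    "ElevenLabs TTS": ["KIWI_ELEVENLABS_API_KEY"],
    "Telegram approvals": ["KIWI_TELEGRAM_BOT_TOKEN", "KIWI_TELEGRAM_CHAT_ID"],
    "RunPod Qwen3-TTS": ["RUNPOD_API_KEY", "RUNPOD_TTS_ENDPOINT_ID"],
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def enabled_integrations(values):
    """Return the labels whose keys are all present and non-empty."""
    return [
        label
        for label, keys in OPTIONAL_KEYS.items()
        if all(values.get(k) for k in keys)
    ]


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text(encoding="utf-8") if env_file.exists() else ""
    print(enabled_integrations(parse_env(text)) or "local providers only")
```

Note that placeholder values left over from `.env.example` count as "set" here, so an integration listed by this script is not necessarily configured with a valid key.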

Next Steps

Configuration — customize config.yaml
First Run — start Kiwi and register your voice