Getting Started - travisvn/chatterbox-tts-api GitHub Wiki
Welcome to Chatterbox TTS API!
Follow these steps to install, configure, and launch your own local, OpenAI-compatible text-to-speech API.
# Clone the repository
git clone https://github.com/travisvn/chatterbox-tts-api
cd chatterbox-tts-api
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies with uv (automatically creates venv)
uv sync
# Copy and customize environment variables
cp .env.example .env
# Start the API (a FastAPI app served by uvicorn)
uv run uvicorn app.main:app --host 0.0.0.0 --port 4123
# Or use the main script
uv run main.py
💡 Why uv? Users report better compatibility with chatterbox-tts, 25-40% faster installs, and superior dependency resolution. See the migration guide.
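Once the server has been started, it can be useful to wait until it answers before sending requests. The following is a minimal sketch rather than part of the project: `wait_for_api` is a hypothetical helper name, and the only thing it assumes is that the API listens on port 4123, as in the commands above. It treats any HTTP response as "up" instead of relying on a particular health endpoint.

```shell
# wait_for_api URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL until curl receives any HTTP response.
# Returns 0 once the server answers, 1 if it never does.
wait_for_api() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Without -f, curl exits 0 for any HTTP status, so even a 404 counts as "up".
    if curl -s -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (assumes the API was started on port 4123 as shown above):
# wait_for_api http://localhost:4123/ && echo "API is reachable"
```

The defaults of 30 attempts and a 2-second delay are arbitrary; adjust them to how long startup takes on your machine.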
# Alternative: pip-based install (instead of uv)
# Clone the repository
git clone https://github.com/travisvn/chatterbox-tts-api
cd chatterbox-tts-api
# Set up the environment (using Python 3.11)
python -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Copy and customize environment variables
cp .env.example .env
# Add your voice sample (or use the provided one)
# cp your-voice.mp3 voice-sample.mp3
# Start the API (a FastAPI app served by uvicorn)
uvicorn app.main:app --host 0.0.0.0 --port 4123
# Or use the main script
python main.py
Ran into issues? Check the troubleshooting section.
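The pip-based setup above mentions Python 3.11, so a quick version check before creating the virtual environment may save a failed install. The sketch below is illustrative only: `python_is_version` is a hypothetical helper, and treating 3.11 as the expected version is an assumption drawn from the comment in the setup steps, not a documented requirement.

```shell
# python_is_version VERSION_OUTPUT WANTED
# Hypothetical helper: succeeds when the output of `python --version`
# (for example "Python 3.11.9") matches the wanted major.minor (for example "3.11").
python_is_version() {
  case "$1" in
    "Python $2"|"Python $2."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: warn before running `python -m venv .venv`.
# python_is_version "$(python --version 2>&1)" 3.11 \
#   || echo "Warning: expected Python 3.11, found $(python --version 2>&1)" >&2
```

Matching on the `major.minor` prefix followed by a dot keeps "3.1" from being mistaken for "3.11".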
# Clone and start with Docker Compose
git clone https://github.com/travisvn/chatterbox-tts-api
cd chatterbox-tts-api
# Use Docker-optimized environment variables
cp .env.example.docker .env # Docker-specific paths, ready to use
# Or: cp .env.example .env # Local development paths, needs customization
# Choose your deployment method:
# API Only (default)
docker compose -f docker/docker-compose.yml up -d # Standard (pip-based)
docker compose -f docker/docker-compose.uv.yml up -d # uv-optimized (faster builds)
docker compose -f docker/docker-compose.gpu.yml up -d # Standard + GPU
docker compose -f docker/docker-compose.uv.gpu.yml up -d # uv + GPU (recommended for GPU users)
docker compose -f docker/docker-compose.cpu.yml up -d # CPU-only
# API + Frontend (add --profile frontend to any of the above)
docker compose -f docker/docker-compose.yml --profile frontend up -d # Standard + Frontend
docker compose -f docker/docker-compose.gpu.yml --profile frontend up -d # GPU + Frontend
docker compose -f docker/docker-compose.uv.gpu.yml --profile frontend up -d # uv + GPU + Frontend
# Watch the logs as it initializes (the first use of TTS takes the longest)
docker logs chatterbox-tts-api -f
# Test the API
curl -X POST http://localhost:4123/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"input": "Hello from Chatterbox TTS!"}' \
--output test.wav
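If the test request fails, curl can still write the server's error body to `test.wav`, so the file existing does not prove that audio came back. As a rough sanity check, the sketch below inspects the file header. `is_wav` is a hypothetical helper, and it assumes the API returns WAV audio, as the output file name in the example suggests.

```shell
# is_wav FILE
# Succeeds when FILE starts with a RIFF/WAVE header:
# bytes 0-3 are "RIFF" and bytes 8-11 are "WAVE".
is_wav() {
  [ -f "$1" ] || return 1
  riff=$(head -c 4 "$1")
  wave=$(head -c 12 "$1" | tail -c 4)
  [ "$riff" = "RIFF" ] && [ "$wave" = "WAVE" ]
}

# Usage after running the curl command above:
# is_wav test.wav && echo "Got WAV audio" || echo "Response is not a WAV file" >&2
```

Adding curl's `--fail` flag to the request is another option: it makes curl exit with a non-zero status on HTTP error responses.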
🚀 Running with the Web UI (Full Stack)
This project includes an optional React-based web UI. Use Docker Compose profiles to easily opt in or out of the frontend:
# API only (default behavior)
docker compose -f docker/docker-compose.yml up -d
# API + frontend web UI (with --profile frontend)
docker compose -f docker/docker-compose.yml --profile frontend up -d
# Or use the convenient helper script for fullstack:
python start.py fullstack
# Same pattern works with all deployment variants:
docker compose -f docker/docker-compose.gpu.yml --profile frontend up -d # GPU + Frontend
docker compose -f docker/docker-compose.uv.yml --profile frontend up -d # uv + Frontend
docker compose -f docker/docker-compose.cpu.yml --profile frontend up -d # CPU + Frontend
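The compose files and the `--profile frontend` flag combine in a regular pattern, which a small wrapper can capture. This is a sketch, not a script shipped with the project: `compose_cmd` and its variant names (`standard`, `uv`, `gpu`, `uv-gpu`, `cpu`) are hypothetical, while the file paths are the ones listed above. It only prints the command so that you can review it before running it.

```shell
# compose_cmd VARIANT [frontend]
# Hypothetical helper: prints the docker compose command for a deployment variant.
# VARIANT is one of: standard, uv, gpu, uv-gpu, cpu.
compose_cmd() {
  case "$1" in
    standard) file=docker/docker-compose.yml ;;
    uv)       file=docker/docker-compose.uv.yml ;;
    gpu)      file=docker/docker-compose.gpu.yml ;;
    uv-gpu)   file=docker/docker-compose.uv.gpu.yml ;;
    cpu)      file=docker/docker-compose.cpu.yml ;;
    *) echo "unknown variant: $1" >&2; return 1 ;;
  esac
  # The optional second argument opts in to the frontend profile.
  if [ "${2:-}" = "frontend" ]; then
    echo "docker compose -f $file --profile frontend up -d"
  else
    echo "docker compose -f $file up -d"
  fi
}

# Usage:
# compose_cmd gpu frontend      # print the command for review
# eval "$(compose_cmd cpu)"     # or run it directly
```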