Kokoro-FastAPI
You can run the Kokoro TTS API server directly with Docker.
Warning: For Kokoro issues and support, use the upstream repository: remsky/Kokoro-FastAPI.
Provider
- Provider: Custom OpenAI-Like
- Typical model: Kokoro
- API_BASE: required (typically your Kokoro URL ending with /v1)
- API_KEY: set only if your deployment requires one
Run Kokoro (CPU)
docker run --name kokoro-tts \
--restart unless-stopped \
-d \
-p 8880:8880 \
-e ONNX_NUM_THREADS=8 \
-e ONNX_INTER_OP_THREADS=4 \
-e ONNX_EXECUTION_MODE=parallel \
-e ONNX_OPTIMIZATION_LEVEL=all \
-e ONNX_MEMORY_PATTERN=true \
-e ONNX_ARENA_EXTEND_STRATEGY=kNextPowerOfTwo \
-e API_LOG_LEVEL=DEBUG \
ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4
Run Kokoro (GPU)
docker run --name kokoro-tts \
--restart unless-stopped \
-d \
--gpus all \
--user 1001:1001 \
-p 8880:8880 \
-e USE_GPU=true \
-e PYTHONUNBUFFERED=1 \
-e API_LOG_LEVEL=DEBUG \
ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.4
OpenReader setup
- Start Kokoro using either the CPU or GPU image.
- In OpenReader Settings, choose provider Custom OpenAI-Like.
- Set API_BASE to your Kokoro endpoint (for Docker Compose, commonly http://kokoro-tts:8880/v1).
- Set API_KEY only if your deployment requires one.
- Choose model Kokoro.
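Before pointing OpenReader at the server, it can help to confirm that the API_BASE value answers a speech request. The following is a minimal sketch, not an official client. It assumes the server exposes the OpenAI-style /audio/speech route under /v1, accepts the model name kokoro, and ships a voice called af_bella; none of these are stated in this guide, so check them against the upstream repository. The helper names kokoro_speech_url and kokoro_say are hypothetical.

```shell
#!/bin/sh
# Hypothetical smoke-test helpers; the function names are illustrative only.

# Build the speech endpoint URL from API_BASE, tolerating one trailing slash.
kokoro_speech_url() {
  printf '%s/audio/speech\n' "${1%/}"
}

# Request a short clip and save it to a file.
# Assumptions: OpenAI-style request body, model "kokoro", voice "af_bella".
kokoro_say() {
  api_base=$1
  out_file=${2:-kokoro-test.mp3}
  # Arguments every request needs (curl sends POST when -d is given).
  set -- --fail --silent --show-error \
    -H 'Content-Type: application/json' \
    -d '{"model": "kokoro", "input": "Hello from Kokoro.", "voice": "af_bella", "response_format": "mp3"}' \
    -o "$out_file"
  # Send the key only when the deployment requires one.
  if [ -n "${API_KEY:-}" ]; then
    set -- "$@" -H "Authorization: Bearer $API_KEY"
  fi
  curl "$@" "$(kokoro_speech_url "$api_base")"
}
```

With one of the containers above running, `kokoro_say http://localhost:8880/v1` should write kokoro-test.mp3 if those assumptions hold. A non-zero exit status from curl suggests that the URL, voice name, or key needs another look.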
Notes
Runtime guidance
GPU mode requires an NVIDIA GPU and NVIDIA Docker support (needed for the --gpus all flag). CPU mode is a good default on Apple Silicon and modern x86 CPUs.
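The choice between the two images can be scripted. The sketch below is one possible approach under stated assumptions: it treats a working nvidia-smi command as a rough sign that an NVIDIA driver is present, which does not by itself prove that Docker can reach the GPU. The helper names kokoro_image and kokoro_detect_mode are hypothetical; the image tags are the ones used in the run commands above.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative only.

# Map a mode ("cpu" or "gpu") to the image tag used in this guide.
kokoro_image() {
  case "$1" in
    gpu) echo 'ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.4' ;;
    cpu) echo 'ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4' ;;
    *)   echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Rough heuristic (an assumption): pick "gpu" only if nvidia-smi exists and runs.
kokoro_detect_mode() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo gpu
  else
    echo cpu
  fi
}
```

Running `kokoro_image "$(kokoro_detect_mode)"` prints the image tag to use in place of the last line of the matching docker run command. The GPU command also needs --gpus all and -e USE_GPU=true, as shown earlier.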