Version: v4.4.0

TTS Providers

OpenReader routes all TTS requests through the Next.js server. Provider credentials are always admin-managed and never accepted from the browser.

Admin-managed shared providers (Settings > Admin > Shared providers): DB-backed instances configured by an admin and visible to all users. Keys are encrypted at rest and never exposed to the client. Available only when auth is enabled and your account is in ADMIN_EMAILS. See Admin Panel.

Per-user Settings modal (Settings > TTS Provider): users may select an enabled shared provider, model, instructions, and voice. Credentials remain server-side.

Environment variables: API_KEY and API_BASE exist as a one-shot first-boot seed that auto-creates a default-openai admin shared provider. After the first boot they are no longer read by the running app.

Providers

  • OpenAI: Cloud. Base URL pre-filled (https://api.openai.com/v1). API key required.
  • Replicate: Cloud. Base URL managed internally by OpenReader. API key required.
  • DeepInfra: Cloud. Base URL pre-filled (https://api.deepinfra.com/v1/openai). API key required.
  • Speech SDK: Cloud. Reaches additional providers (ElevenLabs, Cartesia, Hume, Deepgram, Google, Inworld, and more) directly with your own provider API keys via speech-sdk. No base URL. API key required (the key for the model's provider).
  • Custom OpenAI-Like: Self-hosted or any custom endpoint. Base URL must be set manually (typically ending in /v1). API key optional.

Admins configure the required fields for each provider type on its shared-provider row (e.g. API key and, where applicable, base URL).

Built-in model catalogs

  • Replicate models: alphanumericuser/kokoro-82m, google/gemini-3.1-flash-tts, minimax/speech-2.8-turbo, qwen/qwen3-tts, inworld/tts-1.5-mini (or choose Other and enter any Replicate model ID, such as owner/model or owner/model:version)
  • OpenAI models: tts-1, tts-1-hd, gpt-4o-mini-tts
  • DeepInfra models: includes hexgrad/Kokoro-82M and additional hosted models (depending on API key / feature flags)
  • Speech SDK models: openai/gpt-4o-mini-tts, elevenlabs/eleven_multilingual_v2, cartesia/sonic-3.5, deepgram/aura-2, google/gemini-2.5-flash-preview-tts, inworld/inworld-tts-1.5-max (or choose Other and enter any provider/model the SDK supports)
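The ID shapes named in the catalogs above (owner/model, owner/model:version, and provider/model) can be split with a small helper. The following is a minimal sketch only: parseModelId and the ModelId interface are hypothetical names for illustration, not OpenReader functions, and the accepted character set is an assumption.

```typescript
// Hypothetical helper, not part of OpenReader: splits an ID of the form
// "owner/model" or "owner/model:version" into its parts.
interface ModelId {
  owner: string; // Replicate owner, or the provider prefix of a Speech SDK ID
  model: string;
  version?: string; // present only for "owner/model:version"
}

function parseModelId(id: string): ModelId | null {
  // Assumption: no part contains "/", ":" or whitespace.
  const match = /^([^/:\s]+)\/([^/:\s]+)(?::([^/:\s]+))?$/.exec(id.trim());
  if (!match) return null; // e.g. "tts-1" has no owner/provider prefix
  const [, owner, model, version] = match;
  return version ? { owner, model, version } : { owner, model };
}
```

Under these assumptions, parseModelId("hexgrad/Kokoro-82M") yields an owner and a model with no version, while a bare OpenAI model name such as tts-1 returns null because it carries no prefix.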

Custom provider requirements

Self-hosted or custom providers only need an OpenAI-compatible speech endpoint:

  • POST /v1/audio/speech is required.
  • Voice listing is optional and auto-discovered: OpenReader probes /v1/audio/voices, /v1/voices, then /v1/styles. If none respond, it falls back to default voices — the Kokoro set for Kokoro models, otherwise the standard OpenAI voices (alloy, echo, fable, onyx, nova, shimmer).
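The probe order described above can be sketched as follows. This is an illustration, not OpenReader's implementation: discoverVoices, extractVoices, and FetchLike are hypothetical names, the response shapes (a bare string array or an object with a voices array) are assumptions, and the Kokoro fallback is left out because the Kokoro voice set is not listed here. The origin argument is assumed to be the server root without the /v1 suffix.

```typescript
// Assumption: a minimal fetch-like signature so the sketch can be exercised
// without a network. In real use, pass the global `fetch`.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Probe order taken from the text above.
const VOICE_PATHS = ["/v1/audio/voices", "/v1/voices", "/v1/styles"];
// Standard OpenAI voices, used when no endpoint responds.
const OPENAI_FALLBACK = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"];

// Assumed response shapes: string[] or { voices: string[] }.
function extractVoices(body: unknown): string[] {
  const list = Array.isArray(body) ? body : (body as { voices?: unknown } | null)?.voices;
  return Array.isArray(list) ? list.filter((v): v is string => typeof v === "string") : [];
}

async function discoverVoices(origin: string, fetchFn: FetchLike): Promise<string[]> {
  const root = origin.replace(/\/+$/, ""); // drop trailing slashes
  for (const path of VOICE_PATHS) {
    try {
      const res = await fetchFn(root + path);
      if (!res.ok) continue; // non-2xx: try the next path
      const voices = extractVoices(await res.json());
      if (voices.length > 0) return voices;
    } catch {
      // Network error or invalid JSON: try the next path.
    }
  }
  return OPENAI_FALLBACK;
}
```

Injecting the fetch function keeps the sketch testable with a stub; calling discoverVoices("http://host.docker.internal:8880", fetch) would probe a real server, where the port is only an example.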

The speech endpoint may return any common audio format — mp3, wav, ogg, or flac. OpenReader detects the format and transcodes non-mp3 audio to mp3 automatically, so your server does not need to honor response_format: mp3. An API key is optional; keyless servers work.
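How OpenReader detects the format is not described here. One common approach is to inspect the leading bytes of the response for well-known container signatures, sketched below. detectAudioFormat and hasAscii are hypothetical names, and the check is deliberately shallow: it looks only at the first few bytes.

```typescript
type AudioFormat = "mp3" | "wav" | "ogg" | "flac" | "unknown";

// True when `bytes` contains the ASCII string `text` starting at `offset`.
function hasAscii(bytes: Uint8Array, offset: number, text: string): boolean {
  if (bytes.length < offset + text.length) return false;
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false;
  }
  return true;
}

function detectAudioFormat(bytes: Uint8Array): AudioFormat {
  // WAV: "RIFF" <4-byte size> "WAVE"
  if (hasAscii(bytes, 0, "RIFF") && hasAscii(bytes, 8, "WAVE")) return "wav";
  // Ogg pages start with the capture pattern "OggS".
  if (hasAscii(bytes, 0, "OggS")) return "ogg";
  // FLAC streams start with the marker "fLaC".
  if (hasAscii(bytes, 0, "fLaC")) return "flac";
  // MP3 files often begin with an ID3v2 tag...
  if (hasAscii(bytes, 0, "ID3")) return "mp3";
  // ...or directly with an MPEG audio frame (11 sync bits set).
  if (bytes.length >= 2 && bytes[0] === 0xff && (bytes[1] & 0xe0) === 0xe0) return "mp3";
  return "unknown";
}
```

A caller would pass the first bytes of the response body, for example new Uint8Array(await res.arrayBuffer()), and transcode only when the result is not "mp3".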

TTS requests are server-side

TTS requests originate from the Next.js server, not the browser. The base URL must be reachable from the server runtime. In Docker, use http://host.docker.internal:<port>/v1 so OpenReader reaches the service's published port on your host. A container name only resolves if OpenReader and the TTS service share a Docker network (Docker Compose, --link, or a shared --network). localhost and 127.0.0.1 will not work, because inside the container they point at the container itself.
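The loopback pitfall can be caught before a base URL is saved. The sketch below is a hypothetical pre-flight check, not an OpenReader API: isLoopbackBaseUrl and toDockerHostUrl are illustrative names, and the rewrite assumes the TTS service publishes its port on the Docker host.

```typescript
// Hypothetical check: does the base URL point at the loopback interface?
function isLoopbackBaseUrl(baseUrl: string): boolean {
  const host = new URL(baseUrl).hostname.toLowerCase();
  // URL.hostname keeps the brackets around IPv6 literals, hence "[::1]".
  return host === "localhost" || host === "[::1]" || /^127\.\d+\.\d+\.\d+$/.test(host);
}

// Rewrite a loopback base URL to host.docker.internal, keeping the scheme,
// port, and path. Non-loopback URLs are returned unchanged.
function toDockerHostUrl(baseUrl: string): string {
  if (!isLoopbackBaseUrl(baseUrl)) return baseUrl;
  const url = new URL(baseUrl);
  url.hostname = "host.docker.internal";
  return url.toString();
}
```

For example, toDockerHostUrl("http://localhost:8880/v1") returns "http://host.docker.internal:8880/v1", while a container-name URL such as http://kokoro:8880/v1 passes through untouched. The port and container name here are examples only.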

Provider guides