
2.2.7 Backend: openedai speech

av edited this page Sep 14, 2024 · 1 revision

Handle: tts
URL: http://localhost:33861/

An OpenAI API-compatible text-to-speech server.

Starting

# [Optional] Pull the tts images
# ahead of starting the service
harbor pull tts

# Spin up Harbor with the TTS instance
harbor up tts

On first start, the service initialises its cache and downloads the necessary models. Both are stored in the tts folder in the Harbor workspace.

Configuration

openedai-speech runs two types of models out of the box: tts-1 (via Piper TTS, very fast, runs on CPU) and tts-1-hd (via xtts_v2 with voice cloning, fast but requires around 4 GB of VRAM).
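As a rough sketch of how either model might be selected from a client, the snippet below posts to the service with curl. It assumes the server exposes the OpenAI-style /v1/audio/speech path under the URL listed at the top of this page and that a voice named alloy is mapped; speech_payload and speak are hypothetical helper names introduced only for illustration, and the payload builder does not escape quotes in the input text.

```shell
# Base URL taken from this page; override TTS_URL if your port differs.
TTS_URL="${TTS_URL:-http://localhost:33861}"

# Hypothetical helper: build the JSON request body.
# Arguments: model, voice, text. Does NOT escape quotes in the text.
speech_payload() {
  printf '{"model":"%s","voice":"%s","input":"%s"}' "$1" "$2" "$3"
}

# Hypothetical helper: request speech and write the audio to a file.
# Arguments: model, voice, text, output file.
# Assumes the OpenAI-style /v1/audio/speech endpoint.
speak() {
  curl -sS "$TTS_URL/v1/audio/speech" \
    -H "Content-Type: application/json" \
    -d "$(speech_payload "$1" "$2" "$3")" \
    -o "$4"
}
```

With the service running, something like speak tts-1 alloy "Hello from Harbor" hello.mp3 would use the Piper-backed model, and passing tts-1-hd instead would route the same request to xtts_v2. The .mp3 extension assumes the server's default response format.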

tts-1

You can map your Piper voices via the ./tts/config/voice_to_speaker.yaml file.

More voices are available for download from the official Piper repo.

tts-1-hd

xtts_v2 provides a voice cloning feature. Given appropriate samples, it can deliver very pleasant, natural-sounding voices. See the guide in the official repo for how to set up voice cloning.

You can find more detailed documentation about openedai-speech configuration in the official repository.
