A server for running Silero TTS models (or other compatible models) behind an OpenTTS-like API. It lets users generate speech with different models through an easy-to-use REST API.
- Generate speech from text using multiple TTS models.
- Support for custom model uploads.
- OpenAPI documentation available.
- Full OpenTTS API compatibility
- 🇩🇪 German (de)
- 🇬🇧 English (en)
- 🇪🇸 Spanish (es)
- 🇫🇷 French (fr)
- 🇮🇳 Indic scripts (indic)
- 🇷🇺 Russian (ru)
- 🇷🇺 Tatar (tt)
- 🇺🇦 Ukrainian (ua)
- 🇺🇿 Uzbek (uz)
- 🇷🇺 Kalmyk (xal)
- 🌐 Other Cyrillic languages
- FastAPI
- swagger-ui
- pydub
- PyTorch
/api/tts
Generate speech.
If the requested model is not available locally, it is downloaded automatically.
/api/languages
Get the list of available languages (local models only).
/api/voices
Get the list of available speakers (local models only).
/upload_model/
Upload custom model. Admin auth required!
/remove_file_cache/
Remove cached files. Admin auth required!
/openapi
OpenAPI docs
Authorization uses a key passed in the HTTP headers: api_key: <64-byte token>
More information is available at /openapi or in swagger.yaml.
AGPLv3
To use Silero models you need to accept Silero's license.
podman build -t silero-rest-api-server:latest .
From local image:
podman run -d --name silero-rest-api-server-container -p 5500:5500 -e SILERO_LICENCE=ACCEPTED silero-rest-api-server
From repository:
podman run -d --name silero-rest-api-server-container -p 5500:5500 -e SILERO_LICENCE=ACCEPTED yaroslavk1231/silero-rest-api-server
API KEY
podman logs silero-rest-api-server-container | grep "API KEY"
python3 -m venv venv
source venv/bin/activate
pip3 install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip3 install --no-cache-dir -r requirements.txt
source venv/bin/activate
SILERO_LICENCE="ACCEPTED" uvicorn main:app --host 0.0.0.0 --port 5500
curl -X 'GET' \
'http://127.0.0.1:5500/api/voices?language=en&locale=en-indic' \
-H 'accept: */*'
curl -X 'GET' \
'http://127.0.0.1:5500/api/languages' \
-H 'accept: */*'
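The two lookups above can also be scripted from Python using only the standard library. This is a minimal sketch: the helper names `build_url` and `fetch_json` are hypothetical, and it assumes these endpoints return JSON bodies, as OpenTTS does; check /openapi for the actual response schema.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Host and port match the curl examples; adjust for your deployment.
BASE_URL = "http://127.0.0.1:5500"


def build_url(path, **params):
    """Return BASE_URL + path, with params percent-encoded as a query string."""
    query = urlencode(params, quote_via=quote)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


def fetch_json(url):
    """GET the URL and decode the body as JSON (assumed response format)."""
    with urlopen(url) as response:
        return json.load(response)
```

With the server running, `fetch_json(build_url("/api/languages"))` and `fetch_json(build_url("/api/voices", language="en", locale="en-indic"))` correspond to the two curl calls.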
curl -X 'GET' 'http://127.0.0.1:5500/api/tts?voice=v3_en%3Aen_64&text=Hello%20World%21' -H 'accept: */*' --output test.wav
Out: Audio file "test.wav"
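For callers that prefer Python over curl, the request can be built with the standard library, which also takes care of percent-encoding the voice and text. This is a sketch, not part of the project: `tts_url` and `synthesize` are hypothetical helper names, and the response body is assumed to be the raw audio file, as in the curl example.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Host and port match the curl example; adjust for your deployment.
BASE_URL = "http://127.0.0.1:5500"


def tts_url(voice, text):
    """Build the /api/tts URL with voice and text percent-encoded."""
    query = urlencode({"voice": voice, "text": text}, quote_via=quote)
    return f"{BASE_URL}/api/tts?{query}"


def synthesize(voice, text, out_path):
    """Request speech for text and write the returned audio bytes to out_path."""
    with urlopen(tts_url(voice, text)) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

Calling `synthesize("v3_en:en_64", "Hello World!", "test.wav")` issues the same request as the curl command.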
curl -X POST "http://localhost:5500/upload_model/" -H "Content-Type: multipart/form-data" -H "api_key:<YOUR API KEY>" -F "file=@v3_en.pt"
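The same upload can be assembled in Python without third-party packages. The sketch below builds the multipart body by hand and attaches the `api_key` header; `build_upload_request` is a hypothetical helper, and the form field name `file` is taken from the curl command. Note that `urllib` sends the header name as `Api_key`, which is equivalent because HTTP header names are case-insensitive.

```python
import uuid
from urllib.request import Request

UPLOAD_URL = "http://localhost:5500/upload_model/"


def build_upload_request(api_key, filename, payload):
    """Build a POST request that uploads payload as the form field "file"."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        payload,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "api_key": api_key,  # admin key printed in the server logs
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return Request(UPLOAD_URL, data=body, headers=headers, method="POST")
```

To send it, read the model file as bytes and pass the result to `urllib.request.urlopen`, for example `urlopen(build_upload_request(key, "v3_en.pt", data))`.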
{
"max_char_length": 600, # Maximum size of a text block. Text longer than this value is split into blocks.
"sample_rate": 48000, # Sampling rate: 48000, 24000, or 8000
"threads_limit": 6, # Maximum number of threads used by models
"min_model_version": 3, # Minimum model version (models below version 3 may cause problems)
"offline_mode": false, # Prevent the server from downloading new models from models.silero.ai/models/tts
"update_voices": true # Save the list of local speakers to speakers.json
}
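To illustrate what `max_char_length` implies, here is one way long text could be split into blocks. This is only a sketch under assumptions, not the server's actual algorithm: the function name `split_text` is hypothetical, and breaking at the last space inside the limit is a choice made for this example.

```python
def split_text(text, max_char_length=600):
    """Split text into blocks of at most max_char_length characters.

    Breaks at the last space inside the limit when one exists;
    otherwise cuts the word at the limit.
    """
    if max_char_length < 1:
        raise ValueError("max_char_length must be positive")
    blocks = []
    remaining = text.strip()
    while len(remaining) > max_char_length:
        # A space at index max_char_length still leaves a block of exactly
        # max_char_length characters, so include that index in the search.
        cut = remaining.rfind(" ", 0, max_char_length + 1)
        if cut <= 0:
            cut = max_char_length  # no space to break at: hard cut
        blocks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        blocks.append(remaining)
    return blocks
```

Each block could then be synthesized separately and the audio concatenated in order.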
List of official SileroTTS models. Can be updated automatically.
{
"model_name.pt": "download_url",
...
}
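The interaction between local models, this name-to-URL list, and `offline_mode` can be sketched as follows. This is an illustration of the documented behavior, not the server's code: `resolve_model`, the `models_dir` argument, and the returned tuples are all hypothetical.

```python
from pathlib import Path


def resolve_model(name, models_dir, known_models, offline_mode=False):
    """Return ("local", path) or ("download", url) for the requested model file.

    known_models is the parsed name-to-URL mapping shown above.
    """
    local_path = Path(models_dir) / name
    if local_path.is_file():
        return ("local", str(local_path))
    if offline_mode:
        # offline_mode forbids fetching models that are not already on disk
        raise FileNotFoundError(
            f"{name} is not available locally and offline_mode is enabled"
        )
    if name not in known_models:
        raise KeyError(f"{name} is not in the list of known models")
    return ("download", known_models[name])
```

A caller would download the file from the returned URL in the second case and load it from disk in the first.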
- main.py - API server and main functions
- admin.py - functions for admin authentication
- config.py - functions for working with config.json
- tts.py - functions for working with models and PyTorch
This project incorporates some code from silero-api-server (MIT).
This app downloads and runs Silero TTS models, which are provided under the CC-BY-NC (or other) license.