Faster Whisper Server

faster-whisper-server is an OpenAI API compatible transcription server which uses faster-whisper as its backend. Features:

GPU and CPU support.
Easily deployable using Docker.
Configurable through environment variables (see config.py).
OpenAI API compatible.

Please create an issue if you find a bug, have a question, or want to suggest a feature.

OpenAI API Compatibility ++

See OpenAI API reference for more information.

Audio file transcription via POST /v1/audio/transcriptions endpoint.
- Unlike OpenAI's API, faster-whisper-server also supports streaming transcriptions (and translations). This is useful when you want to process large audio files and would rather receive the transcription in chunks as they are processed than wait for the whole file to be transcribed. It works similarly to how chat messages are streamed when chatting with LLMs.
Audio file translation via POST /v1/audio/translations endpoint.
(WIP) Live audio transcription via WS /v1/audio/transcriptions endpoint.
- LocalAgreement2 (paper | original implementation) algorithm is used for live transcription.
- Only single-channel, 16000 Hz sample rate, raw, 16-bit little-endian audio is supported.
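
To illustrate the idea behind LocalAgreement2, here is a minimal sketch based on the policy as described in the linked paper: a word is committed only once two consecutive hypotheses agree on it, i.e. it lies in their longest common prefix. The class and function names below are hypothetical and this is not the server's actual implementation; it also assumes each hypothesis is a word list covering all audio received so far.

```python
from typing import List


def longest_common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the leading words shared by both hypotheses."""
    prefix = []
    for word_a, word_b in zip(a, b):
        if word_a != word_b:
            break
        prefix.append(word_a)
    return prefix


class LocalAgreement2:
    """Hypothetical helper: confirm words once two consecutive hypotheses agree."""

    def __init__(self) -> None:
        self.confirmed: List[str] = []  # words already emitted; never revised
        self.previous: List[str] = []   # the hypothesis from the last update

    def update(self, hypothesis: List[str]) -> List[str]:
        """Take the newest hypothesis and return only the newly confirmed words."""
        agreed = longest_common_prefix(self.previous, hypothesis)
        self.previous = hypothesis
        # Anything beyond what was already confirmed is new output.
        newly_confirmed = agreed[len(self.confirmed):]
        self.confirmed.extend(newly_confirmed)
        return newly_confirmed
```

Feeding it a growing sequence of hypotheses yields each word once, and the unstable tail of the latest hypothesis is held back until the next update agrees with it.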

Quick Start

Hugging Face Space

Using Docker

docker run --gpus=all --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface fedirz/faster-whisper-server:latest-cuda
# or
docker run --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface fedirz/faster-whisper-server:latest-cpu

Using Docker Compose

curl -sO https://raw.githubusercontent.com/fedirz/faster-whisper-server/master/compose.yaml
docker compose up --detach faster-whisper-server-cuda
# or
docker compose up --detach faster-whisper-server-cpu

Using Kubernetes: tutorial

Usage

If you are looking for a step-by-step walkthrough, check out this YouTube video.

OpenAI API CLI

export OPENAI_API_KEY="cant-be-empty"
export OPENAI_BASE_URL=http://localhost:8000/v1/

openai api audio.transcriptions.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format text

openai api audio.translations.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format verbose_json

OpenAI API Python SDK

from openai import OpenAI

client = OpenAI(api_key="cant-be-empty", base_url="http://localhost:8000/v1/")

# Open the file in a context manager so the handle is closed after the upload
with open("audio.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="Systran/faster-distil-whisper-large-v3", file=audio_file
    )
print(transcript.text)

CURL

# If `model` isn't specified, the default model is used
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav"
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav" -F "stream=true"
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav" -F "model=Systran/faster-distil-whisper-large-v3"
# It's recommended that you always specify the language as that will reduce the transcription time
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav" -F "language=en"

curl http://localhost:8000/v1/audio/translations -F "file=@audio.wav"
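
When `stream=true` is set, the transcription arrives in chunks rather than as one response. The sketch below shows one way a Python client might consume such a stream. It assumes the server sends server-sent events with `data:` lines, which this README does not confirm, so check the actual response format first. `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each server-sent event found in `lines`."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            payload = line[len("data:"):]
            # The SSE format drops one optional space after the colon.
            if payload.startswith(" "):
                payload = payload[1:]
            buffer.append(payload)
        elif line == "" and buffer:
            # A blank line terminates the current event.
            yield "\n".join(buffer)
            buffer = []
    # Flush a final event that was not followed by a blank line.
    if buffer:
        yield "\n".join(buffer)


if __name__ == "__main__":
    # Simulated response lines; a real client would read them from the HTTP body.
    sample = ["data: Hello\n", "\n", "data: world\n", "\n"]
    for chunk in iter_sse_data(sample):
        print(chunk)
```

The function takes any iterable of text lines, so it can be fed from whichever streaming HTTP client you already use.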

Live Transcription (using Web Socket)

From live-audio example

websocat must be installed. The following command transcribes audio from a microphone live:

ffmpeg -loglevel quiet -f alsa -i default -ac 1 -ar 16000 -f s16le - | websocat --binary ws://localhost:8000/v1/audio/transcriptions
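
The ffmpeg flags above produce the only format the live endpoint accepts: mono, 16000 Hz, raw 16-bit little-endian samples. As a rough sketch, the same bytes can be prepared in Python from an existing WAV file using only the standard library. The function names and the 100 ms chunk size are assumptions for illustration, and sending the chunks over the WebSocket is left to whichever client library you choose.

```python
import io
import wave
from typing import Iterator


def wav_to_raw_pcm(wav_bytes: bytes) -> bytes:
    """Return the raw sample bytes of a WAV file, or raise if the format is unsupported."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected single-channel audio")
        if wav.getframerate() != 16000:
            raise ValueError("expected a 16000 Hz sample rate")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        # 16-bit PCM WAV data is already signed little-endian.
        return wav.readframes(wav.getnframes())


def iter_chunks(pcm: bytes, chunk_bytes: int = 3200) -> Iterator[bytes]:
    """Split PCM into pieces; 3200 bytes is 100 ms at 16000 Hz, 2 bytes per sample."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


if __name__ == "__main__":
    with open("audio.wav", "rb") as f:
        pcm = wav_to_raw_pcm(f.read())
    print(sum(1 for _ in iter_chunks(pcm)), "chunks ready to send")
```

Validating up front fails with a clear error instead of sending audio in a format the endpoint does not support.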
