Wordcab Transcribe 💬

FastAPI-based API for transcribing audio files using faster-whisper and pyannote-audio.

More details on this project are available in this blog post.

Key features

🤗 Open-source: Our project is open-source and based on open-source libraries, allowing you to customize and extend it as needed.
⚡ Fast: The faster-whisper library and CTranslate2 make audio processing incredibly fast compared to other implementations.
🐳 Easy to deploy: You can deploy the project on your workstation or in the cloud using Docker.
🔥 Batch requests: You can transcribe multiple audio files at once because batch requests are implemented in the API.
💸 Cost-effective: As an open-source solution, you won't have to pay for costly ASR platforms.
🫶 Easy-to-use API: With just a few lines of code, you can use the API to transcribe audio files or even YouTube videos.

Requirements

Linux (tested on Ubuntu Server 22.04)
Python 3.9
Docker
NVIDIA GPU + NVIDIA Container Toolkit

To learn more about the prerequisites to run the API, check out the Prerequisites section of the blog post.
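
As a rough aid, the requirements above can be partly checked from Python before building the image. This is a minimal sketch using only the standard library: the function name `check_prerequisites` is hypothetical, and it assumes `docker` and `nvidia-smi` (installed with the NVIDIA driver) are on your PATH when present. It does not verify the NVIDIA Container Toolkit itself.

```python
import shutil
import sys


def check_prerequisites():
    """Return a dict of booleans for the prerequisites that can be checked locally.

    Hypothetical helper: it only looks for executables on PATH and compares the
    running Python version with the 3.9 listed in the requirements.
    """
    return {
        # The requirements list Python 3.9; other versions are untested here.
        "python_3_9": sys.version_info[:2] == (3, 9),
        # Docker CLI available on PATH.
        "docker": shutil.which("docker") is not None,
        # nvidia-smi ships with the NVIDIA driver, so it hints that a GPU driver is installed.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A `missing` entry is only a hint. The Prerequisites section of the blog post remains the reference for the full setup.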

Docker commands

Build the image.

docker build -t wordcab-transcribe:latest .

Run the container.

docker run -d --name wordcab-transcribe \
    --gpus all \
    --shm-size 1g \
    --restart unless-stopped \
    -p 5001:5001 \
    wordcab-transcribe:latest

Test the API

Once the container is running, you can test the API.

The API documentation is available at http://localhost:5001/docs.
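
The container may need some time after `docker run` before it answers requests. The sketch below polls the documentation URL given above until it responds. The helper names (`is_up`, `wait_until_up`) and the retry values are assumptions chosen for illustration, not part of the project.

```python
import time
import urllib.request

DOCS_URL = "http://localhost:5001/docs"  # documentation URL from this README


def is_up(url, timeout=2.0):
    """Return True if `url` answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, timeouts and refused connections all derive from OSError.
        return False


def wait_until_up(url=DOCS_URL, attempts=30, delay=2.0, probe=is_up):
    """Call `probe(url)` up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_up()` before sending the first transcription request avoids connection errors while the service is still starting.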

Using CURL

Audio file:

curl -X 'POST' \
  'http://localhost:5001/api/v1/audio' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@/path/to/audio/file.wav'

YouTube video:

curl -X 'POST' \
  'http://localhost:5001/api/v1/youtube' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "url": "https://youtu.be/dQw4w9WgXcQ"
}'

Using Python

Audio file:

import requests

filepath = "/path/to/audio/file.wav"  # or mp3
# Open the file in binary mode; the context manager closes it after the upload.
with open(filepath, "rb") as f:
    response = requests.post("http://localhost:5001/api/v1/audio", files={"file": f})
print(response.json())
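
Because the API handles batch requests, a client can send several files at the same time. The following is a sketch under assumptions, not the project's own client: `transcribe_file` and `transcribe_many` are hypothetical helpers, the worker count and timeout are arbitrary, and the response is assumed to be JSON as in the example above.

```python
from concurrent.futures import ThreadPoolExecutor

import requests

AUDIO_ENDPOINT = "http://localhost:5001/api/v1/audio"  # endpoint from this README


def transcribe_file(path, url=AUDIO_ENDPOINT, timeout=600):
    """Upload one audio file and return the decoded JSON response."""
    with open(path, "rb") as f:
        response = requests.post(url, files={"file": f}, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()


def transcribe_many(paths, max_workers=4, transcribe=transcribe_file):
    """Transcribe several files concurrently and map each path to its result."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `paths`.
        results = list(pool.map(transcribe, paths))
    return {str(path): result for path, result in zip(paths, results)}
```

For example, `transcribe_many(["/path/to/a.wav", "/path/to/b.wav"])` returns a dictionary keyed by file path.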

YouTube video:

import requests

url = "https://youtu.be/dQw4w9WgXcQ"
data = {"url": url}
response = requests.post("http://localhost:5001/api/v1/youtube", json=data)
print(response.json())
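
If `requests` is not installed, the same call can be made with the standard library. This is a minimal sketch: the helper names are hypothetical, the timeout is arbitrary, and the response is assumed to be JSON as in the example above.

```python
import json
import urllib.request

YOUTUBE_ENDPOINT = "http://localhost:5001/api/v1/youtube"  # endpoint from this README


def build_youtube_request(video_url, endpoint=YOUTUBE_ENDPOINT):
    """Build a POST request whose JSON body is {"url": video_url}."""
    body = json.dumps({"url": video_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def transcribe_youtube(video_url, timeout=600):
    """Send the request and decode the JSON response."""
    request = build_youtube_request(video_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage mirrors the `requests` version: `print(transcribe_youtube("https://youtu.be/dQw4w9WgXcQ"))`.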

Local testing

Before launching the API, be sure to install torch and torchaudio on your machine.

pip install --upgrade torch==1.13.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

Then, you can launch the API using the following command.

poetry run uvicorn wordcab_transcribe.main:app --reload
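
Note that the command above passes no `--host` or `--port`, so uvicorn listens on its default address, `127.0.0.1:8000`, rather than the port 5001 published by the Docker container. The sketch below shows one way to keep the base URL in a single place when switching between the two setups. The constant and function names are hypothetical.

```python
DOCKER_BASE_URL = "http://localhost:5001"  # port published by the docker run command
LOCAL_BASE_URL = "http://127.0.0.1:8000"   # uvicorn default when no --port is given


def endpoint(base_url, path):
    """Join a base URL and an API path with exactly one slash between them."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


if __name__ == "__main__":
    print(endpoint(LOCAL_BASE_URL, "/api/v1/audio"))
    print(endpoint(DOCKER_BASE_URL, "/api/v1/youtube"))
```

The earlier curl and Python examples then work against a local run by replacing `http://localhost:5001` with the local base URL.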

🚀 Contributing

Getting started

Clone the repo

git clone
cd wordcab-transcribe

Install dependencies and start coding

poetry install
poetry shell

# install pre-commit hooks
nox --session=pre-commit -- install

# open your IDE
code .

Run tests

# run all tests
nox

# run a specific session
nox --session=tests  # run tests
nox --session=pre-commit  # run pre-commit hooks

# run a specific test
nox --session=tests -- -k test_something

Working workflow

Create an issue for the feature or bug you want to work on.
Create a branch using the left panel on GitHub.
Run git fetch, then git checkout the branch.
Make changes and commit.
Push the branch to GitHub.
Create a pull request and ask for review.
Merge the pull request when it's approved and CI passes.
Delete the branch.
Update your local repo with git fetch and git pull.
