lingualogic/whisper-asr

Whisper ASR

A prototype of an automatic speech recognition (ASR) service built on the open-source Whisper model.

Prerequisites

Installation on macOS using Homebrew

brew install ffmpeg
brew install portaudio

Create Virtual Environment

python -m venv venv

Activate Virtual Environment

source venv/bin/activate

Installation

pip install -r requirements.txt

Run

To run the prototype, start the server first, then the client.

Server

The server opens a websocket to receive an audio stream. It caches the incoming data and uses Whisper to transcribe or translate it.

python streaming_server.py
# Or run with Docker
docker run -p 8765:8765 lingualogic/whisper-asr:0.2.1
# With GPU support
docker run --gpus=all -p 8765:8765 lingualogic/whisper-asr:0.2.1
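
The caching step described above can be sketched as follows. This is a minimal illustration, not the code in streaming_server.py: the class name AudioSession, the "END" text marker, the 16-bit little-endian PCM format, and the transcribe callable are all assumptions made for the example.

```python
import struct


class AudioSession:
    """Caches audio chunks for one client until end of speech is signaled.

    Hypothetical helper: assumes binary messages carry 16-bit little-endian
    PCM and that the text message "END" marks the end of speech.
    """

    def __init__(self, transcribe):
        self._chunks = []
        # Assumed callable taking a list of floats in [-1.0, 1.0) and
        # returning text; a real server would call Whisper here.
        self._transcribe = transcribe

    def on_message(self, message):
        """Handle one websocket message; return text once speech has ended."""
        if isinstance(message, bytes):
            self._chunks.append(message)  # cache audio, nothing to return yet
            return None
        if message == "END":
            data = b"".join(self._chunks)
            self._chunks.clear()
            count = len(data) // 2  # two bytes per 16-bit sample
            pcm = struct.unpack(f"<{count}h", data[: 2 * count])
            samples = [value / 32768.0 for value in pcm]  # scale to floats
            return self._transcribe(samples)
        return None  # ignore unknown text messages
```

In a websocket handler, each incoming frame would be passed to on_message, and any non-None return value would be sent back to the client as the result.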

Client

The client opens a microphone and sends the audio stream over the websocket. It detects the end of speech and signals this to the server in order to receive the result.

python streaming_client.py
# Use the translate task (Whisper translates the speech into English)
python streaming_client.py --task translate
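
One simple way to detect the end of speech is to watch the signal energy and report an end once a run of quiet frames follows speech. The sketch below illustrates that idea only; how streaming_client.py actually detects the end of speech is not stated here, and the names EndOfSpeechDetector, threshold, and max_silent_frames, as well as the 16-bit little-endian PCM format, are assumptions.

```python
import math
import struct


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit little-endian PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: 2 * count])
    return math.sqrt(sum(s * s for s in samples) / count)


class EndOfSpeechDetector:
    """Energy-based detector (hypothetical, for illustration only)."""

    def __init__(self, threshold: float = 500.0, max_silent_frames: int = 3):
        self.threshold = threshold  # assumed RMS level separating speech from silence
        self.max_silent_frames = max_silent_frames  # quiet frames that end an utterance
        self._speech_seen = False
        self._silent = 0

    def feed(self, frame: bytes) -> bool:
        """Return True when this frame completes an utterance."""
        if rms(frame) >= self.threshold:
            self._speech_seen = True
            self._silent = 0
            return False
        if self._speech_seen:
            self._silent += 1
            if self._silent >= self.max_silent_frames:
                # Reset so the detector is ready for the next utterance.
                self._speech_seen = False
                self._silent = 0
                return True
        return False
```

A client loop would call feed on every captured microphone frame and, when it returns True, notify the server that the utterance is complete. The threshold would need tuning for the microphone and the ambient noise level.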
