Whisper ASR

A prototype automatic speech recognition (ASR) service built on the open-source Whisper model.

Prerequisites

Python >= 3.8
ffmpeg
PortAudio

Installation on macOS using Homebrew

brew install ffmpeg
brew install portaudio
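
Once the packages are installed, the prerequisites listed above can be checked from Python. The following is a minimal sketch and not part of the repository: the function name `missing_prerequisites` is hypothetical, and the PortAudio lookup relies on `ctypes.util.find_library`, which may miss a library installed in a non-standard location.

```python
import ctypes.util
import shutil
import sys


def missing_prerequisites(which=shutil.which,
                          find_library=ctypes.util.find_library,
                          version=sys.version_info):
    """Return the names of prerequisites that could not be found."""
    missing = []
    if tuple(version[:2]) < (3, 8):
        missing.append("Python >= 3.8")
    # ffmpeg is a command-line tool, so look for it on PATH.
    if which("ffmpeg") is None:
        missing.append("ffmpeg")
    # PortAudio is a shared library, not a command, so look it up by name.
    if find_library("portaudio") is None:
        missing.append("PortAudio")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All prerequisites found.")
```

The lookups are passed in as parameters only so the check can be exercised without touching the real system.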

Create Virtual Environment

python -m venv venv

Activate Virtual Environment

source venv/bin/activate

Installation

pip install -r requirements.txt

Run

To run the prototype, start the server first, then the client.

Server

The server opens a WebSocket to receive an audio stream, caches the incoming data, and transcribes or translates it using Whisper.

python streaming_server.py
# Or docker
docker run -p 8765:8765 lingualogic/whisper-asr:0.2.1
# with GPU
docker run --gpus=all -p 8765:8765 lingualogic/whisper-asr:0.2.1
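
The caching step described above can be pictured as a small per-connection buffer. This is only a sketch under stated assumptions, not the code in streaming_server.py: the class name `AudioSession`, the `"EOS"` control message, and the injected `transcribe` callable are all hypothetical. In a real server the callable would presumably wrap Whisper's `model.transcribe(..., task=...)`.

```python
class AudioSession:
    """Caches audio chunks for one connection until end of speech is signalled."""

    # Hypothetical control message; the repository's actual protocol may differ.
    END_OF_SPEECH = "EOS"

    def __init__(self, transcribe, task="transcribe"):
        self._chunks = []
        self._transcribe = transcribe  # callable(audio_bytes, task) -> text
        self._task = task

    def handle_message(self, message):
        """Process one WebSocket message; return text when a result is ready."""
        if isinstance(message, (bytes, bytearray)):
            # Binary frames carry audio: cache them until the utterance ends.
            self._chunks.append(bytes(message))
            return None
        if message == self.END_OF_SPEECH:
            # End of speech: hand the cached audio to the recognizer and reset.
            audio = b"".join(self._chunks)
            self._chunks.clear()
            return self._transcribe(audio, self._task)
        return None


def fake_transcribe(audio, task):
    """Stand-in recognizer used only to demonstrate the flow."""
    return f"{task}: {len(audio)} bytes"


session = AudioSession(fake_transcribe)
session.handle_message(b"\x00\x01")
session.handle_message(b"\x02\x03")
result = session.handle_message(AudioSession.END_OF_SPEECH)
```

Keeping the recognizer behind a plain callable means the buffering logic can be tested without loading a model.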

Client

The client opens a microphone and sends the audio stream over the WebSocket. It detects the end of speech and signals this to the server in order to receive the result.

python streaming_client.py
# set translate task
python streaming_client.py --task translate
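
The README does not say how the client detects the end of speech. One common approach, shown here purely as an illustration, is an energy threshold: once speech has been heard, a run of quiet frames is treated as the end of the utterance. The names `rms` and `EndOfSpeechDetector`, the threshold, and the frame count are assumptions, and frames are assumed to be 16-bit signed PCM in native byte order.

```python
import array
import math


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a frame of 16-bit signed PCM samples."""
    samples = array.array("h")
    samples.frombytes(frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class EndOfSpeechDetector:
    """Reports end of speech after enough consecutive quiet frames."""

    def __init__(self, threshold=500.0, max_silent_frames=20):
        self.threshold = threshold                  # assumed value, tune per microphone
        self.max_silent_frames = max_silent_frames  # assumed value
        self._speech_seen = False
        self._silent_frames = 0

    def update(self, frame: bytes) -> bool:
        """Feed one frame; return True once speech was followed by enough silence."""
        if rms(frame) >= self.threshold:
            # Loud frame: speech is ongoing, so reset the silence counter.
            self._speech_seen = True
            self._silent_frames = 0
            return False
        if self._speech_seen:
            self._silent_frames += 1
        return self._speech_seen and self._silent_frames >= self.max_silent_frames
```

Silence before any speech is ignored, so the detector does not fire while the user has not yet started talking.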
