tgisper

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification. For more details: github.com/openai/whisper

faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models. For more details: github.com/guillaumekln/faster-whisper
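As a rough illustration of what transcription with faster-whisper can look like, the sketch below loads a model on CPU and joins the recognized segments into one string. The helper names `join_segments` and `transcribe_file`, the `int8` compute type, and the default model size are assumptions made for this example; this is not necessarily how tgisper calls the library.

```python
from typing import Iterable


def join_segments(texts: Iterable[str]) -> str:
    """Join per-segment text into one transcript, dropping empty pieces."""
    return " ".join(t.strip() for t in texts if t.strip())


def transcribe_file(path: str, model_size: str = "small") -> str:
    """Transcribe an audio file on CPU (hypothetical helper)."""
    # Imported lazily so this module still loads without faster-whisper installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" is an assumed choice for CPU inference.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy iterable of segments plus language info.
    segments, _info = model.transcribe(path)
    return join_segments(segment.text for segment in segments)
```

Calling `transcribe_file("voice.ogg")` would download the model on first use, so the first run may take noticeably longer than later ones.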

Tgisper is a Telegram bot that uses OpenAI's Whisper model to convert voice messages to text. Record a voice message for the bot, or forward one to it from another chat, and you're done!
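For readers curious how a bot can get hold of a voice message, the Telegram Bot API exposes a `getFile` method that returns a `file_path`, which is then fetched from a separate download endpoint. The sketch below only builds those two URLs; the function names are hypothetical, and tgisper itself may well rely on a Telegram client library instead of raw URLs.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://api.telegram.org"


def get_file_url(token: str, file_id: str) -> str:
    """URL of the Bot API getFile method for a given file_id."""
    query = urlencode({"file_id": file_id})
    return f"{API_ROOT}/bot{token}/getFile?{query}"


def download_url(token: str, file_path: str) -> str:
    """URL for downloading the file, given the file_path from getFile."""
    return f"{API_ROOT}/file/bot{token}/{quote(file_path)}"
```

Keeping URL construction in small pure functions makes it easy to check without a network connection or a real token.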

Usage

docker run -d \
-e ASR_MODEL=small \
-e BOT_TOKEN=3916463517:ABC2tkTGkD9FHl4Ra-jv2Vv6DVECTyeV3Mm \
-e OMP_NUM_THREADS=2 \
ghcr.io/ckaytev/tgisper:main

Setup and run (Development Environment)

Install the ffmpeg command-line tool:

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
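To confirm that ffmpeg ended up on your `PATH` after installation, a small check along these lines may help. The helper name `find_tool` is made up for this example and is not part of tgisper.

```python
import shutil


def find_tool(name: str) -> str:
    """Return the full path of an executable on PATH, or raise if it is missing."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} was not found on PATH; install it first")
    return path


if __name__ == "__main__":
    print(find_tool("ffmpeg"))
```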

Install Poetry with the following command:

pip3 install poetry

Install packages:

poetry install

Set the environment variables:

export BOT_TOKEN=3916463517:ABC2tkTGkD9FHl4Ra-jv2Vv6DVECTyeV3Mm

# The list of available models (https://github.com/openai/whisper/#available-models-and-languages)
export ASR_MODEL=base

# When running on CPU, set the number of threads used for inference
export OMP_NUM_THREADS=2
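The three variables above could be read and validated at startup roughly as follows. This is a sketch under assumptions: the `Settings` class, the `load_settings` function, and the fallback to the `base` model are illustrative and may differ from what tgisper actually does.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    bot_token: str
    asr_model: str
    omp_num_threads: Optional[int]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read bot settings from environment variables (hypothetical helper)."""
    token = env.get("BOT_TOKEN", "").strip()
    if not token:
        raise ValueError("BOT_TOKEN must be set")
    # Falling back to "base" is an assumption made for this sketch.
    model = env.get("ASR_MODEL", "").strip() or "base"
    threads = env.get("OMP_NUM_THREADS", "").strip()
    return Settings(token, model, int(threads) if threads else None)
```

Failing early on a missing `BOT_TOKEN` gives a clearer error than letting the first Telegram request fail.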

Start the bot in polling mode:

poetry run tgisper

Alternatively, run it with Docker Compose:

docker compose run -d -e BOT_TOKEN=3916463517:ABC2tkTGkD9FHl4Ra-jv2Vv6DVECTyeV3Mm tgisper