Whisper on Fly GPUs

Run OpenAI Whisper as a Replicate Cog on Fly.io!


This app exposes the Whisper model via a simple HTTP server, thanks to Replicate Cog. Cog is an open-source tool that lets you package machine learning models in a standard, production-ready container. Once you're up and running, you can transcribe audio using the /predictions endpoint.

Launch

Create and deploy the app in a single command:

fly launch --from https://github.com/fly-apps/cog-whisper --no-public-ips

Assign a Flycast IP to the app:

fly ips allocate-v6 --private

That's it! You can now access the app at http://<APP_NAME>.flycast/predictions from within your Fly.io private network.

Important

By default, the app runs on Fly GPUs (Nvidia L40s, to be exact). You can change this in the vm settings in fly.toml. The app also runs on a standard Fly Machine, but performance will be reduced.

Usage

curl -X PUT \
     -H "Content-Type: application/json" \
     -d '{
           "input": {
             "audio": "https://fly.storage.tigris.dev/cogs/bun_on_fly.mp3"
           }
         }' \
     http://cog-whisper.flycast/predictions/test | jq
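The request above can be wrapped in a small script so the audio URL and prediction ID are easy to swap. This is a minimal sketch, not part of the repository: the function names (`build_payload`, `transcribe`) and the `COG_URL` variable are hypothetical, and it assumes the app is reachable at its Flycast address from wherever the script runs.

```shell
#!/bin/sh
# Hypothetical helper: names and defaults here are illustrative, not part of the repo.

# Base URL of the deployed app on the private Flycast network.
# Assumption: override COG_URL if your app is not named cog-whisper.
COG_URL="${COG_URL:-http://cog-whisper.flycast}"

# Build the JSON request body for a given audio URL.
# Assumes the URL contains no double quotes or backslashes.
build_payload() {
  printf '{"input": {"audio": "%s"}}' "$1"
}

# Send the prediction request under a caller-chosen prediction ID,
# mirroring the PUT /predictions/<id> call shown above.
transcribe() {
  audio_url="$1"
  prediction_id="${2:-test}"
  build_payload "$audio_url" |
    curl --silent --show-error --fail \
         -X PUT \
         -H "Content-Type: application/json" \
         --data-binary @- \
         "$COG_URL/predictions/$prediction_id"
}
```

For example, `transcribe "https://fly.storage.tigris.dev/cogs/bun_on_fly.mp3" | jq '.output'` should print only the model output. The exact shape of `output` depends on the model's predictor, so inspect the full response first.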

Local Setup

  1. Clone the cog-whisper repository from GitHub:

    git clone git@github.com:fly-apps/cog-whisper.git
  2. Navigate into the cloned directory:

    cd cog-whisper
  3. Run locally. First, run get_weights.sh from the project root to download pre-trained weights, then build a container and run predictions:

    ./scripts/get_weights.sh
    cog predict -i audio="<path/to/your/audio/file>"
  4. Build the Docker image using cog:

    cog build -t whisper
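Once the image is built, it can be started like any other container and queried over HTTP. The sketch below is one possible way to do that, under stated assumptions: the function names and the `IMAGE` and `PORT` variables are hypothetical, Docker is installed, and the `--gpus all` flag only works on a host with an Nvidia GPU and the Nvidia container runtime (drop it to run on CPU).

```shell
#!/bin/sh
# Sketch only: IMAGE and PORT are illustrative defaults, not settings from the repo.
IMAGE="${IMAGE:-whisper}"   # tag passed to `cog build -t`
PORT="${PORT:-5000}"        # host port to publish; Cog's server listens on 5000 in the container

# Print the local predictions endpoint for the published port.
predictions_url() {
  printf 'http://localhost:%s/predictions' "$PORT"
}

# Start the container in the background and map the host port to Cog's port.
start_server() {
  docker run -d --gpus all -p "$PORT:5000" "$IMAGE"
}

# Request a transcription for an audio URL from the local server.
# Assumes the URL contains no double quotes or backslashes.
predict() {
  curl --silent --show-error --fail \
       -X POST \
       -H "Content-Type: application/json" \
       -d "{\"input\": {\"audio\": \"$1\"}}" \
       "$(predictions_url)"
}
```

After `start_server`, allow some time for the model to load before calling `predict "https://fly.storage.tigris.dev/cogs/bun_on_fly.mp3"`.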

Having trouble?

Create an issue or ask a question here: https://community.fly.io/