yt_scribe.py is a command-line tool that downloads audio from YouTube videos, transcribes the audio using OpenAI's Whisper model, and exports both the transcription and relevant metadata as files.
- Audio Download: Extracts audio from YouTube videos.
- Automatic Transcription: Transcribes audio using Whisper with GPU/CPU support.
- Language Auto-Detection: Detects language automatically if not specified.
- Metadata Export: Saves video metadata (title, channel, publish date, and detected language) to a JSON file.
- Customizable Output: Configurable model size, language, and output directory.
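Taken together, the features above describe a download-then-transcribe pipeline. The sketch below shows one way such a pipeline could be put together; it is not the script's actual code. The helper names (`pick_device`, `build_ydl_opts`, `transcribe_url`) and the choice of MP3 as the intermediate audio format are assumptions made for illustration. The `yt_dlp.YoutubeDL`, `whisper.load_model`, and `model.transcribe` calls are the public APIs of those libraries.

```python
import os


def pick_device():
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def build_ydl_opts(output_dir):
    """Build yt-dlp options that keep only the audio track.

    Assumption: audio is converted to MP3, which requires ffmpeg.
    """
    return {
        "format": "bestaudio/best",
        "outtmpl": os.path.join(output_dir, "%(id)s.%(ext)s"),
        "quiet": True,
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"},
        ],
    }


def transcribe_url(url, output_dir=".", model_size="base", language=None):
    """Download one video's audio and transcribe it (hypothetical helper)."""
    import whisper
    import yt_dlp

    with yt_dlp.YoutubeDL(build_ydl_opts(output_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
    audio_path = os.path.join(output_dir, f"{info['id']}.mp3")

    model = whisper.load_model(model_size, device=pick_device())
    # language=None lets Whisper detect the language on its own
    result = model.transcribe(audio_path, language=language)
    return info, result  # result contains "text" and "language" keys
```

Calling `transcribe_url("<YouTube_URL>")` needs network access, ffmpeg, and both libraries installed; the two smaller helpers have no such requirements.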
- Python 3.8+
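- Libraries:
  - torch
  - whisper (published on PyPI as openai-whisper)
  - yt-dlp
  - argparse and json (part of the Python standard library, no installation needed)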
- Download the script.
- Install dependencies:
pip install torch openai-whisper yt-dlp
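After installing, it can help to confirm that everything is discoverable before the first run. The following is a small sketch of such a check, assuming the importable module names `torch`, `whisper`, and `yt_dlp`. It also looks for the `ffmpeg` command-line tool, which Whisper uses to decode audio. The function name `missing_dependencies` is hypothetical, and whether yt_scribe.py performs a similar check itself is not stated.

```python
import importlib.util
import shutil


def missing_dependencies(modules=("torch", "whisper", "yt_dlp"), tools=("ffmpeg",)):
    """Return the names of Python modules and command-line tools that cannot be found."""
    # find_spec returns None for a top-level module that is not installed
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # shutil.which returns None when the executable is not on PATH
    missing += [name for name in tools if shutil.which(name) is None]
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All dependencies found.")
```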
python yt_scribe.py -u "<YouTube_URL>" -o <output_directory>
| Argument | Description | Default |
|---|---|---|
| -u, --urls | Comma-separated YouTube URLs or file path | Required |
| -o, --output_dir | Output directory | Current directory |
| -m, --model_size | Whisper model size (tiny, base, small, medium, large) | base |
| -l, --language | Language code (e.g., en, es) or auto-detection | Auto-detect |
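The arguments above map naturally onto an `argparse` parser. The sketch below is one possible definition reconstructed from the table, not the script's actual parser: the help strings and the use of `None` to mean "auto-detect" are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented arguments and defaults."""
    parser = argparse.ArgumentParser(
        prog="yt_scribe.py",
        description="Download YouTube audio and transcribe it with Whisper.",
    )
    parser.add_argument("-u", "--urls", required=True,
                        help="Comma-separated YouTube URLs or path to a file of URLs")
    parser.add_argument("-o", "--output_dir", default=".",
                        help="Directory for transcription and metadata files")
    parser.add_argument("-m", "--model_size", default="base",
                        choices=["tiny", "base", "small", "medium", "large"],
                        help="Whisper model size")
    # Assumption: None stands for "let Whisper detect the language"
    parser.add_argument("-l", "--language", default=None,
                        help="Language code such as en or es; omit to auto-detect")
    return parser


args = build_parser().parse_args(["-u", "https://www.youtube.com/watch?v=dQw4w9WgXcQ"])
print(args.output_dir, args.model_size, args.language)  # prints: . base None
```

Restricting `--model_size` with `choices` makes a mistyped size fail at parse time rather than later, when the model is loaded.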
python yt_scribe.py -u "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -o transcriptions/ -m base -l en
python yt_scribe.py -u youtube_urls.txt -o transcriptions/
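The two examples pass either URLs or a file path to the same `-u` option, so the value has to be interpreted at run time. A minimal sketch of how that could be done follows. The function name `resolve_urls` is hypothetical, and the file format (one URL per line, blank lines ignored) is an assumption, since the README does not specify it.

```python
import os


def resolve_urls(value):
    """Turn the -u value into a list of URLs."""
    if os.path.isfile(value):
        # Assumed file format: one URL per line, blank lines ignored
        with open(value, encoding="utf-8") as handle:
            return [line.strip() for line in handle if line.strip()]
    # Otherwise treat the value as a comma-separated list of URLs
    return [part.strip() for part in value.split(",") if part.strip()]
```

Checking for an existing file first means a single URL and a path can share one option without an extra flag.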
- Transcription File: <video_title>_transcription.txt
- Metadata File: <video_title>_metadata.json
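Because both file names are derived from the video title, the title has to be made safe for the file system before writing. The sketch below illustrates one way to produce the two files. The helper names (`safe_name`, `save_outputs`), the set of replaced characters, and the metadata key names are assumptions, not taken from the script.

```python
import json
import os
import re


def safe_name(title):
    """Replace characters that are invalid in file names on common platforms."""
    return re.sub(r'[\\/:*?"<>|]+', "_", title).strip()


def save_outputs(output_dir, title, text, metadata):
    """Write <title>_transcription.txt and <title>_metadata.json; return both paths."""
    os.makedirs(output_dir, exist_ok=True)
    base = os.path.join(output_dir, safe_name(title))
    txt_path = base + "_transcription.txt"
    json_path = base + "_metadata.json"
    with open(txt_path, "w", encoding="utf-8") as handle:
        handle.write(text)
    with open(json_path, "w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps non-English titles readable in the JSON file
        json.dump(metadata, handle, ensure_ascii=False, indent=2)
    return txt_path, json_path
```

A call might look like `save_outputs("transcriptions", title, result["text"], {"title": title, "channel": channel, "publish_date": date, "detected_language": result["language"]})`, where the key names are illustrative.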
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions, reach out by email.