YouTube transcription not possible: Bot protection #112

Open
liferadioat opened this issue Jul 24, 2024 · 4 comments

Comments

@liferadioat

Description

I am trying to transcribe a YouTube video by URL, but it fails.

Environment

  • OS: Ubuntu
  • Browser: Chrome
  • Hosting: Webserver

Logs and Configuration

Docker Compose Logs

Run the following command in the project folder, force the error, and paste the logs below: docker compose logs -f --tail 50

12:27PM ERR Error downloading media error="[youtube] KUdcTGQvhKI: Sign in to confirm you’re not a bot. This helps protect our community. Learn more"
12:27PM ERR Error transcribing error="[youtube] KUdcTGQvhKI: Sign in to confirm you’re not a bot. This helps protect our community. Learn more"

Docker Compose File

version: "3.9"

services:
  mongo:
    image: mongo
    env_file:
      - .env
    restart: unless-stopped
    volumes:
      - ./whishper_data/db_data:/data/db
      - ./whishper_data/db_data/logs/:/var/log/mongodb/
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${DB_USER:-whishper}
      MONGO_INITDB_ROOT_PASSWORD: ${DB_PASS:-whishper}
    expose:
      - 27017
    command: ['--logpath', '/var/log/mongodb/mongod.log']

  translate:
    container_name: whisper-libretranslate
    image: libretranslate/libretranslate:latest-cuda
    restart: unless-stopped
    volumes:
      - ./whishper_data/libretranslate/data:/home/libretranslate/.local/share
      - ./whishper_data/libretranslate/cache:/home/libretranslate/.local/cache
    env_file:
      - .env
    user: root
    tty: true
    environment:
      LT_DISABLE_WEB_UI: True
      LT_LOAD_ONLY: ${LT_LOAD_ONLY:-en,fr,es}
      LT_UPDATE_MODELS: True
    expose:
      - 5000
    networks:
      default:
        aliases:
          - translate
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]

  whishper:
    pull_policy: always
    image: pluja/whishper:${WHISHPER_VERSION:-latest-gpu}
    env_file:
      - .env
    volumes:
      - ./whishper_data/uploads:/app/uploads
      - ./whishper_data/logs:/var/log/whishper
    container_name: whishper
    restart: unless-stopped
    networks:
      default:
        aliases:
          - whishper
    ports:
      - 8082:80
    depends_on:
      - mongo
      - translate
    environment:
      PUBLIC_INTERNAL_API_HOST: "http://127.0.0.1:80"
      PUBLIC_TRANSLATION_API_HOST: ""
      PUBLIC_API_HOST: ${WHISHPER_HOST:-}
      PUBLIC_WHISHPER_PROFILE: gpu
      WHISPER_MODELS_DIR: /app/models
      UPLOAD_DIR: /app/uploads
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]

@liferadioat changed the title from "YouTube Transcribe not possible: Bot protection" to "YouTube transcription not possible: Bot protection" on Jul 24, 2024

@abdessalaam

I got this error too. Have you found a solution?
"[youtube] BNIVH4cnT58: Sign in to confirm you’re not a bot. This helps protect our community. Learn more"

@liferadioat (author)

Unfortunately not. Still waiting for a reaction from the maintainer :)

@Stinosko

This issue is the result of Google preventing third parties from downloading videos and has nothing to do with this project. You can read the discussion on the yt-dlp project here.

The short answer is to use a VPN or proxy to work around this, and to limit the number of downloads you make so that you don't trigger the detection on Google's servers.
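
For anyone who wants to try the proxy and rate-limiting advice outside of whishper first, here is a minimal sketch of a throttled yt-dlp call. The function name `throttled_download`, the `PROXY_URL` and `YTDLP` variables, and the sleep values are assumptions made up for illustration; they are not whishper settings, and nothing here guarantees that YouTube's bot check will be avoided.

```shell
#!/bin/sh
# Hypothetical helper, not part of whishper: run yt-dlp with pauses between
# requests and downloads, and optionally through a proxy.
# PROXY_URL (assumed name) is used only when it is set and non-empty.
# YTDLP (assumed name) lets you point at a different yt-dlp binary.
throttled_download() {
    url="$1"
    # Sleep 1s between extraction requests and 5-15s before each download.
    set -- --sleep-requests 1 --sleep-interval 5 --max-sleep-interval 15
    if [ -n "${PROXY_URL:-}" ]; then
        set -- "$@" --proxy "$PROXY_URL"
    fi
    "${YTDLP:-yt-dlp}" "$@" "$url"
}
```

Usage would look something like `PROXY_URL="socks5://127.0.0.1:1080"` followed by `throttled_download "<video url>"`. The sleep values are guesses; longer pauses are slower but make fewer requests per minute.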

@Steltek

Steltek commented Sep 16, 2024

Possibly related: It looks like yt-dlp doesn't get updated in the 'whishper' container. Over time, that causes yt-dlp to become really outdated. Run this within the container to update it manually:

yt-dlp -U

Maybe this could be added to the container entrypoint so that it auto-updates on startup?
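
A rough sketch of what such a startup step could look like, assuming a wrapper script is placed in front of the container's real command. The function name `update_ytdlp` and the `YTDLP` variable are hypothetical and not taken from the whishper image. Depending on how yt-dlp was installed in the image, `yt-dlp -U` may not be able to self-update, so a failed update is treated as non-fatal here.

```shell
#!/bin/sh
# Hypothetical entrypoint helper, not part of the whishper image.
# Tries to update yt-dlp and reports the result without stopping startup.
# YTDLP (assumed name) lets you point at a different yt-dlp binary.
update_ytdlp() {
    if "${YTDLP:-yt-dlp}" -U; then
        echo "yt-dlp update check finished"
    else
        # Keep going with whatever version is already installed.
        echo "yt-dlp update failed, continuing with installed version" >&2
    fi
}
```

A wrapper entrypoint would call `update_ytdlp` and then hand over to the original command with `exec "$@"`. The trade-off is that every container start depends on a network call, which is why the failure path only logs a warning.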
