From ab7c9b01c083813c4dacf3325f6acccf2d075637 Mon Sep 17 00:00:00 2001
From: jhj0517 <97279763+jhj0517@users.noreply.github.com>
Date: Wed, 26 Jun 2024 21:57:03 +0900
Subject: [PATCH] Update README.md

---
 README.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 589542f2..1c2146dc 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,8 @@ If you wish to try this on Colab, you can do it in [here](https://colab.research
 - Text to Text Translation
   - Translate subtitle files using Facebook NLLB models
   - Translate subtitle files using DeepL API
+- Speaker diarization with the [pyannote](https://huggingface.co/pyannote/speaker-diarization-3.1) model as a post-processing step.
+  - You need a Hugging Face token and must manually visit https://huggingface.co/pyannote/speaker-diarization-3.1 and accept their user conditions to download the model.

 # Installation and Running
 ### Prerequisite
@@ -107,6 +109,7 @@ This is Whisper's original VRAM usage table for models.
 - [x] Add NLLB Model translation
 - [x] Integrate with faster-whisper
 - [x] Integrate with insanely-fast-whisper
-- [ ] Integrate with whisperX
+- [x] Integrate with whisperX (only the speaker diarization part)
+- [ ] Add FastAPI script
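
For context on the diarization feature added above, a minimal sketch of how the pyannote pipeline is typically loaded and run. This is not taken from this repository's code; it assumes `pyannote.audio` 3.x is installed, the gated model's user conditions have been accepted on Hugging Face, and the token and audio path shown are placeholders.

```python
# Minimal sketch: load the gated pyannote speaker-diarization-3.1 pipeline
# with a Hugging Face access token and print speaker turns for an audio file.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="hf_your_token_here",  # placeholder Hugging Face token
)

# Run diarization on an audio file (placeholder path)
diarization = pipeline("audio.wav")

# Iterate over speaker turns: start/end times plus the assigned speaker label
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```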