## Can I apply pretrained pipelines on audio already loaded in memory?

Yes: read this tutorial until the end.
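In short, pipelines accept a `{"waveform": ..., "sample_rate": ...}` mapping in place of a file path. Here is a minimal sketch; the pipeline name, the token placeholder, and the dummy waveform are illustrative only, so adapt them to your setup.

```python
import torch
from pyannote.audio import Pipeline

# Gated pipeline: requires accepting the user agreement and a valid token.
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization",
    use_auth_token="HUGGINGFACE_TOKEN",  # placeholder
)

# Audio already loaded in memory: a (channel, time) torch.Tensor
# plus its sample rate, wrapped in a dict instead of a file path.
waveform = torch.randn(1, 16000 * 10)  # e.g. 10 seconds of mono 16 kHz audio
diarization = pipeline({"waveform": waveform, "sample_rate": 16000})

for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```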
## Can I use gated models (and pipelines) offline?

Short answer: yes, see this tutorial for models and that one for pipelines.

Long answer: gating models and pipelines allows me to learn a bit more about the pyannote.audio user base, which eventually helps me write grant proposals to make pyannote.audio even better. So, please fill in the gating forms as precisely as possible. For instance, before gating `pyannote/speaker-diarization`, I had no idea that so many people were relying on it in production. Hint: sponsors are more than welcome! Maintaining open source libraries is time-consuming.

That being said, this whole authentication process does not prevent you from using official pyannote.audio models offline (i.e. without going through the authentication process in every `docker run ...` or whatever you are using in production): see this tutorial for models and that one for pipelines.
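As a rough sketch of what "offline" looks like in practice: once the pipeline repository has been downloaded (e.g. with `huggingface-cli download` or a git clone) and its `config.yaml` edited to point at local checkpoints, loading it needs no network access or token. The paths below are placeholders.

```python
from pyannote.audio import Pipeline

# Local directory containing config.yaml and the checkpoints it references;
# no authentication happens at this point, everything is resolved on disk.
pipeline = Pipeline.from_pretrained("/models/speaker-diarization/config.yaml")

diarization = pipeline("/data/conversation.wav")
```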
## Does pyannote support streaming speaker diarization?

pyannote.audio itself does not, but diart (which is built on top of pyannote.audio) does.
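For reference, here is a sketch in the spirit of diart's README; the class names (`SpeakerDiarization`, `MicrophoneAudioSource`, `StreamingInference`) reflect recent diart releases and may differ in your version, so check diart's documentation before relying on this.

```python
from diart import SpeakerDiarization
from diart.sources import MicrophoneAudioSource
from diart.inference import StreamingInference

pipeline = SpeakerDiarization()   # uses pyannote models under the hood
source = MicrophoneAudioSource()  # stream audio from the microphone
inference = StreamingInference(pipeline, source)
prediction = inference()          # runs until the audio source is exhausted
```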
## How can I improve performance?

Short answer: pyannoteAI precision models are usually much more accurate (and faster).

Long answer:

- Manually annotate dozens of conversations as precisely as possible.
- Split them into train (80%), development (10%), and test (10%) subsets.
- Set up the data for use with `pyannote.database` (see the sketch after this list).
- Follow this recipe.
- Enjoy.
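A minimal sketch of the `pyannote.database` step, assuming a `database.yml` describing your data already exists; `MyDatabase` and `MyProtocol` are placeholders for your own database and protocol names.

```python
from pyannote.database import FileFinder, get_protocol

# The protocol's train/development/test subsets map to the
# 80% / 10% / 10% split described above.
protocol = get_protocol(
    "MyDatabase.SpeakerDiarization.MyProtocol",
    preprocessors={"audio": FileFinder()},  # resolves each file's audio path
)

for file in protocol.train():
    # file["annotation"] is the manual reference diarization
    print(file["uri"], file["annotation"])
```

With the protocol in place, the recipe linked above walks through fine-tuning and evaluation on these subsets.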
## How does one spell and pronounce pyannote.audio?

📝 Written in lower case: `pyannote.audio` (or `pyannote` if you are lazy). Not `PyAnnote` nor `PyAnnotate` (sic).

📢 Pronounced like the French verb *pianoter*: *pi* like in *pi*ano, not *py* like in *py*thon.

🎹 *pianoter* means to play the piano (hence the logo 🤯).