Awesome VAD

A curated list of awesome voice activity detection (VAD) repositories and models

Repo List

  • wiseman/py-webrtcvad : Python interface to the WebRTC Voice Activity Detector
  • mwv/vad : A straightforward re-implementation of Bowon Lee’s Voice Activity Detector.
  • halleytl/pyvad : A Python VAD (Voice Activity Detector) implementation that performs endpoint detection on streaming data as it is read in real time
  • jtkim-kaist/VAD : This toolkit provides the voice activity detection (VAD) code and our recorded dataset.
  • snakers4/silero-vad : Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
  • marsbroshok/VAD-python : Voice Activity Detector in Python
  • hcmlab/vadnet : Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks
  • eesungkim/Voice_Activity_Detector : A statistical model-based Voice Activity Detection
  • jymsuper/VAD_tutorial : Simple DNN-based Voice Activity Detection (VAD) using PyTorch
  • mounalab/LSTM-RNN-VAD : Voice Activity Detection LSTM-RNN learning model
  • RicherMans/GPV : Repository for our Interspeech 2020 general-purpose voice activity detection (GPVAD) paper
  • MaigoAkisame/cmu-thesis : Code for Yun Wang's PhD Thesis: Polyphonic Sound Event Detection with Weak Labeling
  • amsehili/auditok : An audio/acoustic activity detection and audio segmentation tool
  • athena-team/athena-signal : Athena-signal is an open-source implementation of speech signal processing algorithms. It aims to help researchers and engineers who want to use speech signal processing algorithms in their own projects. Athena-signal is mainly implemented in C and called from Python.
  • SIP-Lab/CNN-VAD : A Convolutional Neural Network based Voice Activity Detector for Smartphones
  • filippogiruzzi/voice_activity_detection : Voice Activity Detection based on Deep Learning & TensorFlow
  • nicklashansen/voice-activity-detection : Voice Activity Detection (VAD) using deep learning.
  • marianne-m/brouhaha-vad : Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation (2023)
  • Picovoice/cobra : On-device voice activity detection (VAD) powered by deep learning.
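  • iic/speech_fsmn_vad_zh-cn-16k-common-pytorch : Deep-FSMN for large vocabulary continuous speech recognition. FSMN-Monophone VAD is an efficient voice endpoint detection model proposed by the DAMO Academy speech team. It detects the start and end times of valid speech in the input audio and passes the detected segments to the recognition engine, reducing recognition errors caused by non-speech audio.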
  • Revai/reverb-diarization-pipeline-v2 : This repository contains 2 new speaker diarization models built upon the PyAnnote framework. These models are trained and intended for use with an ASR system (speaker-attributed ASR).
  • pyannote/speaker-diarization-3.1 : pyannote.audio is an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it comes with state-of-the-art pretrained models and pipelines that can be further finetuned on your own data for even better performance.
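
For readers new to the topic, the endpoint detection that several entries above describe (finding the start and end times of speech in an audio stream) can be illustrated with a minimal energy-threshold sketch in plain Python. This is not the method of any repository listed here. The function names, the 20 ms frame length, and the 0.01 threshold are all assumptions chosen for illustration, and the input is a synthetic tone rather than real speech.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (hypothetical helper)."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech(samples, sample_rate, frame_ms=20, threshold=0.01):
    """Return (start_sec, end_sec) segments whose frame energy exceeds the threshold.

    frame_ms and threshold are illustrative defaults, not tuned values.
    """
    frame_len = sample_rate * frame_ms // 1000
    energies = frame_energies(samples, frame_len)
    segments, start = [], None
    for idx, energy in enumerate(energies):
        t = idx * frame_len / sample_rate  # start time of this frame in seconds
        if energy > threshold and start is None:
            start = t  # activity begins
        elif energy <= threshold and start is not None:
            segments.append((start, t))  # activity ends
            start = None
    if start is not None:
        # Close a segment that is still active at the end of the signal.
        segments.append((start, len(energies) * frame_len / sample_rate))
    return segments


# Synthetic test signal at 16 kHz: 0.5 s silence, 0.5 s of a 440 Hz tone, 0.5 s silence.
rate = 16000
silence = [0.0] * (rate // 2)
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 2)]
segments = detect_speech(silence + tone + silence, rate)
print(segments)  # [(0.5, 1.0)]
```

A fixed energy threshold like this tends to break down in noisy recordings, which is one reason the statistical and neural approaches in the list exist.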