3MAS

This code is associated with the article "3MAS: A MULTITASK, MULTILABEL, MULTIDATASET AUDIO SEGMENTATION MODEL"

This work was done during the 2023 JSALT Workshop in Le Mans.

Collaborators

This work is a collaboration between Alexis Plaquet (IRIT), Pablo Gimeno (Unizar), and Martin Lebourdais (LIUM).

Getting Started

Prerequisites

To run the code, you will need:

Python 3.9
pyannote.audio, preferably the latest commit on the develop branch (latest commit verified to work: f393546)

Any additional libraries specified in requirements.txt (coming soon, mainly torchmetrics)

Installation

Clone this repository.
Install the required packages into your preferred environment with pip install -r requirements.txt

Usage

This repository relies heavily on pyannote for its structure, so those used to working with it should find the pipeline familiar.

Training

Prepare a pyannote database with your favorite datasets
Create a path.py file containing:

PATH_TO_PYANNOTE_DB = "PATH_TO_YOUR_database.yaml"
PATH_TO_NOISE = "PATH_TO_THE_NOISE_AUG_DIR"
PATH_TO_MUSIC = "PATH_TO_THE_NOISE_MUS_DIR"
PATH_TO_DATA_HUB = "PATH_TO_THE_MAIN_DATA_DIR"
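A wrong entry in path.py tends to surface only once training has started, so it can help to check the four constants first. The following is a minimal sketch, not part of the repository: the helper name missing_paths is hypothetical, and it assumes that each constant should point at a file or directory that already exists.

```python
from pathlib import Path

# The four constants this README asks you to define in path.py.
EXPECTED = ("PATH_TO_PYANNOTE_DB", "PATH_TO_NOISE", "PATH_TO_MUSIC", "PATH_TO_DATA_HUB")


def missing_paths(settings: dict) -> list:
    """Return the names from EXPECTED that are undefined or point at nothing on disk."""
    missing = []
    for name in EXPECTED:
        value = settings.get(name)
        # Assumption: every constant must refer to an existing file or directory.
        if value is None or not Path(value).exists():
            missing.append(name)
    return missing
```

For example, running `import path; print(missing_paths(vars(path)))` from the repository root would list the constants that still need attention, or an empty list if all four resolve.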

Use the script train_full.py

Example:

Train the 4-class system on a data protocol named X.Segmentation.Main. The default model is WavLM + TCN, and the best window duration observed is 4 seconds.

python3 train_full.py --model_typ tcn --dataset X.Segmentation.Main --duration 4.0 --name dummy_output_model_name

To train only an overlapped speech detector, for example on the DIHARD III corpus:

python3 train_full.py --model_typ tcn --dataset DIHARD.SpeakerDiarization.Full --duration 4.0 --name dummy_output_model_name_overlap --classes ov
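When launching several trainings, the two commands above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in this README (--model_typ, --dataset, --duration, --name, --classes). The helpers build_train_command and run_training are hypothetical and not part of the repository, and the sketch assumes that omitting --classes selects the 4-class system, as in the first example.

```python
import subprocess
import sys


def build_train_command(dataset, name, duration=4.0, model_typ="tcn", classes=None):
    """Assemble the train_full.py command line as a list of arguments."""
    cmd = [
        sys.executable, "train_full.py",
        "--model_typ", model_typ,
        "--dataset", dataset,
        "--duration", str(duration),
        "--name", name,
    ]
    if classes is not None:
        # Only pass --classes when restricting training, e.g. "ov" for overlap.
        cmd += ["--classes", classes]
    return cmd


def run_training(**kwargs):
    """Run one training from the repository root; raises if the script fails."""
    subprocess.run(build_train_command(**kwargs), check=True)
```

Calling `run_training(dataset="DIHARD.SpeakerDiarization.Full", name="dummy_output_model_name_overlap", classes="ov")` would then reproduce the second command with the current Python interpreter.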

Predict

Use the script predict_full.py

To evaluate the model dummy_output_model_name on the test partition of X.Segmentation.Main:

python3 predict_full.py --dataset X.Segmentation.Main --name output_results dummy_output_model_name.ckpt

NMF

This code is also associated with the paper "Explainable by-design Audio Segmentation through Non-Negative Matrix Factorization and Probing", by Martin Lebourdais, Théo Mariotte, Antonio Almudévar, Marie Tahon, and Alfonso Ortega.

The probes and visualisation notebooks used for the article are available in the NMF_visualisation folder.
