3MAS

This code is associated with the article "3MAS: A MULTITASK, MULTILABEL, MULTIDATASET AUDIO SEGMENTATION MODEL"

This work was done during the 2023 JSALT Workshop in Le Mans.

Collaborators

This work is a collaboration between Alexis Plaquet (IRIT), Pablo Gimeno (Unizar), and Martin Lebourdais (LIUM).

Getting Started

Prerequisites

To run the code, you will need:

Python 3.9
pyannote.audio, preferably the latest commit on the develop branch (latest verified to work: f393546)

Any additional libraries specified in requirements.txt (coming soon, mainly torchmetrics)
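As a quick sanity check of the prerequisites above, the following minimal sketch reports the running Python version and whether the listed packages are installed. The function name check_prerequisites is hypothetical and not part of this repository, and the package names are assumed to match their installed distribution names.

```python
import sys
from importlib import metadata


def check_prerequisites(packages=("pyannote.audio", "torchmetrics")):
    """Map each requirement to its installed version, or None if it is missing.

    Hypothetical helper: the package names are assumed to match the names
    under which the distributions were installed.
    """
    # Record the interpreter version as "major.minor", e.g. "3.9".
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in check_prerequisites().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

The sketch only reports version strings. It cannot tell whether pyannote.audio was installed from a particular commit of the develop branch.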

Installation

Clone this repository.
Install the required packages with pip install -r requirements.txt in your preferred environment.

Usage

This repository relies heavily on pyannote for its structure, so those used to working with it should find the pipeline familiar.

Training

Prepare a pyannote database with your favorite datasets.
Create a path.py file containing:

PATH_TO_PYANNOTE_DB = "PATH_TO_YOUR_database.yaml"
PATH_TO_NOISE = "PATH_TO_THE_NOISE_AUG_DIR"
PATH_TO_MUSIC = "PATH_TO_THE_NOISE_MUS_DIR"
PATH_TO_DATA_HUB = "PATH_TO_THE_MAIN_DATA_DIR"
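Before launching a long training run, it can help to confirm that every entry in path.py points to something that exists. The following is a minimal sketch, not part of this repository: the helper missing_paths is hypothetical, and it assumes all four variables hold filesystem paths.

```python
from pathlib import Path

# The four variable names that path.py is expected to define.
REQUIRED = (
    "PATH_TO_PYANNOTE_DB",
    "PATH_TO_NOISE",
    "PATH_TO_MUSIC",
    "PATH_TO_DATA_HUB",
)


def missing_paths(settings):
    """Return the names in REQUIRED that are unset or do not exist on disk.

    `settings` is any mapping from variable name to path string,
    for example vars(path) after `import path`.
    """
    problems = []
    for name in REQUIRED:
        value = settings.get(name)
        if not value or not Path(value).exists():
            problems.append(name)
    return problems


# Usage, assuming path.py is importable from the working directory:
#   import path
#   print(missing_paths(vars(path)))  # an empty list means all paths exist
```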

Use the script train_full.py

Example:

Train the 4-class system on a data protocol named X.Segmentation.Main. The default model is a WavLM + TCN, and the best window duration observed is 4 seconds.

python3 train_full.py --model_typ tcn --dataset X.Segmentation.Main --duration 4.0 --name dummy_output_model_name

To train only an overlapped speech detector, for example on the DIHARD III corpus:

python3 train_full.py --model_typ tcn --dataset DIHARD.SpeakerDiarization.Full --duration 4.0 --name dummy_output_model_name_overlap --classes ov
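To launch several training runs (for instance, to compare window durations), the commands above can be assembled programmatically. The sketch below is an illustration only: build_train_command is a hypothetical helper, it uses only the flags shown in the examples above, and its default values mirror those examples rather than the script's actual defaults.

```python
import shlex


def build_train_command(dataset, name, duration=4.0, model_typ="tcn", classes=None):
    """Assemble the argument list for one train_full.py run.

    Hypothetical helper: the defaults copy the example commands in this
    README and may differ from the defaults of train_full.py itself.
    """
    cmd = [
        "python3", "train_full.py",
        "--model_typ", model_typ,
        "--dataset", dataset,
        "--duration", str(duration),
        "--name", name,
    ]
    # --classes is passed only when a subset of classes is requested (e.g. "ov").
    if classes is not None:
        cmd += ["--classes", classes]
    return cmd


if __name__ == "__main__":
    # Print one command per window duration; the durations are arbitrary examples.
    for duration in (2.0, 4.0, 6.0):
        cmd = build_train_command(
            "X.Segmentation.Main", f"model_{duration}s", duration=duration
        )
        print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) to execute the run instead of printing it.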

Predict

Use the script predict_full.py

To evaluate the model dummy_output_model_name on the test partition of X.Segmentation.Main:

python3 predict_full.py --dataset X.Segmentation.Main --name output_results dummy_output_model_name.ckpt

NMF

This code is also associated with the paper "Explainable by-design Audio Segmentation through Non-Negative Matrix Factorization and Probing", by Martin Lebourdais, Théo Mariotte, Antonio Almudévar, Marie Tahon, and Alfonso Ortega.

The probes and visualisation notebooks used for the article are available in the NMF_visualisation folder.
