This repository contains the PyTorch code associated with the paper *Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning*, presented at the SASB workshop at ICASSP 2024.
- Clone the repository and install the requirements using the provided `requirements.txt` or `environment.yml`.
- Then, preprocess your dataset to convert the audio files into mel-spectrograms:

  ```bash
  python wav_to_lms.py /your/local/audioset /your/local/audioset_lms
  ```
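  The script name suggests log-mel spectrograms (`lms`) saved as `.npy` arrays. Below is a minimal sketch of such a conversion, assuming `torchaudio` and illustrative parameters; the actual settings of `wav_to_lms.py` may differ:

  ```python
  # Purely illustrative: the real wav_to_lms.py may use a different
  # sample rate, FFT size, hop length, and number of mel bands.
  import numpy as np
  import torch
  import torchaudio

  def wav_to_log_mel(wav_path: str, npy_path: str) -> None:
      waveform, sr = torchaudio.load(wav_path)       # (channels, samples)
      waveform = waveform.mean(dim=0, keepdim=True)  # mix down to mono
      mel = torchaudio.transforms.MelSpectrogram(
          sample_rate=sr, n_fft=400, hop_length=160, n_mels=80
      )(waveform)                                    # (1, n_mels, frames)
      log_mel = torch.log(mel + 1e-6)                # log-compress, avoid log(0)
      np.save(npy_path, log_mel.numpy())

  wav_to_log_mel("example.wav", "example.npy")
  ```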
- Write the list of files to use as training data in a CSV file:

  ```bash
  cd data
  echo file_name > files_audioset.csv
  find /your/local/audioset_lms -name "*.npy" >> files_audioset.csv
  ```
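  As an illustrative sanity check (not part of the repository), you can verify that the listed files load correctly, assuming the `file_name` header written above and running from the repository root:

  ```python
  # Hypothetical helper: confirm the CSV lists loadable .npy spectrograms.
  import csv
  import numpy as np

  with open("data/files_audioset.csv") as f:
      paths = [row["file_name"] for row in csv.DictReader(f)]

  first = np.load(paths[0])  # each entry should be a precomputed spectrogram
  print(f"{len(paths)} files listed; first array has shape {first.shape}")
  ```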
- You can now start training! We rely on Dora for experiment scheduling. To start an experiment locally, just type:

  ```bash
  dora run
  ```
Under the hood, Hydra is used to handle configurations, so you can override options via the CLI or build your own YAML config files. For example, type:

```bash
dora run data=my_dataset model.encoder.embed_dim=1024
```

to train our model with a larger encoder on your custom dataset.
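As an illustration, a custom dataset config could look like the sketch below; the config group layout and field names here are hypothetical and may not match the ones actually used in this repository:

```yaml
# conf/data/my_dataset.yaml (hypothetical path and fields)
# Placing it in the "data" config group lets Hydra select it
# via `dora run data=my_dataset`.
csv_file: data/files_my_dataset.csv  # list of precomputed .npy spectrograms
batch_size: 256
num_workers: 8
```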
Moreover, you can seamlessly launch SLURM jobs on a cluster thanks to Dora:

```bash
dora launch -p partition-a100 -g 4 data=my_dataset
```
We refer the reader to the documentation of Hydra and Dora for more advanced usage.
Our model is evaluated on eight diverse downstream tasks, covering environmental sound, speech, and music classification. Please refer to our paper for additional details.
Will be available soon...
- This great Lightning+Hydra template
- EVAR for evaluating our representations