Official implementation of A cappella: Audio-visual Singing Voice Separation, British Machine Vision Conference 2021
Project page: ipcv.github.io/Acappella/
Paper: Arxiv, Supplementary Material
BMVC Paper page
We provide simple functions to load models with pre-trained weights. Steps:
- Clone the repo or download y-net>VnBSS>models (models can run as a standalone package)
- Load a model:
from VnBSS import y_net_gr # or from models import y_net_gr
model = y_net_gr()
Examples can be found at y_net>examples. You can also have a look at tcol.py or example.py, the files which compute the demos shown on the website.
To try a fully working demo:
Check out the Y-net-gr Docker branch
sudo docker run -p 8501:8501 --gpus all --rm -ti --ipc=host jfmontgar/acappella-y-net:y_net_gr
@inproceedings{acappella,
author = {Juan F. Montesinos and
Venkatesh S. Kadandale and
Gloria Haro},
title = {A cappella: Audio-visual Singing Voice Separation},
booktitle = {British Machine Vision Conference (BMVC)},
year = {2021},
}
The most difficult part is preparing the dataset, as everything is built upon a very specific format.
Download the code and set your dataset paths at config>dataset_paths.json
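Before launching a run, it can help to check that the paths you wrote in that file actually exist. The sketch below makes no assumption about the file's schema: it walks whatever JSON structure it finds and reports every string value that does not point to an existing file or directory (so a string that is not meant to be a path will also be reported). The function name find_missing_paths is ours and is not part of the repository.

```python
import json
import os


def find_missing_paths(json_file):
    """Return (key_trail, value) pairs for string values that are not existing paths."""
    with open(json_file) as f:
        config = json.load(f)

    missing = []

    def walk(node, trail):
        # Recurse through nested dicts and lists, keeping track of where we are.
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, trail + [str(key)])
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, trail + [str(index)])
        elif isinstance(node, str) and not os.path.exists(node):
            missing.append((".".join(trail), node))

    walk(config, [])
    return missing


# Usage from the repository root (path taken from the instructions above):
# for trail, value in find_missing_paths("config/dataset_paths.json"):
#     print(f"{trail}: {value} does not exist")
```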
To run training:
python run.py -m model_name --workname experiment_name --arxiv_path directory_of_experiments --pretrained_from path_pret_weights
You can inspect the argparse options at default.py>argparse_default.
Possible model names are: y_net_g, y_net_gr, y_net_m, y_net_r, u_net, llcp
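If you launch many experiments, you may prefer to assemble the training command from Python. The following is a minimal sketch that uses only the flags and model names listed above. The helper build_train_command is hypothetical (it is not part of the repository), and it assumes that every flag other than -m can be omitted, as in the default run.

```python
import shlex

MODEL_NAMES = ("y_net_g", "y_net_gr", "y_net_m", "y_net_r", "u_net", "llcp")


def build_train_command(model_name, workname=None, arxiv_path=None, pretrained_from=None):
    """Assemble the `python run.py ...` command as a list of arguments."""
    if model_name not in MODEL_NAMES:
        raise ValueError(f"Unknown model {model_name!r}; expected one of {MODEL_NAMES}")
    command = ["python", "run.py", "-m", model_name]
    optional = {
        "--workname": workname,
        "--arxiv_path": arxiv_path,
        "--pretrained_from": pretrained_from,
    }
    # Only add the flags that were given (assumption: the rest have defaults).
    for flag, value in optional.items():
        if value is not None:
            command += [flag, str(value)]
    return command


command = build_train_command("y_net_gr", workname="my_experiment", arxiv_path="./experiments")
print(shlex.join(command))
# To launch it from the repository root: subprocess.run(command, check=True)
```

Validating the model name up front gives an immediate error for a typo instead of a failure after the data pipeline has started.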
- Go to manuscript_scripts and replace the checkpoint paths with yours in the testing scripts.
- Run: bash manuscript_scripts/test_gr_r.sh
- Replace the paths in manuscript_scripts/auto_metrics.py with your experiment_directory path.
- Run: python manuscript_scripts/auto_metrics.py to visualise results.
The best way to get familiar with the framework is to debug it! Having runnable code helps to see input shapes and dataflow, and lets you run line by line. Download The circle of life demo with the files already processed. It will act as a dataset of 6 samples. You can download it from Google Drive (1.1 Gb).
- Unzip the file
- Run python run.py -m y_net_gr (for example) TODO :D
Everything has been configured to run by default this way.
Each effective model is wrapped by an nn.Module which takes care of computing the STFT, the mask, returning the waveform, etcetera. This wrapper can be found at VnBSS>models>y_net.py>YNet. To get rid of this you can simply inherit the class, take the minimum layers and keep the core_forward method, which is the inference step without the miscellanea.
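The inheritance idea can be illustrated with a toy example. The classes below are stand-ins written in plain Python, not the repository's YNet: only the method name core_forward comes from the description above, while preprocess, postprocess and the arithmetic inside them are placeholders invented for illustration.

```python
class WrappedModel:
    """Stand-in for a wrapper such as YNet: pre-processing, core network, post-processing."""

    def preprocess(self, waveform):
        # Placeholder for the STFT computation done by the real wrapper.
        return [2.0 * x for x in waveform]

    def core_forward(self, features):
        # Placeholder for the network inference step.
        return [x + 1.0 for x in features]

    def postprocess(self, prediction):
        # Placeholder for masking and waveform reconstruction.
        return [x / 2.0 for x in prediction]

    def forward(self, waveform):
        return self.postprocess(self.core_forward(self.preprocess(waveform)))


class CoreOnlyModel(WrappedModel):
    """Subclass that exposes only the inference step."""

    def forward(self, features):
        # Skip pre- and post-processing: the caller supplies ready-made features.
        return self.core_forward(features)


full = WrappedModel().forward([1.0, 2.0])
core = CoreOnlyModel().forward([1.0, 2.0])
```

With the real class you would inherit from YNet in the same way and override forward; check y_net.py for the arguments that core_forward actually expects.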
Acappella's mock-up files can be found in Gdrive.
Audioset's mock-up files can be found in Gdrive.
These files can be used to debug the code with a minimal dataset example.
To download the Acappella Dataset run the script at preproc>preprocess.py
To download the demos used in the website run preproc>demo_preprocessor.py
Audioset can be downloaded via a web app: streamlit run audioset.py
Demos shown in the website can be computed:
- The circle of life demo is obtained by running tcol.py. First set the flag COMPUTE=True. To visualize the results, set the flag COMPUTE=False and run it as streamlit run tcol.py.
- How to change the optimizer's hyperparameters?
Go to config>optimizer.json
- How to change clip duration, video framerate, STFT parameters or audio samplerate?
Go to config>__init__.py
- How to change the batch size or the amount of epochs?
Go to config>hyptrs.json
- How to dump predictions from the training and test set?
Go to default.py. Modify DUMP_FILES (can be controlled at a subset level). The force argument skips the iteration-wise conditions and dumps for every single network prediction.
- Is tensorboard enabled?
Yes, you will find tensorboard records at your_experiment_directory/used_workname/tensorboard
- Can I resume an experiment?
Yes, if you set exactly the same experiment folder and workname, the system will detect it and will resume from there.
- I'm trying to resume but found AssertionError
If there is an exception before running the model
- How to change the amount of layers of U-Net?
U-Net is built dynamically given a list of layers per block, as shown in models>__init__.py, from outer to inner blocks.
- How to modify the default network values?
The json file config>net_cfg.json overwrites any default configuration from the model.
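To make the last answer concrete, here is a minimal sketch of how values read from a JSON file can take precedence over a model's defaults. It is an illustration of the general idea, not the repository's code: the function apply_overrides, the recursive merge of nested dictionaries and all keys and values shown are assumptions.

```python
import copy


def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` where values from `overrides` take precedence."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them whole.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values, for illustration only.
model_defaults = {"n_layers": 4, "dropout": 0.0, "audio": {"channels": 1}}
json_overrides = {"dropout": 0.2, "audio": {"channels": 2}}
merged_config = apply_overrides(model_defaults, json_overrides)
print(merged_config)
```

Keys that are absent from the override file keep their default value, so the JSON file only needs to list what you want to change.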