VGGish

VGGish features come from a pretrained CNN by Google (research paper). Apple has a clear, accessible explanation.

They benchmark their approach against Audio Set (note: it also contains animal sounds!). It seems to consist only of tags attached to YouTube videos?

AED: Acoustic Event Detection
VGGish: Seems to be a set of features from a pretrained CNN? Not sure, but the repo is linked here.
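To make "features from a pretrained CNN" a bit more concrete: as far as I understand the VGGish release, the model emits one 128-dimensional embedding per roughly 0.96 s of audio, so a clip becomes a (frames, 128) matrix. The sketch below averages those frames into a single clip-level vector. Mean pooling and the name `pool_clip` are my own illustrative assumptions, not necessarily what this project ends up doing.

```python
import numpy as np

# Size of a single VGGish embedding (one per ~0.96 s of audio).
EMBEDDING_DIM = 128


def pool_clip(frames):
    """Average per-frame embeddings into one clip-level vector.

    `frames` is expected to have shape (n_frames, 128).
    Mean pooling is an illustrative choice, not the project's confirmed method.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 2 or frames.shape[1] != EMBEDDING_DIM:
        raise ValueError(
            f"expected shape (n_frames, {EMBEDDING_DIM}), got {frames.shape}"
        )
    # Collapse the time axis, keeping the 128 feature dimensions.
    return frames.mean(axis=0)
```

A 10-frame clip passed to `pool_clip` comes back as a vector of length 128, which is convenient as input to a simple classifier.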

Where do I find the code?

Currently working on data preprocessing in data.ipynb.

Download the OpenMIC-2018 dataset and add it as a subfolder of data (data/openmic-2018/all/goes/here).
Make sure Docker is up and running, then open the project in Visual Studio Code.
VS Code should prompt you to reopen the project in the dev container (.devcontainer).
Accept.
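Once the container is running, a quick check that the dataset landed in the right place can save some confusion. The sketch below assumes the archive was unpacked so that `openmic-2018.npz` sits directly in `data/openmic-2018/`; the helper name `load_openmic` is hypothetical and not part of this repo. If I read the OpenMIC-2018 documentation correctly, the file should contain the arrays `X`, `Y_true`, `Y_mask` and `sample_key`, but the code simply returns whatever arrays it finds.

```python
from pathlib import Path

import numpy as np

# Assumption: the dataset was unpacked into data/openmic-2018/ in the project root.
DATA_DIR = Path("data") / "openmic-2018"


def load_openmic(data_dir=DATA_DIR):
    """Load every array stored in openmic-2018.npz into a plain dict."""
    npz_path = Path(data_dir) / "openmic-2018.npz"
    if not npz_path.is_file():
        raise FileNotFoundError(
            f"{npz_path} not found; download OpenMIC-2018 and unpack it under data/"
        )
    # allow_pickle=True because the sample keys are stored as an object array.
    with np.load(npz_path, allow_pickle=True) as npz:
        return {key: npz[key] for key in npz.files}
```

Calling `load_openmic()` from the project root and printing the shape of each returned array is an easy way to confirm the setup before opening data.ipynb.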
