Clustering of multiple-event online sound collections with the codebook approach

This is the accompanying codebase for my Master's thesis in SMC.

Custom dataset: acoustic scenes

https://drive.google.com/open?id=1LzMi7FHU5lUZxLiCl1jefBxIBVZG8xGv6mx4czmhm6Y

freesound_processing_pipeline folder

Given a dataset of feature-vector series, a codebook is generated. The codebook is then used to encode the feature vectors, so that each frame is replaced by its nearest codeword in the feature space. The encoded sequence can be treated like a text, where each codeword is analogous to a natural-language word. In our processing pipeline, TF-IDF is applied to give more weight to less common codewords. Since the output of TF-IDF is a fixed-size vector, a similarity matrix can be computed. The clustering algorithm uses this similarity matrix to obtain the final clusters.
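The steps described above can be sketched roughly as follows. This is a minimal illustration and not the code from this repository: it assumes the codebook is learned with plain k-means, that nearest-codeword lookup uses Euclidean distance, and that similarity is measured with the cosine of the TF-IDF vectors. The function names, the toy data, and the codebook size are all hypothetical.

```python
import numpy as np


def encode(frames, codebook):
    """Replace each frame by the index of its nearest codeword (Euclidean distance)."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)


def build_codebook(frames, n_codewords, n_iter=20, seed=0):
    """Plain Lloyd's k-means over all frames (an assumed way to learn the codebook)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(frames), size=n_codewords, replace=False)
    codebook = frames[start].copy()
    for _ in range(n_iter):
        labels = encode(frames, codebook)
        for k in range(n_codewords):
            members = frames[labels == k]
            if len(members):  # keep the old centroid if a codeword lost all its frames
                codebook[k] = members.mean(axis=0)
    return codebook


def tfidf(counts):
    """Weight codeword counts so that codewords present in few clips count more."""
    tf = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
    df = (counts > 0).sum(axis=0)
    idf = np.log((1 + len(counts)) / (1 + df)) + 1  # smoothed IDF
    return tf * idf


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)
    return unit @ unit.T


# Toy data: 6 clips, each a series of 13-dimensional frames of varying length.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(rng.integers(20, 40), 13)) for _ in range(6)]

n_codewords = 8  # hypothetical codebook size
codebook = build_codebook(np.vstack(clips), n_codewords)

# One fixed-size histogram of codeword counts per clip, whatever its length.
counts = np.array(
    [np.bincount(encode(clip, codebook), minlength=n_codewords) for clip in clips]
)
similarity = cosine_similarity_matrix(tfidf(counts))
```

The point of the histogram step is that clips of different durations end up as vectors of the same length, which is what makes a clip-by-clip similarity matrix possible.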

The code is modular enough to allow different NLP algorithms to be used to produce alternative similarity matrices. For example, histogram intersection was used instead of TF-IDF in a previous iteration. Different clustering techniques are also easy to plug in: k-means was used in an earlier iteration before switching to k-NN graph-based clustering, which proved to be much faster and easier to render in the web browser.
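As an illustration of swapping the similarity measure, histogram intersection between two clips can be computed as the sum of the element-wise minima of their normalised codeword histograms. The sketch below assumes per-clip codeword counts are already available; the function name and the toy counts are hypothetical and are not taken from this repository.

```python
import numpy as np


def histogram_intersection_matrix(counts):
    """Pairwise histogram intersection between rows of a codeword count matrix."""
    # Normalise each clip's counts so that clip length does not dominate.
    hist = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
    # Sum of element-wise minima for every pair of clips.
    return np.minimum(hist[:, None, :], hist[None, :, :]).sum(axis=2)


# Toy example: 3 clips encoded with a 3-codeword codebook.
counts = np.array([
    [4, 0, 1],
    [2, 2, 1],
    [0, 5, 0],
])
similarity = histogram_intersection_matrix(counts)
# similarity[0, 1] is 0.6 and similarity[0, 2] is 0.0
```

Because the result has the same shape and range as a cosine similarity matrix, it can be passed to the clustering step unchanged.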

In this graph, each node corresponds to a Freesound clip.

Clusters in the graph are identified using a Louvain community detection algorithm implementation with the NetworkX Python package.
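A rough sketch of this graph-based step is shown below. It assumes each clip is linked to its k most similar clips and uses `louvain_communities` from NetworkX (available in NetworkX 2.8 and later); the actual code in this repository may build the graph differently or use another Louvain implementation. The clip IDs, the similarity values, and the `knn_graph` helper are made up for illustration.

```python
import networkx as nx
import numpy as np
from networkx.algorithms.community import louvain_communities


def knn_graph(similarity, clip_ids, k):
    """Link every clip to its k most similar clips, weighting edges by similarity."""
    graph = nx.Graph()
    graph.add_nodes_from(clip_ids)
    for i, row in enumerate(similarity):
        order = np.argsort(row)[::-1]  # most similar first
        neighbours = [j for j in order if j != i][:k]
        for j in neighbours:
            graph.add_edge(clip_ids[i], clip_ids[j], weight=float(row[j]))
    return graph


# Toy similarity matrix with two obvious groups of three clips.
similarity = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.0, 0.1],
    [0.9, 1.0, 0.7, 0.0, 0.1, 0.0],
    [0.8, 0.7, 1.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 1.0, 0.9, 0.8],
    [0.0, 0.1, 0.0, 0.9, 1.0, 0.7],
    [0.1, 0.0, 0.1, 0.8, 0.7, 1.0],
])
clip_ids = [101, 102, 103, 201, 202, 203]  # hypothetical Freesound IDs

graph = knn_graph(similarity, clip_ids, k=2)
communities = louvain_communities(graph, weight="weight", seed=0)
```

Keeping only k neighbours per node makes the graph sparse, which is consistent with the note above that the graph is quick to cluster and light to render in the browser.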

Setup

Install dependencies in a virtual environment:

```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Copy the settings.example.py file to settings.py, open it, and follow the instructions. This is where you configure where the audio features of the sound files are located on your computer.
```
cp settings.example.py settings.py
```
This means you need a folder containing the pre-calculated audio features for the sounds in the different datasets. In total there are around 30k sounds (with Freesound IDs in all_sound_ids.json) and 45 datasets (JSON files in datasets/).
You can start the clusterings by typing:
```
python clustering.py
```
This will output some results in the console. (TODO: add stats e.g. num clusters, num sounds, ...). It will also save the clustered graph so that it can be visualised in a 2D representation.
You can start the visualisation web server by typing:
```
python web-visu/start_server.py
```
Then you can access the web app from your browser at: http://localhost:8100/web-visu/.
