This is the final project I did when I was studying the MIR course in NTHU. I decided to do make a dataset full of screaming vocals samples, because this kind of vocals is less concerned in the MIR field. The making of this dataset may be a starting point for future research.
I pick up 16 songs as listed below from The 'Mixing Secrets' Free Multitrack Download Library, and used audacity to slice vocal tracks into small clips, ended up with 565 of samples.
The Apprehended: 'Still Flyin'
Cnoc An Tursa: 'Bannockburn'
The Complainiacs: 'Etc'
Dark Ride: 'Burning Bridges'
Dark Ride: 'Hammer Down'
Dark Ride: 'Piece Of Me'
Death Of A Romantic: 'The Well'
Decypher: 'Unseen'
Headwound Harry: 'XXXV'
Hollow Ground: 'Ill Fate'
Hollow Ground: 'Left Blind'
Last Legacy: 'Who's Who In Hell'
Storm Of Particles: 'Of Ice And Hopeless Fate'
Timboz: 'Pony'
Titanium: 'Haunted Age'
Wall Of Death: 'Femme'
I named each clips using initials, so the first clips from the track 43_Vox18.wav
in The Apprehended - Still Flyin
is named as TA_SF_43_Vox18_01.wav
, etc. I can't provide the raw audio, but you can use the files in mark_time
which contain the start time and end time of each clips, to manually recreate them.
- annotations: the vocal techniques of each samples, labeled by metal youtuber Edward.
- features: audio features (except chroma related) extracted with librosa.
Require pandas, sklearn, matplotlib and librosa be installed.
- get_features: features extraction from the audio (used to create
features.csv
). Most of the code are borrowed from FMA. - analysis: show the melspectrogram of some samples. Require raw audio files.
- baseline: baseline svm model for vocal techniques recognition.
- ubuntu 16.04 LTS
- python3.5.2 (using Pycharm 2018.1.4)
I also wrote a report in chinese about the making process.