This repository is for training neural networks and regression models on the music datasets from "Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription" by Boulanger-Lewandowski et al., which appeared in ICML 2012.
We use `pytorch` and `sklearn` for training models. `sacred` is used to log both training sessions and the generation of new music. We use `mido` for converting the piano-roll format to MIDI, and `timidity` is optionally used for converting MIDI to WAV.
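For reference, a piano-roll-to-MIDI conversion with `mido` might look like the sketch below. This is not the repository's actual script (that lives in `src/midi/`); the frame representation, tick duration, and velocity are illustrative assumptions.

```python
# Minimal sketch of a piano-roll-to-MIDI conversion with mido.
# Assumptions: each frame is a set of active MIDI pitch numbers;
# frame_ticks and velocity are illustrative, not the repo's settings.
import mido

def piano_roll_to_midi(roll, path, frame_ticks=240, velocity=80):
    mid = mido.MidiFile()
    track = mido.MidiTrack()
    mid.tracks.append(track)

    sounding = set()
    delta = 0  # ticks elapsed since the last emitted message
    for frame in roll:
        active = set(frame)
        for pitch in sorted(sounding - active):   # notes that just ended
            track.append(mido.Message('note_off', note=pitch, velocity=0, time=delta))
            delta = 0
        for pitch in sorted(active - sounding):   # notes that just started
            track.append(mido.Message('note_on', note=pitch, velocity=velocity, time=delta))
            delta = 0
        sounding = active
        delta += frame_ticks
    for pitch in sorted(sounding):                # silence anything left over
        track.append(mido.Message('note_off', note=pitch, velocity=0, time=delta))
        delta = 0
    mid.save(path)

piano_roll_to_midi([{60, 64, 67}, {60, 64, 67}, {62, 65, 69}], 'out.mid')
```

Converting the resulting MIDI to WAV with `timidity` is then, for example, `timidity out.mid -Ow -o out.wav`.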
The original datasets in MATLAB format can be found in `data/`. The trained models and their metrics can be found in `models/`. The songs generated by the trained models may be found in `music/`; there is a `sacred` setup for logging new music, but there are also a number of older music files in `music/old/`.
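If you want to inspect the original `.mat` datasets directly, something along these lines should work. The file name and the key names are assumptions based on the common distribution of these datasets, not verified against this repository.

```python
# Minimal sketch: peeking at one of the original .mat datasets.
# Assumption: the file follows the usual distribution of these datasets,
# with cell arrays under 'traindata' / 'validdata' / 'testdata'.
from scipy.io import loadmat

data = loadmat('data/Nottingham.mat')  # hypothetical file name
train = data['traindata'][0]           # cell array: one entry per song
first_song = train[0]                  # binary piano-roll matrix per song
print(len(train), first_song.shape)
```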
The source is organized as follows:

- `src/midi/` contains the script for converting between piano-roll and MIDI and for synthesizing new music from trained models.
- `src/neural_nets/` defines and initializes the `pytorch` modules and data loaders to be trained.
- `src/regression/` contains a script for training a logistic regression model to predict the music (see the sketch after this list).
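To give a sense of what the regression script does, a logistic model for next-frame prediction could be set up as below. The random stand-in data and the one-frame context window are assumptions; the actual script in `src/regression/` may differ.

```python
# Minimal sketch of a next-frame logistic model in sklearn. The fake
# piano-rolls and one-frame context are assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(0)
songs = [rng.integers(0, 2, size=(100, 88)) for _ in range(4)]  # stand-in piano-rolls

X = np.concatenate([s[:-1] for s in songs])  # current frames
y = np.concatenate([s[1:] for s in songs])   # next frames to predict

# LogisticRegression needs at least two classes per output, so drop any
# pitch column that never varies in the training data.
varying = y.std(axis=0) > 0
model = MultiOutputClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y[:, varying])
print(model.predict(X[:1]).shape)
```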
The `sacred` scripts for training models and generating music can be found in `src/`. These scripts contain the default configurations for their respective tasks as well as documentation for each of the parameters. Run them with `python train_example.py` and `python music_example.py`; adjustments to the default configurations should be written directly into these scripts.
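These scripts follow the usual `sacred` pattern of a `config` function holding the defaults and an `automain` entry point. A stripped-down sketch is below; all parameter names and values here are illustrative, not the repository's actual defaults.

```python
# Stripped-down sketch of a sacred training script. The parameter names
# and values are illustrative, not the repository's actual configuration.
from sacred import Experiment

ex = Experiment('train_example')

@ex.config
def config():
    dataset = 'Nottingham'   # which dataset in data/ to train on
    hidden_size = 128        # hypothetical model parameter
    epochs = 50
    learning_rate = 1e-3

@ex.automain
def main(dataset, hidden_size, epochs, learning_rate):
    # ... build the model and run the training loop here ...
    print(f'training on {dataset} for {epochs} epochs')
```

Because the experiment captures its configuration, every logged run records the exact parameters it was trained with.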
As of the last update to this README, the best model for the `Nottingham` dataset is in `models/209/` and the best model for the `JSB_Chorales` dataset is in `models/216/`.