Wrong in Keras_train #5
Comments
Yes, you need to run the preprocessing first and run it on that. I did this project last summer and haven't touched it in a while, but I will try to clean it up and provide good instructions / data for running it yourself sometime soon.
On Mon, Apr 23, 2018 at 9:41 AM Jangelaw wrote:
In Keras_train.py line 192: DataGen(), and preprocess.py line 39.
There is nothing in the self.mmdirs. Just wondering what is supposed to be in the mmdir?
Jonathan Sleep
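For context, a minimal sketch of the kind of thing preprocess.py is expected to leave in each mmdir: memory-mapped feature and label arrays for each piece. The file names, shapes, and dtypes below are assumptions for illustration, not the repo's actual layout:

```python
import os
import numpy as np

def write_example_mmdir(mmdir, spec_frames, piano_roll):
    """Write one piece's features/labels as memory-mapped .dat arrays.

    spec_frames: (n_frames, n_bins) float32 spectrogram slices
    piano_roll:  (n_frames, 88)     uint8 note on/off labels
    Names, shapes, and dtypes are illustrative only.
    """
    os.makedirs(mmdir, exist_ok=True)

    x = np.memmap(os.path.join(mmdir, "features.dat"),
                  dtype=np.float32, mode="w+", shape=spec_frames.shape)
    x[:] = spec_frames
    x.flush()

    y = np.memmap(os.path.join(mmdir, "labels.dat"),
                  dtype=np.uint8, mode="w+", shape=piano_roll.shape)
    y[:] = piano_roll
    y.flush()

    # np.memmap stores raw bytes only, so keep the shapes next to the data
    np.save(os.path.join(mmdir, "x_shape.npy"), np.asarray(spec_frames.shape))
    np.save(os.path.join(mmdir, "y_shape.npy"), np.asarray(piano_roll.shape))
```

If the mmdirs are still empty after preprocessing, the most likely cause is that the input audio/MIDI paths the script scans didn't match anything, so nothing was ever written.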
I've already run the preprocessing, but there is nothing in the mmdir.
BTW, what's the correct dtype in the '.dat' file? uint8 or float?
Yes, honestly this repo is not in any state for public use right now. I will try to clean this up soon because I've been getting a lot of interested people contacting me recently.
Jonathan Sleep
I think you've already made a good start. I've done some searching online and I didn't find much useful deep-learning-based code, so if you can improve it, it will be very helpful for other researchers to do some work on it. Recently I'm preparing a paper on AMT and I need to implement the code of the work 'An End-to-End Neural Network for Polyphonic Piano Music Transcription'. Your code is the only one I can find that is close to this work; that's why I keep working on your code :P
BTW, if you have any AMT-related publications, I'd like to cite your work as well :)
Yes, my work is heavily based on that paper as well. I'll send you the link to my Master's thesis once it gets published to my school's digital commons.
I'd also look into John Thickstun's work (MIREX winner / MusicNet Dataset @
University of Washington), as well as what Magenta (Google) is doing.
Jonathan Sleep
California Polytechnic State University
Computer Engineering / Computer Science BS/MSc Class of 2017
Dear Jon,
I tried to use your code to train the model on the MAPS database; however, I met several problems.
1) Based on the paper 'An End-to-End Neural Network for Polyphonic Piano Music Transcription', I was trying to use 210 music pieces as training samples and 60 music pieces as test samples. However, since the data is too large, it causes memory problems, so I have to split the training data into 2 or 3 parts and train on them part by part.
2) During the training stage, I notice that the loss doesn't decrease and sometimes even increases at every epoch, and the accuracy doesn't change after the second epoch.
3) Since the loss and accuracy have some problems, the accuracy on the test set is almost 90%, which I think is totally wrong since the result in that paper is no more than 60%.
I don't know if you have the same issue. I think the problem may be overfitting.
Regards,
Yijun
Yes, the MAPS dataset is pretty hefty to store in memory - even with just the 210 performances. I had 32 GB of RAM but I believe I still had to set up batching to split the data at least in half. You should look at my data generator and adjust it to your RAM requirements accordingly.
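For reference, a minimal sketch of a batching generator along those lines, reading batches from the memory-mapped files so only one batch sits in RAM at a time. It uses keras.utils.Sequence; the file names, shapes, and dtypes are assumptions rather than the repo's actual DataGen:

```python
import numpy as np
from tensorflow.keras.utils import Sequence  # plain `keras.utils.Sequence` in standalone Keras

class MemmapBatchGenerator(Sequence):
    """Yield (x, y) batches from memory-mapped .dat files instead of loading everything."""

    def __init__(self, x_path, y_path, n_frames, n_bins, n_notes=88, batch_size=256):
        # mode="r" keeps the arrays on disk; slices are only read when a batch is requested
        self.x = np.memmap(x_path, dtype=np.float32, mode="r", shape=(n_frames, n_bins))
        self.y = np.memmap(y_path, dtype=np.uint8, mode="r", shape=(n_frames, n_notes))
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        # copy the slice into a normal array and cast labels to float for the loss
        return np.asarray(self.x[sl]), np.asarray(self.y[sl], dtype=np.float32)

# usage (hypothetical paths/shapes):
# gen = MemmapBatchGenerator("features.dat", "labels.dat", n_frames, n_bins)
# model.fit(gen)            # or model.fit_generator(gen) on older Keras
```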
To answer 2 and 3 - my loss went down over time, very rapidly at first and then gradually. The test accuracy from Keras is misleading - and the paper doesn't report accuracy at all either (just precision, recall, and F1). Keras shows the test accuracy being so high because it evaluates it in a way that isn't meaningful for a multi-label problem (which this is). The majority of labels are inactive across the dataset, so they are skewed to be almost always off. A system that guesses 0 all the time for all of these labels (and thus predicts that no notes are ever played) would still score a very high accuracy.
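To make that concrete, here is a small sketch of frame-level precision/recall/F1 for a multi-label piano-roll output - the metrics the paper actually reports - instead of Keras's element-wise accuracy. The 0.5 threshold and the array shapes are assumptions:

```python
import numpy as np

def frame_metrics(y_true, y_prob, threshold=0.5):
    """Frame-level precision/recall/F1 over all (frame, note) cells.

    y_true: (n_frames, 88) binary ground-truth piano roll
    y_prob: (n_frames, 88) sigmoid outputs from the network
    """
    y_pred = (y_prob >= threshold).astype(np.uint8)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# An all-zeros predictor can still score ~90% plain accuracy on a sparse piano roll,
# but its recall and F1 here drop to 0, which is what exposes the problem.
```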
Finally, I'm in the process of cleaning this repo up and making it more usable. It's in a completely unusable state for an outsider right now, so please be patient.
Jonathan Sleep
California Polytechnic State University
Computer Engineering / Computer Science BS/MSc Class of 2017
Dear Jon,
Thanks for your reply.
You are right; right now my loss seems to be normal. But as you said, the accuracy metric is wrong and the system always guesses 0...
Looking forward to your update. I will also try to fix the issue and will let you know if I have any findings.
Many thanks,
Yijun
Is it happening soon? I'm looking at the code and it seems it is not runnable as is (missing variables); I needed to make a few adjustments (updating …)
Any chance we can get an update on this? Think you've made a great start, but it would be great if it were more user-friendly! :)
No update unfortunately; I've been busy at my full-time job and haven't had the drive to program in my off hours. However, I'm currently in the process of switching jobs, and if I get a decent break I will for sure be working on this and making a consumable project. I've had fixing this project up in the back of my head for so long now...
Hey, I have gotten this project to work, but I used Python 3 and refactored the code completely in the process of fixing and understanding it, using a very different coding style, so I'm unsure if I should even open a pull request with such radical changes. When I get home, I'd be glad to take a look at my notes and give you some info on what changes I made in order for it to work. Let me know what you'd prefer, and if enough people are interested and @jsleep agrees, I'll see what I can do.
Christos
Hi czonios, I'm working on a similar project. I'd be interested in seeing your reworked code. Let me know!