Wrong in Keras_train #5
Comments
Yes, you need to run the preprocessing first and run it on that. I did this project last summer and haven't touched it in a while, but I will try to clean it up and provide good instructions / data for running it yourself sometime soon.
On Mon, Apr 23, 2018 at 9:41 AM Jangelaw wrote:
In Keras_train.py line 192: DataGen(), and preprocess.py line 39.
There is nothing in the self.mmdirs. Just wondering what is supposed to be in the mmdir?
Jonathan Sleep
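For context, a minimal sketch of the kind of thing preprocess.py is expected to leave in each mmdir: memory-mapped feature and label arrays for each piece. The file names, shapes, and dtypes below are assumptions for illustration, not the repo's actual layout:

```python
import os
import numpy as np

def write_example_mmdir(mmdir, spec_frames, piano_roll):
    """Write one piece's features/labels as memory-mapped .dat arrays.

    spec_frames: (n_frames, n_bins) float32 spectrogram slices
    piano_roll:  (n_frames, 88)     uint8 note on/off labels
    Names, shapes, and dtypes are illustrative only.
    """
    os.makedirs(mmdir, exist_ok=True)

    x = np.memmap(os.path.join(mmdir, "features.dat"),
                  dtype=np.float32, mode="w+", shape=spec_frames.shape)
    x[:] = spec_frames
    x.flush()

    y = np.memmap(os.path.join(mmdir, "labels.dat"),
                  dtype=np.uint8, mode="w+", shape=piano_roll.shape)
    y[:] = piano_roll
    y.flush()

    # np.memmap stores raw bytes only, so keep the shapes next to the data
    np.save(os.path.join(mmdir, "x_shape.npy"), np.asarray(spec_frames.shape))
    np.save(os.path.join(mmdir, "y_shape.npy"), np.asarray(piano_roll.shape))
```

If the mmdirs are still empty after preprocessing, the most likely cause is that the input audio/MIDI paths the script scans didn't match anything, so nothing was ever written.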
I've already run the preprocessing, but there is nothing in the mmdir.
BTW, what's the correct dtype in the '.dat' file? uint8 or float?
Yes, honestly this repo is not in any state for public use right now. I will try to clean this up soon because I've been getting a lot of interested people contacting me recently.
Jonathan Sleep
I think you've already made a good start. I've done some searching online and I didn't find much useful deep-learning-based code, so if you can improve it, it will be very helpful for other researchers to do some work on it. Recently I'm preparing a paper on AMT and I need to implement the code of the work 'An End-to-End Neural Network for Polyphonic Piano Music Transcription'. Your code is the only one I can find that is close to this work; that's why I keep working on your code :P
BTW, if you have any AMT-related publications, I'd like to cite your work as well :)
Yes, my work is heavily based on that paper as well. I'll send you the link to my Master's thesis once it gets published to my school's digital commons.
I'd also look into John Thickstun's work (MIREX winner / MusicNet Dataset @
University of Washington), as well as what Magenta (Google) is doing.
Jonathan Sleep
California Polytechnic State University
Computer Engineering / Computer Science BS/MSc Class of 2017
Dear Jon,
I tried to use your code to train the model on the MAPS database; however, I met several problems.
1) Based on the paper 'An End-to-End Neural Network for Polyphonic Piano Music Transcription', I was trying to use 210 music pieces as training samples and 60 music pieces as test samples. However, since the data is too large, it causes memory problems, so I have to split the training data into 2 or 3 parts and train on them part by part.
2) During the training stage, I notice that the loss doesn't decrease and sometimes even increases at every epoch, and the accuracy doesn't change after the second epoch.
3) Since the loss and accuracy have some problems, the accuracy on the test set is almost 90%, which I think is totally wrong since the result in that paper is no more than 60%.
I don't know if you have the same issue. I think the problem may be overfitting.
Regards,
Yijun
Yes, the MAPS dataset is pretty hefty to store in memory - even with just the 210 performances. I had 32 GB of RAM but I believe I still had to set up batching to split the data at least in half. You should look at my data generator and adjust it to your RAM requirements accordingly.
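For reference, a minimal sketch of a batching generator along those lines, reading batches from the memory-mapped files so only one batch sits in RAM at a time. It uses keras.utils.Sequence; the file names, shapes, and dtypes are assumptions rather than the repo's actual DataGen:

```python
import numpy as np
from tensorflow.keras.utils import Sequence  # plain `keras.utils.Sequence` in standalone Keras

class MemmapBatchGenerator(Sequence):
    """Yield (x, y) batches from memory-mapped .dat files instead of loading everything."""

    def __init__(self, x_path, y_path, n_frames, n_bins, n_notes=88, batch_size=256):
        # mode="r" keeps the arrays on disk; slices are only read when a batch is requested
        self.x = np.memmap(x_path, dtype=np.float32, mode="r", shape=(n_frames, n_bins))
        self.y = np.memmap(y_path, dtype=np.uint8, mode="r", shape=(n_frames, n_notes))
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        # copy the slice into a normal array and cast labels to float for the loss
        return np.asarray(self.x[sl]), np.asarray(self.y[sl], dtype=np.float32)

# usage (hypothetical paths/shapes):
# gen = MemmapBatchGenerator("features.dat", "labels.dat", n_frames, n_bins)
# model.fit(gen)            # or model.fit_generator(gen) on older Keras
```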
To answer 2 and 3 - my loss went down over time, very rapidly at first and then gradually. The test accuracy from Keras is misleading - and the paper doesn't report accuracy at all either (just precision, recall, and F1). Keras shows the test accuracy being so high because it evaluates it in a way that isn't meaningful for a multi-label problem (which this is). The majority of labels are inactive across the dataset, so they are skewed to be almost always off. A system that guesses 0 all the time for all of these labels (and thus predicts that no notes are ever played) would still score a very high accuracy.
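To make that concrete, here is a small sketch of frame-level precision/recall/F1 for a multi-label piano-roll output - the metrics the paper actually reports - instead of Keras's element-wise accuracy. The 0.5 threshold and the array shapes are assumptions:

```python
import numpy as np

def frame_metrics(y_true, y_prob, threshold=0.5):
    """Frame-level precision/recall/F1 over all (frame, note) cells.

    y_true: (n_frames, 88) binary ground-truth piano roll
    y_prob: (n_frames, 88) sigmoid outputs from the network
    """
    y_pred = (y_prob >= threshold).astype(np.uint8)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# An all-zeros predictor can still score ~90% plain accuracy on a sparse piano roll,
# but its recall and F1 here drop to 0, which is what exposes the problem.
```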
Finally, I'm in the process of cleaning this repo up and making it more usable. It's in a completely unusable state for an outsider right now, so please be patient.
Jonathan Sleep
California Polytechnic State University
Computer Engineering / Computer Science BS/MSc Class of 2017
Dear Jon,
Thanks for your reply.
You are right; right now my loss seems to be normal. But as you said, the accuracy metric is wrong and the system always guesses 0...
Looking forward to your update. I will also try to fix the issue and will let you know if I have any findings.
Many thanks,
Yijun
Is it happening soon? I'm looking at the code and it seems it is not runnable as is (missing variables); I needed to make a few adjustments (updating …)
Any chance we can get an update on this? Think you've made a great start, but it would be great if it were more user-friendly! :)
No update unfortunately; I've been busy at my full-time job and haven't had the drive to program in my off hours. However, I'm currently in the process of switching jobs, and if I get a decent break I will for sure be working on this and making a consumable project. I've had fixing this project up in the back of my head for so long now...
Hey, I have gotten this project to work, but I used Python 3 and refactored the code completely in the process of fixing and understanding it, using a very different coding style, so I'm unsure if I should even open a pull request with such radical changes. When I get home, I'd be glad to take a look at my notes and give you some info on what changes I made in order for it to work. Let me know what you'd prefer, and if enough people are interested and @jsleep agrees, I'll see what I can do.
Christos
Hi czonios, I'm working on a similar project. I'd be interested in seeing your reworked code. Let me know!