
"Train a simple deep CNN on the CIFAR10 small images dataset." after a while it gets worse #7384

Closed
ehfo0 opened this issue Jul 20, 2017 · 6 comments

Comments

@ehfo0

ehfo0 commented Jul 20, 2017

I ran the code; it went as high as 0.79 accuracy within 100 epochs, but then it started to get worse and the loss went back to where it started (1.8). How do I prevent that?

@mrTsjolder
Contributor

Accuracy on the validation set or on the training set?

@ehfo0
Author

ehfo0 commented Jul 21, 2017

These are the results after 200 epochs:
loss: 1.2848 - acc: 0.5930 - val_loss: 1.3541 - val_acc: 0.5902
but these were the results after 80 epochs:
loss: 0.7646 - acc: 0.7389 - val_loss: 0.6494 - val_acc: 0.7852

@mrTsjolder
Contributor

Are you running on the Theano, TensorFlow, or CNTK backend?
What device are you running the computations on?
Did you use data augmentation?
Did you change anything in the code?

@ehfo0
Author

ehfo0 commented Jul 21, 2017

TensorFlow 1.2
laptop GeForce GT970x
16 GB of RAM
but it says

Total memory: 3.00GiB
Free memory: 2.48GiB

and no, I didn't change anything in the code.

@mrTsjolder
Contributor

mrTsjolder commented Jul 21, 2017

I started a run using the Theano backend and I am getting similar results.
I do not see any problems in the code of the example, and the implementation of RMSprop looks fine to me as well, so it might just be a problem inherent to RMSprop.

From the code, it seems that this might be caused by vanishing gradients: if g ≈ 0, new_a converges to 0.9 ** iterations, which already gets quite close to machine precision (≈1.19e-7 for 32-bit floats) for iterations = 100. Dividing by this small number might then lead to numerical instabilities that cause such issues.
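
Here is a toy sketch of that decay (the accumulator formula is the one from Keras' rmsprop, new_a = rho * a + (1 - rho) * g**2; the constant-zero gradient and the starting value of the accumulator are artificial assumptions, just to show the behaviour):

```python
import numpy as np

# RMSprop accumulator update as in the Keras optimizer:
#   new_a = rho * a + (1 - rho) * g**2
#   new_p = p - lr * g / (sqrt(new_a) + epsilon)
rho = np.float32(0.9)
epsilon = np.float32(1e-8)

a = np.float32(1.0)   # accumulator; assume it starts around 1
g = np.float32(0.0)   # pretend the gradient has (almost) vanished

for t in range(1, 201):
    a = rho * a + (1 - rho) * g ** 2   # with g == 0 this is just a *= rho
    if t in (50, 100, 200):
        print(f"iter {t:3d}: a = {a:.3e}, sqrt(a) + eps = {np.sqrt(a) + epsilon:.3e}")

# a decays geometrically (0.9 ** t) while the gradient stays near zero, so the
# accumulator soon stops reflecting any real gradient history; the next
# non-zero gradient is then divided by a tiny, essentially arbitrary number.
```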

This being said, I just realised that the default value for the numerical constant epsilon = 1e-8 is useless when using 32-bit floats, since it is smaller than the float32 machine precision. Maybe a better choice for epsilon might resolve this issue.
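
A quick check of that point (the comparison against a value of order one is only meant to illustrate float32 rounding):

```python
import numpy as np

# float32 machine epsilon is ~1.19e-7, so an epsilon of 1e-8 is lost to
# rounding as soon as it is added to anything of order one:
print(np.finfo(np.float32).eps)                               # 1.1920929e-07
print(np.float32(1.0) + np.float32(1e-8) == np.float32(1.0))  # True
```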

Maybe you can try another optimizer to see whether this is indeed a problem with rmsprop. However, it seems that all adaptive learning-rate optimizers use the same default epsilon, so they might run into the same problems. Therefore, it might be wiser to simply set epsilon to something like 1e-7 or maybe even 2e-7, as sketched below.
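
For example, a drop-in change to the compile step of the cifar10_cnn.py example (assuming Keras 2.0.x, where RMSprop and Adam both accept an epsilon argument; `model` here is the Sequential model built earlier in that example):

```python
from keras.optimizers import RMSprop, Adam

# The example creates the optimizer as keras.optimizers.rmsprop(lr=0.0001, decay=1e-6);
# here we pass a larger epsilon explicitly to test the hypothesis:
opt = RMSprop(lr=0.0001, decay=1e-6, epsilon=1e-7)

# ...or swap in a different adaptive optimizer for comparison
# (note that Adam's default epsilon is also 1e-8):
# opt = Adam(lr=0.0001, epsilon=1e-7)

# `model` is the Sequential model defined earlier in cifar10_cnn.py
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
```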

PS: I found this similar issue for torch

@stale

stale bot commented Oct 19, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

stale bot added the stale label on Oct 19, 2017
stale bot closed this as completed on Nov 18, 2017