
Why does this lr_finder use training loss instead of validation loss? #29

Open
alleno-yu opened this issue Jul 12, 2020 · 7 comments

@alleno-yu

I have looked into the post "Estimating an Optimal Learning Rate For a Deep Neural Network", which suggests using the training loss to determine the best learning rate, or a range of learning rates, to use. However, in the paper "Cyclical Learning Rates for Training Neural Networks", the author uses validation accuracy to find the learning rate range. So, in my humble opinion, lr_finder should evaluate the validation loss after each batch, record it, and then plot validation loss against learning rate.
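
For illustration, the plot would look something like this (a sketch only; lrs and val_losses stand for whatever values get recorded during the range test, they are not names from the library):

import matplotlib.pyplot as plt

# lrs and val_losses are assumed to have been recorded, one value per batch
plt.plot(lrs, val_losses)
plt.xscale('log')                        # learning rates typically grow geometrically
plt.xlabel('learning rate (log scale)')
plt.ylabel('validation loss')
plt.show()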

@surmenok
Owner

I think your point is valid in general. However, we run only one epoch on the training set. For the first epoch, the training loss should be close to the validation loss, provided the training and validation sets are drawn from the same distribution. So the simplified method (which comes from Jeremy Howard's fast.ai course) can still be valid in many cases.
Would you mind creating a pull request that adds an option to use the validation set?

@alleno-yu
Author

alleno-yu commented Jul 12, 2020

Thank you for your response. I'm in the middle of my MSc final project and I'm new to GitHub, but if it's not too late, I can create a pull request after the project. Right now, my approach is very naive. Add validation_data to the __init__ parameters:

def __init__(self, model, validation_data):
    self.model = model
    self.losses = []                        # one recorded loss per batch
    self.lrs = []                           # learning rate used at each batch
    self.best_loss = 1e9                    # large sentinel, updated as losses improve
    self.validation_data = validation_data  # (x, y) tuple held out for evaluation

Then add the following code inside the on_batch_end function:

    # evaluate on the held-out set after each batch
    x, y = self.validation_data
    val_loss = self.model.evaluate(x, y, verbose=0)[0]  # [0] is the loss when metrics are compiled
    loss = val_loss  # use the validation loss in place of the training loss below
    self.losses.append(loss)
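
Usage might then look like this (hypothetical; it assumes an LRFinder-style class with the fields above and a find() entry point that runs training with on_batch_end hooked in):

finder = LRFinder(model, validation_data=(x_val, y_val))
finder.find(x_train, y_train, start_lr=1e-6, end_lr=1.0, batch_size=64)
# finder.losses now holds one validation loss per batch, ready to plot against finder.lrs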

Hope this helps, and thank you again for your contribution!
One last question: should I close this issue?

@surmenok
Owner

Let's keep it open until it's fixed.

@tarasivashchuk
Contributor

tarasivashchuk commented Jul 13, 2020

I might take a look at this today if I have some free time and submit the pull request. That is, unless you have already started and want to finish it yourself, @alleno-yu; let me know.

Otherwise, I think this is fairly trivial. A potential solution would be: instead of running one epoch, decrease the number of steps per epoch to something like 2-10 batches, increase the number of epochs to number of batches // batches per epoch, and then apply essentially the same logic, except using the on_epoch_end method to append the validation loss to the losses list; a rough sketch follows below. Thoughts?
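
A rough sketch of that idea (everything here is hypothetical and simplified; it assumes tf.keras and a model compiled with at least one metric, so that evaluate() returns [loss, ...]):

from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import Callback

class EpochwiseLRFinder(Callback):
    def __init__(self, validation_data, start_lr=1e-6, end_lr=1.0, num_epochs=100):
        super().__init__()
        self.validation_data = validation_data
        self.start_lr = start_lr
        self.lr_mult = (end_lr / start_lr) ** (1.0 / num_epochs)  # geometric growth per epoch
        self.lrs, self.val_losses = [], []

    def on_train_begin(self, logs=None):
        K.set_value(self.model.optimizer.lr, self.start_lr)

    def on_epoch_end(self, epoch, logs=None):
        x_val, y_val = self.validation_data
        val_loss = self.model.evaluate(x_val, y_val, verbose=0)[0]  # [0] is the loss
        lr = float(K.get_value(self.model.optimizer.lr))
        self.lrs.append(lr)
        self.val_losses.append(val_loss)
        K.set_value(self.model.optimizer.lr, lr * self.lr_mult)  # raise lr for the next short epoch

Training would then be launched with something like model.fit(x_train, y_train, steps_per_epoch=5, epochs=num_batches // 5, callbacks=[finder]), so that each "epoch" is only a handful of batches.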

Also, to @surmenok: what do you think the default behavior should be? If you want me to tackle this, I could do some quick testing to gauge the performance and accuracy differences. It would be far from extensive and far from conclusive, but it would be something to go on. Let me know, thanks!

Thanks, guys. I don't have any professional work right now, so I figured I'd contribute to some open-source projects and work on some of my own.

@alleno-yu
Author

@tarasivashchuk I haven't started it, so feel free to fix this issue.

@tarasivashchuk
Contributor

@alleno-yu OK, I'm going to wait to hear back from @surmenok to make sure he's on board with that solution.

@surmenok
Owner

Sorry for the late response.
I think it's totally fine to add support for using a validation set. It should be optional: the user can pass in a validation set, and if none is passed in, the training set is used.
As for the number of epochs, we could make it configurable instead of hardcoding 1.
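
A minimal sketch of that fallback, reusing the fields from the snippets above (hypothetical, not the final implementation; it assumes Keras passes the batch logs, which include the training loss, to the callback):

def on_batch_end(self, batch, logs):
    if self.validation_data is not None:
        # optional path: measure loss on the held-out validation set
        x_val, y_val = self.validation_data
        loss = self.model.evaluate(x_val, y_val, verbose=0)[0]
    else:
        # default path: keep the current behavior and use the training loss
        loss = logs['loss']
    self.losses.append(loss)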
