how to define the "converge" of the training loss #58
Comments
We tried using msra2500k as the validation set, but it did not work:
different datasets have different distributions, and we found it hard to
overfit on DUTS-TR. You can plot the curve of -log(loss) and observe the
converging trend. Note: plotting the loss directly won't show the decrease
in the late stage, because most of that decrease comes from the fine
structures, which are usually tiny and not visible in a direct loss plot.
Of course, if you are training the model on your own data, you can simply
add a validation step and also plot -log(loss) to show the trend.
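
As a rough illustration of the suggestion above, here is a minimal sketch that smooths a recorded loss history, converts it to -log(loss), and applies one possible plateau test. The helper names, the smoothing factor, the window size, and the tolerance are all assumptions made for this example; they are not taken from the BASNet training code, and the plateau rule is only one way to make "converged" concrete.

```python
import math


def smooth(losses, beta=0.98):
    """Bias-corrected exponential moving average to damp per-batch noise."""
    avg, out = 0.0, []
    for step, loss in enumerate(losses, start=1):
        avg = beta * avg + (1.0 - beta) * loss
        out.append(avg / (1.0 - beta ** step))
    return out


def neg_log(losses, eps=1e-12):
    """Map each loss to -log(loss); eps guards against log(0)."""
    return [-math.log(max(loss, eps)) for loss in losses]


def has_plateaued(curve, window=1000, tol=1e-3):
    """True when the mean of the last `window` points is less than `tol`
    above the mean of the window before it (hypothetical criterion)."""
    if len(curve) < 2 * window:
        return False
    recent = sum(curve[-window:]) / window
    previous = sum(curve[-2 * window:-window]) / window
    return (recent - previous) < tol


def plot_curve(curve, path="neg_log_loss.png"):
    """Save the curve to an image file; requires matplotlib."""
    import matplotlib
    matplotlib.use("Agg")  # render off-screen so no display is needed
    import matplotlib.pyplot as plt

    plt.figure()
    plt.plot(curve)
    plt.xlabel("iteration")
    plt.ylabel("-log(loss)")
    plt.savefig(path)
    plt.close()
```

In a training loop, one would append each iteration's loss to a list, then call `has_plateaued(neg_log(smooth(history)))` every so often. Because the loss fluctuates, the window should be long enough to average over many batches.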
On Wed, Jun 16, 2021 at 11:25 AM clelouch wrote:
> Thanks for your code and paper.
> I notice that there is no validation set in the training stage, and the
> training process is stopped when the loss converges. I am curious how to
> define "converge" and avoid overfitting, since the loss may fluctuate.
--
Xuebin Qin
PhD
Department of Computing Science
University of Alberta, Edmonton, AB, Canada
Homepage:https://webdocs.cs.ualberta.ca/~xuebin/
|
In addition, a well-defined accuracy metric is recommended, since the loss
doesn't always reflect the exact performance you want. Sometimes the
validation loss increases without the model overfitting. It also depends
on your evaluation metrics.
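
To make the point about metrics concrete, here is a small sketch that scores predictions with mean absolute error (MAE) and tracks the best validation score with a patience counter. MAE is used here only as an example metric, and `MetricTracker` and its `patience` parameter are hypothetical names for this illustration; the thread does not say which metric or stopping rule the authors used.

```python
def mean_absolute_error(pred, target):
    """MAE between two same-sized 2D maps (lists of rows, values in [0, 1])."""
    if len(pred) != len(target):
        raise ValueError("maps must have the same number of rows")
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("rows must have the same length")
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1
    return total / count


class MetricTracker:
    """Remember the best (lowest) score and count checks without improvement."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("inf")
        self.stale_checks = 0

    def update(self, score):
        """Record a new validation score; return True if it is the best so far."""
        if score < self.best:
            self.best = score
            self.stale_checks = 0
            return True
        self.stale_checks += 1
        return False

    @property
    def should_stop(self):
        """True once `patience` consecutive checks have failed to improve."""
        return self.stale_checks >= self.patience
```

A checkpoint would be saved whenever `update` returns True, so the decision rests on the metric you care about rather than on the validation loss.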
|
Thanks for your kind help. |
Yes, every model has one or more optimal input resolutions, which
influence the receptive fields of different layers and lead to different
performance. Since we are all currently limited by computation resources
and time, we usually set these configurations based on experience. It is
hard to give a theoretical explanation.
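
One way to see the interaction between resolution and receptive field: for a fixed stack of layers, the receptive field measured in input pixels does not change, but the fraction of the image it covers shrinks as the input grows. The sketch below uses the standard recurrence for stacked convolution and pooling layers, ignoring dilation. The layer stack and the two input sizes are hypothetical and are not the BASNet configuration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of (kernel, stride) layers.

    Each layer widens the field by (kernel - 1) * jump, where jump is the
    product of the strides of all earlier layers. Dilation is ignored.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def coverage(rf, input_size):
    """Fraction of one image side covered by the receptive field, capped at 1."""
    return min(1.0, rf / input_size)


# Hypothetical stack: conv3x3, pool2x2 (stride 2), conv3x3, pool2x2 (stride 2).
HYPOTHETICAL_STACK = [(3, 1), (2, 2), (3, 1), (2, 2)]

if __name__ == "__main__":
    rf = receptive_field(HYPOTHETICAL_STACK)
    for size in (256, 320):  # illustrative input sizes only
        print(size, rf, coverage(rf, size))
```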
On Wed, Jun 16, 2021 at 11:34 AM clelouch wrote:
> Thanks for your kind help.
> It seems that the TPAMI version of BASNet reports much better performance
> than the CVPR version. I guess the improvement can be attributed to the
> larger input size. Am I right?
|
I guess larger images preserve finer details but require a deeper network to obtain a larger receptive field, so a more powerful GPU is needed to train the model. Implementing a much deeper network with group normalization (GN) might solve the problem, since GN does not require a large batch size.
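
To illustrate why group normalization does not depend on batch size, here is a minimal pure-Python sketch that normalizes a single sample. The statistics are computed per sample over each group of channels, so no other batch element is involved. The function name and the data layout (a list of channels, each a flat list of values) are assumptions for this example, and the learnable scale and shift used in practice are omitted.

```python
from statistics import fmean, pvariance


def group_norm(sample, num_groups, eps=1e-5):
    """Group-normalize one sample given as a list of channels.

    Channels are split into `num_groups` contiguous groups; every value is
    normalized with the mean and variance of its own group.
    """
    num_channels = len(sample)
    if num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    per_group = num_channels // num_groups
    out = []
    for g in range(num_groups):
        channels = sample[g * per_group:(g + 1) * per_group]
        values = [v for channel in channels for v in channel]
        mean = fmean(values)
        var = pvariance(values, mu=mean)
        scale = (var + eps) ** -0.5
        out.extend([[(v - mean) * scale for v in channel] for channel in channels])
    return out
```

Batch normalization, by contrast, averages over the batch dimension, which is why its estimates become noisy when large inputs force a small batch.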
Thanks for your code and paper.
I notice that there is no validation set in the training stage, and the training process is stopped when the loss converges. I am curious how to define "converge" and avoid overfitting, since the loss may fluctuate.