Hi, thanks for your code and work.
I read in another issue (#6) that the main training runs for 260 epochs with 3771 samples per epoch. That should be 260 × 3771 / 4 (batch size) ≈ 245K iterations, while pre-training runs for 2M iterations. Why would pre-training take just 4 days but main training take 3 days, as mentioned in the paper, given that each iteration should take approximately the same amount of time?
Am I missing something? I am trying to re-train the network, but 260 epochs seems insufficient. Thanks a lot!
Hi @hkchengrex, thanks for pointing out my mistake in the previous answer.
In fact, pre-training runs for 2M samples, not iterations, which works out to about 500K iterations with a batch size of 4. The training times reported in the paper are rough estimates rather than precise measurements; I am sorry if that caused any misunderstanding. You are right that pre-training takes roughly twice as long as fine-tuning. In our implementation, 260 epochs of fine-tuning is sufficient because we reduce the learning rate at regular intervals.
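For anyone else checking this arithmetic, here is a minimal sketch (assuming the batch size of 4 mentioned above; the numbers come from this thread, not from timing the code) showing why pre-training works out to roughly twice the iterations of fine-tuning:

```python
# Back-of-the-envelope iteration counts (assumption: batch size 4, per this thread).
batch_size = 4

# Fine-tuning (main training): 260 epochs x 3771 samples per epoch.
ft_epochs = 260
ft_samples_per_epoch = 3771
ft_iterations = ft_epochs * ft_samples_per_epoch // batch_size
print(f"fine-tuning: ~{ft_iterations:,} iterations")      # ~245,115

# Pre-training: 2M *samples*, not iterations.
pretrain_samples = 2_000_000
pretrain_iterations = pretrain_samples // batch_size
print(f"pre-training: ~{pretrain_iterations:,} iterations")  # ~500,000
```

With ~500K pre-training iterations versus ~245K fine-tuning iterations, the roughly 4-day vs. 3-day split in the paper is consistent with the rough timing estimates described above.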