Unable to reproduce the values in the paper #8
Comments
Hi, unfortunately, training under the PyTorch framework is non-deterministic. A relevant discussion is here: https://discuss.pytorch.org/t/random-seed-initialization/7854/18 Even when we re-run the code there is still some fluctuation, but it is not difficult to get a value higher than 44% (with batch size = 2). Regarding your guesses, the potential reason could be the versions of the libraries. I am re-running my code with batch size = 1, and I hope I can come up with some good results and a random seed to help you reproduce the performance.
--- edit ---
Cheers,
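For anyone who wants to pin down as much of the randomness as possible, a minimal seeding sketch along the usual PyTorch lines is below (the seed value is arbitrary, not a known-good seed for this repo); it reduces, but does not fully remove, the run-to-run variation discussed in the linked thread:

```python
# Minimal reproducibility sketch: seed Python, NumPy and PyTorch, and make
# cuDNN pick deterministic algorithms (at some cost in speed).
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(42)  # call once before building the model and data loaders
```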
Thank you for all the suggestions. I will try them out and see if I can improve the values. Please let me know if you are able to get a value of 44% with batch size = 1.
Hello, I ran the code, but my device is two 1080 GPUs. I would like to know what device you used, and how long it took to train the model. @subeeshvasu
@crazygirl1992 I was using a single GTX TitanX (12 GB). With this setting, for batch size = 1, the training cost was approximately 2 hrs 20 minutes per 1000 iterations.
Thank you very much. And can you achieve the paper's result now? The paper trains for almost 250,000 iterations, so the total time is on the order of 250 x 140 minutes, not just a couple of hours.
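A rough back-of-envelope estimate, assuming the ~2 hrs 20 minutes per 1000 iterations figure above holds for the whole run (just an extrapolation, not a measured number):

```python
# Back-of-envelope: total training time at ~140 minutes per 1000 iterations.
minutes_per_1000_iters = 140        # 2 hrs 20 minutes
total_iters = 250_000               # iteration count mentioned in the paper

total_minutes = total_iters / 1000 * minutes_per_1000_iters
print(f"{total_minutes:.0f} min "
      f"= {total_minutes / 60:.0f} hours "
      f"= {total_minutes / (60 * 24):.1f} days")
# -> 35000 min = 583 hours = 24.3 days
```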
I couldn't get those values. With batch size = 4, one could reproduce the values, I guess!
Hello,
I ran the code "train_dise_gta2city.py" following the procedure explained on this project page. The only change I made was to keep batch_size as 1, to reduce the memory requirement. I got 38.4% mIoU on the val set, which is a big difference compared to the 45.4% reported in the paper. Can you please help me understand the potential reasons behind this performance drop?
Some possible reasons I could guess are the following.
The mIoU scores are computed at a resolution of 512 x 1024, while the original images are of size 1024 x 2048. The paper does not mention the resolution at which the values are reported. Maybe the authors reported the values at a resolution of 1024 x 2048? Just to check whether this is the reason, I used the pretrained weights provided by the authors and got a score of 44.2% for images at resolution 512 x 1024. Therefore, I am assuming that the resolution of the test images is not the reason behind the performance drop (see the evaluation sketch after this list).
As per the paper, the authors used pretrained weights from the PASCAL VOC dataset to initialise the encoder. This could also be a reason for the performance drop. However, even when I start the training scheme from the pretrained weights provided by the authors, the performance eventually goes down and starts to fluctuate around 38-39%.
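To make the resolution point concrete, here is a minimal sketch of what I mean by evaluating at the full 1024 x 2048 label resolution (the function names and the upsampling choice are illustrative, not taken from this repository's evaluation code):

```python
import numpy as np
import torch.nn.functional as F

NUM_CLASSES = 19  # Cityscapes trainIds


def miou(conf_mat: np.ndarray) -> float:
    # Per-class IoU from a confusion matrix, averaged over classes.
    inter = np.diag(conf_mat)
    union = conf_mat.sum(0) + conf_mat.sum(1) - inter
    return float(np.nanmean(inter / np.maximum(union, 1)))


def accumulate(conf_mat, logits, labels, eval_size=(1024, 2048)):
    # logits: (1, NUM_CLASSES, h, w) network output (torch tensor)
    # labels: (1024, 2048) numpy array of trainIds, 255 = ignore
    logits = F.interpolate(logits, size=eval_size, mode="bilinear",
                           align_corners=False)
    pred = logits.argmax(1).squeeze(0).cpu().numpy()
    valid = labels != 255
    idx = NUM_CLASSES * labels[valid].astype(np.int64) + pred[valid]
    conf_mat += np.bincount(idx, minlength=NUM_CLASSES ** 2).reshape(
        NUM_CLASSES, NUM_CLASSES)
    return conf_mat
```

Evaluating the same checkpoint with and without the upsampling step would show whether the 512 x 1024 vs. 1024 x 2048 difference matters here.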
Has anyone succeeded in getting values around 44% upon experimenting with this code?
Regards,
Subeesh