
[Feature] Support BiSeNetV1 #851

Merged: 14 commits merged into open-mmlab:master on Sep 28, 2021
Conversation

@MengzhangLI (Contributor) commented Sep 3, 2021

Implementation of BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation.
Modified from the BiSeNetV1 PyTorch repository.

  • Check inference metrics.
  • Check training metrics.
  • Upload models and logs with correct links.

Future work:
  • Results on more datasets, e.g., CamVid or COCO-Stuff.
  • Results on more backbones, e.g., Xception39 or ResNet101_v1c.
  • Non-real-time vs. real-time comparisons, i.e., differently resized images as inputs.

Notice:
(1) This implementation differs slightly from the original version.
(2) The effect of training tricks on the original paper's numerical results is disputed; see the criticism and the rebuttal.

@MengzhangLI (Contributor, Author) commented:

The inference metric of 75.37 has been aligned with the reference implementation.

CoinCheung repo: [screenshot of evaluation results]

Our re-implementation: [screenshot of evaluation results]

@MengzhangLI added the WIP (work in progress) label on Sep 3, 2021.
@codecov (bot) commented Sep 3, 2021

Codecov Report

Merging #851 (79b00c0) into master (2aa632e) will increase coverage by 0.15%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #851      +/-   ##
==========================================
+ Coverage   89.62%   89.77%   +0.15%     
==========================================
  Files         113      114       +1     
  Lines        6263     6356      +93     
  Branches      989      995       +6     
==========================================
+ Hits         5613     5706      +93     
  Misses        452      452              
  Partials      198      198              
Flag Coverage Δ
unittests 89.77% <100.00%> (+0.15%) ⬆️

Flags with carried-forward coverage won't be shown.

Impacted Files Coverage Δ
mmseg/models/backbones/__init__.py 100.00% <100.00%> (ø)
mmseg/models/backbones/bisenetv1.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@MengzhangLI requested a review from @xvjiarui on Sep 3, 2021.
@MengzhangLI (Contributor, Author) commented Sep 13, 2021

> There seems to be some gap between the FPS of this model and the results in the paper. Can you give the FPS value calculated by benchmark.py?

This PR is still experimental, so I can't answer that yet. If you have measured FPS, please share your results here; that would be very valuable.

@yangshushuaige commented:

> There seems to be some gap between the FPS of this model and the results in the paper. Can you give the FPS value calculated by benchmark.py?
>
> This PR is still experimental, so I can't answer that yet. If you have measured FPS, please share your results here.

We used a Tesla P40 with fp32 to compute inference time. For an inference size of 640×360 the FPS is 51.23; for 1280×720 it is 30.96; for 1920×1080 it is 19.48. The corresponding Titan X results in the paper are 129.4, 47.9, and 23. I am not sure whether this is because the P40's inference speed is significantly lower than the Titan X's; likewise, the P40-based speed still falls short of the results from CoinCheung.

@MengzhangLI (Contributor, Author) commented:

> We used a Tesla P40 with fp32 to compute inference time. For an inference size of 640×360 the FPS is 51.23; for 1280×720 it is 30.96; for 1920×1080 it is 19.48. […]

FYI: I used TensorRT 7.2.3 for inference with an input size of 1024×2048 on a Tesla T4. I originally used TensorRT 7.0.0, which was slightly slower than 7.2.3 (by about 1 FPS). See CoinCheung/BiSeNet#173.

@openmmlab-bot (Collaborator) commented:

Task linked: CU-1f0zyk7 bisenet v1

@yangshushuaige commented:

Hello, thanks for your great work. I have a question about BN and SyncBN: why do you choose SyncBN for the backbone and FCN head but BN for the other parts?

@MengzhangLI (Contributor, Author) commented:

> Hello, thanks for your great work. I have a question about BN and SyncBN: why do you choose SyncBN for the backbone and FCN head but BN for the other parts?

It should be SyncBN by default. You can find norm_cfg in the config file; it replaces those BN layers during training.
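For reference, a minimal sketch of how norm_cfg is typically threaded through an mmsegmentation config (field names follow the usual `_base_` config style; the exact config merged in this PR may differ):

```python
# Illustrative sketch of an mmsegmentation model config, not the exact
# file from this PR. A single norm_cfg is defined once and passed to
# every part of the model, so each ConvModule builds SyncBN (instead of
# plain BN) during distributed training.
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    backbone=dict(
        type='BiSeNetV1',
        backbone_cfg=dict(type='ResNet', depth=18, norm_cfg=norm_cfg),
        norm_cfg=norm_cfg),
    decode_head=dict(type='FCNHead', norm_cfg=norm_cfg))
```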

@MengzhangLI (Contributor, Author) commented Sep 16, 2021

> We used a Tesla P40 with fp32 to compute inference time. […] The P40-based speed still falls short of the results from CoinCheung.

Our benchmark.py is not strictly accurate; the results obtained from it are not very stable, so I suggest not using benchmark.py to test real performance.

But we plan to support fair and accurate FPS benchmarking, especially with TensorRT (for real-time segmentation), in the near future.
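For anyone who wants a rough number in the meantime, here is a minimal timing sketch in plain PyTorch (the model handle and input size are placeholders); the warmup and explicit CUDA synchronization are exactly the things a naive timing loop tends to get wrong:

```python
import time
import torch

def measure_fps(model, input_size=(1, 3, 1024, 2048),
                n_warmup=50, n_iters=200):
    """Rough FPS measurement. Warm up first (cuDNN autotuning, memory
    allocation), then synchronize the GPU around the timed region so
    asynchronous kernel launches are not mistaken for finished work."""
    model = model.cuda().eval()
    x = torch.randn(*input_size, device='cuda')
    with torch.no_grad():
        for _ in range(n_warmup):
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(n_iters):
            model(x)
        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
    return n_iters / elapsed
```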

@yangshushuaige commented:

> It should be SyncBN by default. You can find norm_cfg in the config file; it replaces those BN layers during training.

Thanks for your reply, but I am still a bit confused. Although SyncBN is imported from the config in BiSeNetV1, when ContextPath (line 310) and the FFM (line 313) are defined, the corresponding SyncBN is not passed in, so the default BN in those functions is still used. I noticed this when I checked the logs, even though I got a result of 74.95 with SyncBN in the ResNet/FCN parts and BN elsewhere. I want to know what went wrong. By the way, did you train with batch size 4 on 4 GPUs? Do I need to adjust the learning rate when I use batch size 8?

@MengzhangLI (Contributor, Author) commented Sep 16, 2021

> Although SyncBN is imported from the config in BiSeNetV1, when ContextPath (line 310) and the FFM (line 313) are defined, the corresponding SyncBN is not passed in […]

Both inherit from BaseModule with init_cfg, for example super(ContextPath, self).__init__(init_cfg=init_cfg) (line 152), so SyncBN is used.
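To illustrate the pattern under discussion, here is a simplified, hypothetical stand-in (not the real mmseg source): a sub-module that accepts norm_cfg and forwards it to mmcv's ConvModule picks up SyncBN from the config rather than hard-coding BN:

```python
from mmcv.cnn import ConvModule
from mmcv.runner import BaseModule

class ContextPathSketch(BaseModule):
    """Simplified stand-in for ContextPath, not the actual mmseg class.
    The norm_cfg received from the parent module is forwarded to every
    ConvModule, so the norm layer type (e.g. SyncBN) is decided by the
    config rather than hard-coded here."""

    def __init__(self, in_channels, out_channels,
                 norm_cfg=dict(type='BN'), init_cfg=None):
        super().__init__(init_cfg=init_cfg)
        self.conv = ConvModule(
            in_channels,
            out_channels,
            3,
            padding=1,
            norm_cfg=norm_cfg,  # becomes SyncBN when the config says so
            act_cfg=dict(type='ReLU'))

    def forward(self, x):
        return self.conv(x)
```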

The learning rate could be increased if the batch size doubles, but I think it is flexible; after all, it is better to compare different settings.
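A common rule of thumb here is linear scaling (an assumption on my part, not something verified for this model): multiply the base learning rate by the ratio of total batch sizes.

```python
def scale_lr(base_lr, base_batch_size, new_batch_size):
    """Linear scaling rule: the learning rate grows in proportion to
    the total batch size."""
    return base_lr * new_batch_size / base_batch_size

# Hypothetical numbers: doubling the total batch size from 16 to 32
# doubles the learning rate.
print(scale_lr(0.01, 16, 32))  # -> 0.02
```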

@yangshushuaige commented:

> Both inherit from BaseModule with init_cfg, for example super(ContextPath, self).__init__(init_cfg=init_cfg) (line 152), so SyncBN is used.

I use an old version of mmsegmentation, so I had modified some parts of your code (such as BaseModule). I will now revise it carefully. Thank you for your help.

@Junjun2016 (Collaborator) commented:

Hi @MengzhangLI, please fix the conflicts.

@xvjiarui removed the WIP (work in progress) label on Sep 28, 2021.
@xvjiarui (Collaborator) left a review:

LGTM

@Junjun2016 merged commit ab12009 into open-mmlab:master on Sep 28, 2021.
@MengzhangLI deleted the BiSeNetV1 branch on Feb 1, 2022.
@bowenroom pushed a commit to bowenroom/mmsegmentation that referenced this pull request on Feb 25, 2022:

* First Commit
* fix typos
* fix typos
* Fix assertion bug
* Adding Assert
* Adding Unittest
* Fixing typo
* Uploading models & logs
* Fixing unittest error
* changing README.md
* changing README.md