Synthetic RGB-D Fusion (SF) Mask R-CNN for unseen object instance segmentation
S. Back, J. Kim, R. Kang, S. Choi and K. Lee. Segmenting unseen industrial components in a heavy clutter using rgb-d fusion and synthetic data. 2020 IEEE International Conference on Image Processing (ICIP). IEEE, 2020. [Paper] [Video]
Unseen object instance segmentation performance on the WISDOM dataset
Method | Input | Use Synthetic Data | Backbone | mask AP | box AP | Reference |
---|---|---|---|---|---|---|
SD Mask R-CNN | Depth | Yes (WISDOM) | ResNet-35-FPN | 51.6 | - | Danielczuk et al. |
Mask R-CNN | RGB | No | ResNet-35-FPN | 38.4 | - | Danielczuk et al. |
Mask R-CNN | RGB | No | ResNet-50-FPN | 40.1 | 36.7 | Ito et al. |
D-SOLO | RGB | No | ResNet-50-FPN | 42.0 | 39.1 | Ito et al. |
PPIS | RGB | No | ResNet-50-FPN | 52.3 | 48.1 | Ito et al. |
Mask R-CNN | RGB | Yes (Ours) | ResNet-50-FPN | 59.0 | 61.4 | Ours |
Mask R-CNN | Depth | Yes (Ours) | ResNet-50-FPN | 59.6 | 60.4 | Ours |
SF Mask R-CNN (early fusion) | RGB-Depth | Yes (Ours) | ResNet-50-FPN | 55.5 | 57.2 | Ours |
SF Mask R-CNN (late fusion) | RGB-Depth | Yes (Ours) | ResNet-50-FPN | 58.7 | 59.0 | Ours |
SF Mask R-CNN (confidence fusion) | RGB-Depth | Yes (Ours) | ResNet-50-FPN | 60.5 | 61.0 | Ours |
SF Mask R-CNN is an upgraded version of the RGB-D fusion Mask R-CNN with a confidence map estimator [1]. The main differences from [1] are:
- SF Mask R-CNN generates a self-attention map from RGB and inpainted depth ([1] used a validity mask and raw depth).
- This self-attention map serves as a confidence map, so the RGB and depth feature maps are fused with spatial self-attention at four different scales.
- It was fine-tuned on WISDOM-REAL-Train (100 images) and evaluated on the public unseen object instance segmentation dataset, WISDOM ([1] used only a custom industrial dataset).
- SF Mask R-CNN has been released (2020/02/18)
- Train dataset has been released (2022/05/16)
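To illustrate the confidence-fusion idea at a single scale, here is a minimal NumPy sketch. The function name, the 1x1-conv parameterization (`w`, `b`), and the convex-combination form are assumptions made for illustration; in the released model the confidence estimator is learned jointly and the fusion runs at four FPN scales.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def confidence_fusion(rgb_feat, depth_feat, w, b):
    """Fuse RGB and depth feature maps with a spatial confidence map.

    rgb_feat, depth_feat: (C, H, W) feature maps at one scale.
    w, b: parameters of a 1x1 conv mapping the concatenated (2C, H, W)
          features to a single-channel confidence logit per pixel.
    """
    x = np.concatenate([rgb_feat, depth_feat], axis=0)   # (2C, H, W)
    logit = np.tensordot(w, x, axes=([0], [0])) + b      # (H, W)
    conf = sigmoid(logit)                                # spatial attention in [0, 1]
    # Pixels with high confidence lean on RGB features, low on depth.
    return conf * rgb_feat + (1.0 - conf) * depth_feat   # (C, H, W)
```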
- Set up the Anaconda environment
$ conda create -n sfmaskrcnn python=3.7
$ conda activate sfmaskrcnn
$ pip install torch torchvision
$ pip install imgviz tqdm tensorboardX pandas opencv-python imutils pyfastnoisesimd scikit-image pycocotools
$ pip install pyrealsense2 # for demo
- Download the provided SF Mask R-CNN weights pre-trained on our custom dataset.
- Download the WISDOM-Real dataset [Link]
- Set the paths to the dataset and pretrained weights (you can put these into your bash profile)
$ export WISDOM_PATH={/path/to/the/wisdom-real/high-res/dataset}
$ export WEIGHT_PATH={/path/to/the/pretrained/weights}
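Before training or evaluation, it can help to verify those variables point at real directories. A small sketch (the helper `check_env_paths` is hypothetical, not part of this repository):

```python
import os

def check_env_paths(names=("WISDOM_PATH", "WEIGHT_PATH")):
    """Return a status per variable: 'ok', 'unset', or 'missing'."""
    status = {}
    for name in names:
        path = os.environ.get(name)
        if path is None:
            status[name] = "unset"
        elif not os.path.isdir(path):
            status[name] = "missing"
        else:
            status[name] = "ok"
    return status
```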
- Download the synthetic train dataset at GDrive
- Unzip the downloaded dataset and modify the `dataset_path` of the config file accordingly.
To train an SF Mask R-CNN (confidence fusion, RGB and noisy depth as input) on the synthetic dataset:
$ python train.py --gpu 0 --cfg rgb_noisydepth_confidencefusion
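The `rgb_noisydepth_*` configs train on synthetic depth that is corrupted to mimic a real sensor. A minimal NumPy sketch of such an augmentation (Gaussian noise plus random dropout holes) is shown below; this is an illustrative stand-in, not the repository's actual noise model, which relies on pyfastnoisesimd-based noise:

```python
import numpy as np

def corrupt_depth(depth, noise_std=0.005, hole_prob=0.02, rng=None):
    """Simulate sensor noise on a clean synthetic depth map (meters).

    Adds per-pixel Gaussian noise and zeroes out random pixels to mimic
    missing-depth holes, as a stand-in for a real depth sensor's noise.
    """
    rng = np.random.default_rng(rng)
    noisy = depth + rng.normal(0.0, noise_std, size=depth.shape)
    holes = rng.random(depth.shape) < hole_prob
    noisy[holes] = 0.0                      # 0 marks invalid depth
    return np.clip(noisy, 0.0, None)        # depth cannot be negative
```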
To fine-tune the SF Mask R-CNN on the WISDOM dataset:
$ python train.py --gpu 0 --cfg rgb_noisydepth_confidencefusion_FT --resume
To evaluate an SF Mask R-CNN (confidence fusion, RGB and noisy depth as input) on the WISDOM dataset:
$ python eval.py --gpu 0 --cfg rgb_noisydepth_confidencefusion \
--eval_data wisdom \
--dataset_path $WISDOM_PATH \
--weight_path $WEIGHT_PATH/SFMaskRCNN_ConfidenceFusion.tar
To visualize the inference results of SF Mask R-CNN on the WISDOM dataset:
$ python inference.py --gpu 0 --cfg rgb_noisydepth_confidencefusion \
--eval_data wisdom --vis_depth \
--dataset_path $WISDOM_PATH \
--weight_path $WEIGHT_PATH/SFMaskRCNN_ConfidenceFusion.tar
To visualize the inference results on our custom synthetic dataset:
$ python inference.py --gpu 0 --cfg rgb_noisydepth_confidencefusion \
--eval_data synthetic --vis_depth \
--dataset_path examples \
--weight_path $WEIGHT_PATH/SFMaskRCNN_ConfidenceFusion.tar
To run the real-time demo with a RealSense D435:
# SF Mask R-CNN (confidence fusion)
$ python demo.py --cfg rgb_noisydepth_confidencefusion \
--weight_path $WEIGHT_PATH/SFMaskRCNN_ConfidenceFusion.tar
# SF Mask R-CNN (early fusion)
$ python demo.py --cfg rgb_noisydepth_earlyfusion \
--weight_path $WEIGHT_PATH/SFMaskRCNN_EarlyFusion.tar
# SF Mask R-CNN (late fusion)
$ python demo.py --cfg rgb_noisydepth_latefusion \
--weight_path $WEIGHT_PATH/SFMaskRCNN_LateFusion.tar
If you use our work in a research project, please cite our work:
[1] @inproceedings{back2020segmenting,
title={Segmenting unseen industrial components in a heavy clutter using rgb-d fusion and synthetic data},
author={Back, Seunghyeok and Kim, Jongwon and Kang, Raeyoung and Choi, Seungjun and Lee, Kyoobin},
booktitle={2020 IEEE International Conference on Image Processing (ICIP)},
pages={828--832},
year={2020},
organization={IEEE}
}