This repository contains the reference code for the paper "Rethinking Surgical Captioning: End-to-End Window-Based MLP Transformer Using Patches (MICCAI 2022)"
If you find this repo useful, please cite our paper.
- Python 3
- PyTorch 1.3+ (along with torchvision)
- cider (already added as a submodule)
- coco-caption (already added as a submodule; remember to follow the initialization steps in coco-caption/README.md)
- yacs
- lmdbdict
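To quickly verify that your environment meets the PyTorch requirement, a minimal check like the following can help (a generic sketch, not specific to this repo):

```python
# Minimal environment check: confirm PyTorch 1.3+ and that torchvision imports.
import torch
import torchvision

major, minor = (int(v) for v in torch.__version__.split('.')[:2])
assert (major, minor) >= (1, 3), f"PyTorch 1.3+ required, found {torch.__version__}"
print(f"torch {torch.__version__}, torchvision {torchvision.__version__}")
```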
If you have difficulty running the training scripts in tools/, you can try installing this repo as a Python package:
$ python -m pip install -e .
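After installing, the repo's package should be importable from anywhere. A quick sanity check (assuming the package directory is `captioning`, as referenced in the model section below; adjust if your checkout differs):

```python
# Sanity check after `pip install -e .` (the package name `captioning` is
# inferred from the repo layout described below).
import captioning
import captioning.models
print(captioning.__file__)
```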
- DAISI Dataset
Since we are not allowed to release the dataset, please request access from the DAISI dataset creators (The AI-Medic: An Artificial Intelligent Mentor for Trauma Surgery). Note that we use the cleaned DAISI dataset from the following work: Surgical Instruction Generation with Transformers.
- EndoVis18 Dataset
Please download the images from endovissub2018-roboticscenesegmentation and the caption annotations from CIDACaptioning.
Please follow ImageCaptioning/data/README to preprocess the data.
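If the preprocessing step produces LMDB feature files (lmdbdict is listed in the requirements), a quick sanity check could look like the sketch below; the path is a placeholder, not an actual filename from this repo:

```python
# Sketch: inspect a preprocessed LMDB with lmdbdict. The path below is a
# placeholder; use whatever file the preprocessing step actually writes.
from lmdbdict import lmdbdict

d = lmdbdict('data/daisi_att.lmdb', mode='r')
keys = list(d.keys())
print(f'{len(keys)} entries; first key: {keys[0]!r}')
```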
Our code is built on top of ImageCaptioning. We add our models (Swin_TranCAP, SwinMLP_TranCAP, Video_Swin_TranCAP, and Video_SwinMLP_TranCAP) to their captioning/models/ and add the related dataloader files.
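For readers unfamiliar with the window-based MLP idea behind these models, here is a minimal, self-contained sketch (not the authors' exact implementation; names and sizes are illustrative): the feature map is partitioned into non-overlapping windows, and the tokens inside each window are mixed by a shared spatial MLP.

```python
# Illustrative sketch of window-based MLP token mixing (not the repo's code).
import torch
import torch.nn as nn

class WindowMLPMixing(nn.Module):
    def __init__(self, window_size):
        super().__init__()
        self.ws = window_size
        # One linear layer that mixes the ws*ws tokens inside each window,
        # applied identically to every window and every channel.
        self.spatial_mlp = nn.Linear(window_size * window_size,
                                     window_size * window_size)

    def forward(self, x):  # x: (B, H, W, C), with H and W divisible by ws
        B, H, W, C = x.shape
        ws = self.ws
        # Partition into non-overlapping ws x ws windows: (num_windows*B, ws*ws, C)
        x = x.reshape(B, H // ws, ws, W // ws, ws, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)
        # Mix tokens within each window: apply the MLP over the token axis.
        x = self.spatial_mlp(x.transpose(1, 2)).transpose(1, 2)
        # Reverse the window partition back to (B, H, W, C).
        x = x.reshape(B, H // ws, W // ws, ws, ws, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)
        return x

feats = torch.randn(2, 8, 8, 96)                     # (batch, height, width, channels)
print(WindowMLPMixing(window_size=4)(feats).shape)   # torch.Size([2, 8, 8, 96])
```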
Our training config files can be found in the configs folder.
Please run
$ python tools/train_vision_transformer.py --cfg configs/daisi/transformer/SwinMLP_TranCAP_L.yml --id daisi_SwinMLP_TranCAP
Similarly, you can run other models by using our provided config files.
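Since yacs is a listed requirement, a provided config can presumably be loaded and inspected before launching training, as in this sketch (I'm assuming the YAML files are plain yacs configs; the keys inside each file depend on the config):

```python
# Sketch: load and print a provided training config with yacs.
from yacs.config import CfgNode as CN

with open('configs/daisi/transformer/SwinMLP_TranCAP_L.yml') as f:
    cfg = CN.load_cfg(f)
print(cfg)
```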
We thank the following repos for providing helpful components/functions used in our work: neuraltalk2, ImageCaptioning.