Welcome to the official repository for the method presented in "LAVT: Language-Aware Vision Transformer for Referring Image Segmentation."
Training command for the manufacturing-environment data
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 train.py --model lavt_one_xlm --dataset aihub_manufact_80 --model_id refcoco_manufact_80_uniq_id --batch-size 4 --lr 0.00005 --wd 1e-2 --swin_type base --pretrained_swin_weights ./pretrained_weights/swin_base_patch4_window12_384_22k.pth --epochs 40 --img_size 480 2>&1 | tee ./models/refcoco_manufact_80_uniq_id/output
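The training command above pipes its log through `tee` into `./models/refcoco_manufact_80_uniq_id/output`. `tee` does not create missing directories, so the log directory needs to exist before training starts. A minimal sketch, assuming the command is run from the repository root (the `MODEL_ID` variable is only an illustrative helper and should match the value passed to `--model_id`):

```shell
# Assumption: run from the repository root.
# MODEL_ID is a hypothetical helper variable; keep it in sync with --model_id.
MODEL_ID=refcoco_manufact_80_uniq_id

# Create the log directory that the `tee` redirect writes into.
# -p makes this a no-op if the directory already exists.
mkdir -p "./models/${MODEL_ID}"
```

If fewer than 8 GPUs are available, `CUDA_VISIBLE_DEVICES` and `--nproc_per_node` in the training command would need to be reduced together so that the number of processes matches the number of visible devices.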
Test command for the manufacturing-environment data
python test.py --model lavt_one_xlm --swin_type base --dataset aihub_manufact_80 --split test --resume ./checkpoints/model_best_refcoco_manufact_80_uniq_id.pth --workers 4 --ddp_trained_weights --window12 --img_size 480
@inproceedings{yang2022lavt,
title={LAVT: Language-Aware Vision Transformer for Referring Image Segmentation},
author={Yang, Zhao and Wang, Jiaqi and Tang, Yansong and Chen, Kai and Zhao, Hengshuang and Torr, Philip HS},
booktitle={CVPR},
year={2022}
}
We appreciate all contributions. You can help the project if you
- report issues you are facing,
- give a 👍 on issues reported by others that are relevant to you,
- answer issues reported by others for which you have found solutions,
- and implement helpful new features or otherwise improve the code with pull requests.
Code in this repository is built upon several public repositories. Specifically,
- data pre-processing leverages the refer repository,
- the backbone model is implemented based on code from Swin Transformer for Semantic Segmentation,
- the training and testing pipelines are adapted from RefVOS,
- and the implementation of the BERT model (files in the bert directory) is from Hugging Face Transformers v3.0.2 (we copied the relevant code into this repository to fix a bug and simplify the installation process).
Some of these repositories in turn adapt code from OpenMMLab and TorchVision. We'd like to thank the authors/organizations of these repositories for open sourcing their projects.
GNU GPLv3