Commit 9918cea: update segmentation code and models
LightDXY committed Oct 15, 2021 (parent d8be74a)
Showing 9 changed files with 1,160 additions and 4 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -3,7 +3,7 @@
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/cswin-transformer-a-general-vision/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=cswin-transformer-a-general-vision)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/cswin-transformer-a-general-vision/semantic-segmentation-on-ade20k-val)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k-val?p=cswin-transformer-a-general-vision)

This repo is the official implementation of ["CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"](https://arxiv.org/pdf/2107.00652.pdf). The code and models for downstream tasks are coming soon.
This repo is the official implementation of ["CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"](https://arxiv.org/pdf/2107.00652.pdf).

## Introduction

@@ -45,12 +45,13 @@ CSWin Transformer achieves strong performance on ImageNet classification (87.5 o
| CSwin-T | Semantic FPN | ImageNet-1K | 512x512 | 80K | 48.2 | - | 26M | 202G |
| CSwin-S | Semantic FPN | ImageNet-1K | 512x512 | 80K | 49.2 | - | 39M | 271G |
| CSwin-B | Semantic FPN | ImageNet-1K | 512x512 | 80K | 49.9 | - | 81M | 464G |
| CSwin-T | UPerNet | ImageNet-1K | 512x512 | 160K | 49.3 | 50.4 | 60M | 959G |
| CSwin-S | UperNet | ImageNet-1K | 512x512 | 160K | 50.0 | 50.8 | 65M | 1027G |
| CSwin-B | UperNet | ImageNet-1K | 512x512 | 160K | 50.8 | 51.7 | 109M | 1222G |
| CSwin-T | UPerNet | ImageNet-1K | 512x512 | 160K | 49.3 | 50.7 | 60M | 959G |
| CSwin-S | UperNet | ImageNet-1K | 512x512 | 160K | 50.4 | 51.5 | 65M | 1027G |
| CSwin-B | UperNet | ImageNet-1K | 512x512 | 160K | 51.1 | 52.2 | 109M | 1222G |
| CSwin-B | UPerNet | ImageNet-22K | 640x640 | 160K | 51.8 | 52.6 | 109M | 1941G |
| CSwin-L | UperNet | ImageNet-22K | 640x640 | 160K | 53.4 | 55.7 | 208M | 2745G |

Pretrained models and code can be found at [`segmentation`](segmentation).

## Requirements

79 changes: 79 additions & 0 deletions segmentation/README.md
@@ -0,0 +1,79 @@
# ADE20K Semantic Segmentation with CSWin


## Results and Models

| Backbone | Method | pretrain | Crop Size | Lr Schd | mIoU | mIoU (ms+flip) | #params | FLOPs | config | model | log |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CSWin-T | UPerNet | ImageNet-1K | 512x512 | 160K | 49.3 | 50.7 | 60M | 959G | [`config`](configs/cswin/upernet_cswin_tiny.py) | [model]() | [log]() |
| CSWin-S | UPerNet | ImageNet-1K | 512x512 | 160K | 50.4 | 51.5 | 65M | 1027G | [`config`](configs/cswin/upernet_cswin_small.py) |[model]() | [log]() |
| CSWin-B | UPerNet | ImageNet-1K | 512x512 | 160K | 51.1 | 52.2 | 109M | 1222G | [`config`](configs/cswin/upernet_cswin_base.py) |[model]() | [log]() |


## Getting started

1. Clone the [Swin-Transformer-Semantic-Segmentation](https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation) repository and install the required packages:

```bash
git clone https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation
bash install_req.sh
```

2. Copy the CSWin configs and backbone file into the corresponding mmseg folders:

```bash
cp -r configs/cswin <MMSEG_PATH>/configs/
cp configs/_base_/upernet_cswin.py <MMSEG_PATH>/configs/_base_/models/
cp backbone/cswin_transformer.py <MMSEG_PATH>/mmseg/models/backbones/
cp mmcv_custom/checkpoint.py <MMSEG_PATH>/mmcv_custom/
```
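Depending on the mmseg version, the copied backbone may also need to be registered before the configs can resolve it. A sketch of the edit to `<MMSEG_PATH>/mmseg/models/backbones/__init__.py` (the exported class name `CSWin` is an assumption; use whatever `backbone/cswin_transformer.py` actually exports):

```python
# Sketch: expose the copied backbone so mmseg's registry can find it.
# `CSWin` is assumed; check the class name in cswin_transformer.py.
from .cswin_transformer import CSWin

__all__ = [..., 'CSWin']  # append to the existing list
```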

3. Install [apex](https://github.com/NVIDIA/apex) for mixed-precision training:

```bash
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```

4. Follow the guide in [mmseg](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/dataset_prepare.md) to prepare the ADE20k dataset.
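After preparation, mmseg expects ADE20k under `data/ade` inside the repository root; per the mmseg dataset guide, the layout should look roughly like this:

```
<MMSEG_PATH>/data/ade/ADEChallengeData2016/
├── images/
│   ├── training/
│   └── validation/
└── annotations/
    ├── training/
    └── validation/
```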

## Fine-tuning

Command format:
```bash
tools/dist_train.sh <CONFIG_PATH> <NUM_GPUS> --options model.pretrained=<PRETRAIN_MODEL_PATH>
```

For example, using a CSWin-T backbone with UPerNet:
```bash
bash tools/dist_train.sh \
configs/cswin/upernet_cswin_tiny.py 8 \
--options model.pretrained=<PRETRAIN_MODEL_PATH>
```
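Under the hood, `dist_train.sh` is typically a thin wrapper around `torch.distributed.launch`; a rough sketch of the command it composes (assumed from mmseg's standard launcher script — check `tools/dist_train.sh` in your checkout):

```bash
# Sketch of the command dist_train.sh assembles; the exact script in your
# mmseg checkout is authoritative.
CONFIG=configs/cswin/upernet_cswin_tiny.py
GPUS=8
CMD="python -m torch.distributed.launch --nproc_per_node=$GPUS tools/train.py $CONFIG --launcher pytorch"
echo "$CMD"
```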

Pretrained backbone weights can be found on the [main page](https://github.com/microsoft/CSWin-Transformer).

More config files can be found at [`configs/cswin`](configs/cswin).


## Evaluation

Command format:
```bash
tools/dist_test.sh <CONFIG_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval mIoU
tools/dist_test.sh <CONFIG_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval mIoU --aug-test
```

For example, to evaluate a CSWin-T backbone with UPerNet at single scale:
```bash
bash tools/dist_test.sh configs/cswin/upernet_cswin_tiny.py \
<CHECKPOINT_PATH> 8 --eval mIoU
```
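The second command form adds `--aug-test` for multi-scale + flip testing, which produces the mIoU (ms+flip) numbers in the table above; with the same checkpoint:

```bash
bash tools/dist_test.sh configs/cswin/upernet_cswin_tiny.py \
    <CHECKPOINT_PATH> 8 --eval mIoU --aug-test
```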


---

## Acknowledgment

This code is built on the [mmsegmentation](https://github.com/open-mmlab/mmsegmentation) library, the [Timm](https://github.com/rwightman/pytorch-image-models) library, and the [Swin](https://github.com/microsoft/Swin-Transformer) repository.