Commit 9918cea: update segmentation code and models
LightDXY committed Oct 15, 2021 (parent d8be74a)
Showing 9 changed files with 1,160 additions and 4 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -3,7 +3,7 @@
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/cswin-transformer-a-general-vision/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=cswin-transformer-a-general-vision)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/cswin-transformer-a-general-vision/semantic-segmentation-on-ade20k-val)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k-val?p=cswin-transformer-a-general-vision)

This repo is the official implementation of ["CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"](https://arxiv.org/pdf/2107.00652.pdf). The code and models for downstream tasks are coming soon.
This repo is the official implementation of ["CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"](https://arxiv.org/pdf/2107.00652.pdf).

## Introduction

@@ -45,12 +45,13 @@ CSWin Transformer achieves strong performance on ImageNet classification (87.5 o
| CSwin-T | Semantic FPN | ImageNet-1K | 512x512 | 80K | 48.2 | - | 26M | 202G |
| CSwin-S | Semantic FPN | ImageNet-1K | 512x512 | 80K | 49.2 | - | 39M | 271G |
| CSwin-B | Semantic FPN | ImageNet-1K | 512x512 | 80K | 49.9 | - | 81M | 464G |
| CSwin-T | UPerNet | ImageNet-1K | 512x512 | 160K | 49.3 | 50.4 | 60M | 959G |
| CSwin-S | UperNet | ImageNet-1K | 512x512 | 160K | 50.0 | 50.8 | 65M | 1027G |
| CSwin-B | UperNet | ImageNet-1K | 512x512 | 160K | 50.8 | 51.7 | 109M | 1222G |
| CSwin-T | UPerNet | ImageNet-1K | 512x512 | 160K | 49.3 | 50.7 | 60M | 959G |
| CSwin-S | UperNet | ImageNet-1K | 512x512 | 160K | 50.4 | 51.5 | 65M | 1027G |
| CSwin-B | UperNet | ImageNet-1K | 512x512 | 160K | 51.1 | 52.2 | 109M | 1222G |
| CSwin-B | UPerNet | ImageNet-22K | 640x640 | 160K | 51.8 | 52.6 | 109M | 1941G |
| CSwin-L | UperNet | ImageNet-22K | 640x640 | 160K | 53.4 | 55.7 | 208M | 2745G |

Pretrained models and code can be found at [`segmentation`](segmentation).

## Requirements

79 changes: 79 additions & 0 deletions segmentation/README.md
@@ -0,0 +1,79 @@
# ADE20K Semantic Segmentation with CSWin


## Results and Models

| Backbone | Method | pretrain | Crop Size | Lr Schd | mIoU | mIoU (ms+flip) | #params | FLOPs | config | model | log |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CSWin-T | UPerNet | ImageNet-1K | 512x512 | 160K | 49.3 | 50.7 | 60M | 959G | [`config`](configs/cswin/upernet_cswin_tiny.py) | [model]() | [log]() |
| CSWin-S | UPerNet | ImageNet-1K | 512x512 | 160K | 50.4 | 51.5 | 65M | 1027G | [`config`](configs/cswin/upernet_cswin_small.py) |[model]() | [log]() |
| CSWin-B | UPerNet | ImageNet-1K | 512x512 | 160K | 51.1 | 52.2 | 109M | 1222G | [`config`](configs/cswin/upernet_cswin_base.py) |[model]() | [log]() |


## Getting started

1. Clone the [Swin-Transformer-Semantic-Segmentation](https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation) repository and install the required packages:

```bash
git clone https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation
bash install_req.sh
```

2. Copy the CSWin configs and backbone file into the corresponding mmseg folders:

```bash
cp -r configs/cswin <MMSEG_PATH>/configs/
cp configs/_base_/upernet_cswin.py <MMSEG_PATH>/configs/_base_/models/
cp backbone/cswin_transformer.py <MMSEG_PATH>/mmseg/models/backbones/
cp mmcv_custom/checkpoint.py <MMSEG_PATH>/mmcv_custom/
```
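Depending on the mmseg version, the copied backbone may also need to be registered before the configs can resolve it. A sketch of the edit to `<MMSEG_PATH>/mmseg/models/backbones/__init__.py` (the exported class name `CSWin` is an assumption; use whatever `backbone/cswin_transformer.py` actually exports):

```python
# Sketch: expose the copied backbone so mmseg's registry can find it.
# `CSWin` is assumed; check the class name in cswin_transformer.py.
from .cswin_transformer import CSWin

__all__ = [..., 'CSWin']  # append to the existing list
```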

3. Install [apex](https://github.com/NVIDIA/apex) for mixed-precision training:

```bash
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```

4. Follow the guide in [mmseg](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/dataset_prepare.md) to prepare the ADE20k dataset.
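After preparation, mmseg expects ADE20k under `data/ade` inside the repository root; per the mmseg dataset guide, the layout should look roughly like this:

```
<MMSEG_PATH>/data/ade/ADEChallengeData2016/
├── images/
│   ├── training/
│   └── validation/
└── annotations/
    ├── training/
    └── validation/
```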

## Fine-tuning

Command format:
```bash
tools/dist_train.sh <CONFIG_PATH> <NUM_GPUS> --options model.pretrained=<PRETRAIN_MODEL_PATH>
```

For example, using a CSWin-T backbone with UPerNet:
```bash
bash tools/dist_train.sh \
configs/cswin/upernet_cswin_tiny.py 8 \
--options model.pretrained=<PRETRAIN_MODEL_PATH>
```
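Under the hood, `dist_train.sh` is typically a thin wrapper around `torch.distributed.launch`; a rough sketch of the command it composes (assumed from mmseg's standard launcher script — check `tools/dist_train.sh` in your checkout):

```bash
# Sketch of the command dist_train.sh assembles; the exact script in your
# mmseg checkout is authoritative.
CONFIG=configs/cswin/upernet_cswin_tiny.py
GPUS=8
CMD="python -m torch.distributed.launch --nproc_per_node=$GPUS tools/train.py $CONFIG --launcher pytorch"
echo "$CMD"
```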

Pretrained backbone weights can be found on the [main page](https://github.com/microsoft/CSWin-Transformer).

More config files can be found at [`configs/cswin`](configs/cswin).


## Evaluation

Command format:
```bash
tools/dist_test.sh <CONFIG_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval mIoU
tools/dist_test.sh <CONFIG_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval mIoU --aug-test
```

For example, to evaluate a CSWin-T backbone with UPerNet at single scale:
```bash
bash tools/dist_test.sh configs/cswin/upernet_cswin_tiny.py \
<CHECKPOINT_PATH> 8 --eval mIoU
```
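The second command form adds `--aug-test` for multi-scale + flip testing, which produces the mIoU (ms+flip) numbers in the table above; with the same checkpoint:

```bash
bash tools/dist_test.sh configs/cswin/upernet_cswin_tiny.py \
    <CHECKPOINT_PATH> 8 --eval mIoU --aug-test
```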


---

## Acknowledgment

This code is built on the [mmsegmentation](https://github.com/open-mmlab/mmsegmentation) library, the [Timm](https://github.com/rwightman/pytorch-image-models) library, and the [Swin](https://github.com/microsoft/Swin-Transformer) repository.