This repository is an official implementation of the CVPR 2022 paper "DESTR: Object Detection with Split Transformer".
Split Cross-attention | Pipeline (insert miniDet) | Pair Attention |
---|---|---|
Contributions:
- Split estimation of cross attention into two independent branches: one tailored for classification and the other for box regression;
- Insert a mini-detector between encoder and decoder to initialize objects’ classification, regression and positional embeddings;
- Augment self-attention in the decoder with pair self-attention, computed for every two spatially adjacent queries, to improve inductive bias.
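
To make the first contribution concrete, here is a minimal, dependency-free sketch of split cross-attention. It is an illustration under stated assumptions, not the code in this repository: the names `attend` and `split_cross_attention`, the toy `memory`, and the two query vectors are all hypothetical, and the learned projections, multiple heads, and positional embeddings of a real decoder layer are omitted. It shows only the core idea that the classification branch and the box-regression branch each compute their own attention weights over the same encoder output.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, weights


def split_cross_attention(cls_query, reg_query, memory):
    """Run two independent cross-attention branches over the same memory.

    Assumption for illustration: each branch has its own query vector and
    uses the encoder memory directly as both keys and values.
    """
    cls_feat, cls_weights = attend(cls_query, memory, memory)
    reg_feat, reg_weights = attend(reg_query, memory, memory)
    return cls_feat, reg_feat, cls_weights, reg_weights


# Toy encoder memory: three locations with 2-d features (hypothetical values).
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cls_query = [4.0, 0.0]  # hypothetical classification-branch query
reg_query = [0.0, 4.0]  # hypothetical regression-branch query

cls_feat, reg_feat, cls_w, reg_w = split_cross_attention(
    cls_query, reg_query, memory
)
```

Because the two branches use different queries, they focus on different memory locations for the same object query. In this toy setup the classification branch puts the least weight on the second location, while the regression branch puts the least weight on the first.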
We provide conditional DETR and conditional DETR-DC5 models. AP is computed on COCO 2017 val.
Method | Epochs | Params (M) | AP | AP<sub>S</sub> | AP<sub>M</sub> | AP<sub>L</sub> | URL |
---|---|---|---|---|---|---|---|
DETR-R50 | 500 | 41 | 42.0 | 20.5 | 45.8 | 61.1 | model log |
DETR-R50 | 50 | 41 | 34.8 | 13.9 | 37.3 | 54.4 | model log |
Conditional DETR-R50 | 50 | 44 | 41.0 | 20.6 | 44.3 | 59.3 | model log |
DESTR-R50 | 50 | 69 | 43.6 | 23.5 | 47.6 | 62.4 | model log |
Note:
- The numbers in the table differ slightly from those in the paper because we re-ran some experiments when releasing the code.
- More weights will be released in the future.
For installation and usage, please see Conditional DETR.
DESTR is released under the Apache 2.0 license. Please see the LICENSE file for more information.
DESTR is built on Conditional DETR. We appreciate their contributions!
@inproceedings{he2022destr,
title={DESTR: Object Detection with Split Transformer},
author={He, Liqiang and Todorovic, Sinisa},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={9377--9386},
year={2022}
}