GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

Abstract

The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies by aggregating query-specific global context to each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions within an image. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further observe that this simplified design shares a similar structure with the Squeeze-Excitation Network (SENet). Hence we unify them into a three-step general framework for global context modeling. Within the general framework, we design a better instantiation, called the global context (GC) block, which is lightweight and can effectively model the global context. The lightweight property allows us to apply it to multiple layers in a backbone network to construct a global context network (GCNet), which generally outperforms both simplified NLNet and SENet on major benchmarks for various recognition tasks.
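
The abstract does not name the three steps. As a rough illustration, the NumPy sketch below assumes the formulation given in the GCNet paper: (1) context modeling by attention pooling with a single query-independent attention map, (2) a bottleneck transform with layer normalization and ReLU, and (3) fusion by broadcast addition. The function name `gc_block` and the weight names `w_k`, `w_v1` and `w_v2` are hypothetical, biases and the learnable LayerNorm scale and shift are omitted, and this is not the implementation used by the configs in this repository.

```python
import numpy as np


def gc_block(x, w_k, w_v1, w_v2, eps=1e-5):
    """Sketch of a global context block on one feature map.

    x:    (C, H, W) input features
    w_k:  (C,)        weights of the 1x1 conv producing attention logits
    w_v1: (C // r, C) first 1x1 conv of the bottleneck (r is the ratio)
    w_v2: (C, C // r) second 1x1 conv of the bottleneck
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                # (C, N) with N = H * W

    # Step 1, context modeling: one softmax attention map shared by all
    # query positions, used to pool the features into a single vector.
    logits = w_k @ flat                       # (N,)
    attn = np.exp(logits - logits.max())      # subtract max for stability
    attn = attn / attn.sum()
    context = flat @ attn                     # (C,)

    # Step 2, transform: bottleneck with LayerNorm (no affine) and ReLU.
    hidden = w_v1 @ context                   # (C // r,)
    hidden = (hidden - hidden.mean()) / np.sqrt(hidden.var() + eps)
    hidden = np.maximum(hidden, 0.0)
    delta = w_v2 @ hidden                     # (C,)

    # Step 3, fusion: add the same context vector to every position.
    return x + delta[:, None, None]


# Example call with random weights (illustrative sizes only).
rng = np.random.default_rng(0)
C, H, W, r = 16, 4, 5, 4
x = rng.standard_normal((C, H, W))
out = gc_block(
    x,
    rng.standard_normal(C),
    rng.standard_normal((C // r, C)),
    rng.standard_normal((C, C // r)),
)
print(out.shape)
```

Because the attention map does not depend on the query position, the pooled context is computed once per image and broadcast, rather than once per position as in the original non-local block.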

Introduction

By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.

We provide config files to reproduce the results in the paper for "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" on COCO object detection.

GCNet was initially described in an arXiv preprint (arXiv:1904.11492). By absorbing the advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach for global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.

Citation

@article{cao2019GCNet,
  title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
  author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
  journal={arXiv preprint arXiv:1904.11492},
  year={2019}
}

Results and models

The results on COCO 2017val are shown in the tables below.

| Backbone | Model | Context | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-FPN | Mask | GC(c3-c5, r16) | 1x | 5.0 | | 39.7 | 35.9 | config | model / log |
| R-50-FPN | Mask | GC(c3-c5, r4) | 1x | 5.1 | 15.0 | 39.9 | 36.0 | config | model / log |
| R-101-FPN | Mask | GC(c3-c5, r16) | 1x | 7.6 | 11.4 | 41.3 | 37.2 | config | model / log |
| R-101-FPN | Mask | GC(c3-c5, r4) | 1x | 7.8 | 11.6 | 42.2 | 37.8 | config | model / log |

| Backbone | Model | Context | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-FPN | Mask | - | 1x | 4.4 | 16.6 | 38.4 | 34.6 | config | model / log |
| R-50-FPN | Mask | GC(c3-c5, r16) | 1x | 5.0 | 15.5 | 40.4 | 36.2 | config | model / log |
| R-50-FPN | Mask | GC(c3-c5, r4) | 1x | 5.1 | 15.1 | 40.7 | 36.5 | config | model / log |
| R-101-FPN | Mask | - | 1x | 6.4 | 13.3 | 40.5 | 36.3 | config | model / log |
| R-101-FPN | Mask | GC(c3-c5, r16) | 1x | 7.6 | 12.0 | 42.2 | 37.8 | config | model / log |
| R-101-FPN | Mask | GC(c3-c5, r4) | 1x | 7.8 | 11.8 | 42.2 | 37.8 | config | model / log |
| X-101-FPN | Mask | - | 1x | 7.6 | 11.3 | 42.4 | 37.7 | config | model / log |
| X-101-FPN | Mask | GC(c3-c5, r16) | 1x | 8.8 | 9.8 | 43.5 | 38.6 | config | model / log |
| X-101-FPN | Mask | GC(c3-c5, r4) | 1x | 9.0 | 9.7 | 43.9 | 39.0 | config | model / log |
| X-101-FPN | Cascade Mask | - | 1x | 9.2 | 8.4 | 44.7 | 38.6 | config | model / log |
| X-101-FPN | Cascade Mask | GC(c3-c5, r16) | 1x | 10.3 | 7.7 | 46.2 | 39.7 | config | model / log |
| X-101-FPN | Cascade Mask | GC(c3-c5, r4) | 1x | 10.6 | | 46.4 | 40.1 | config | model / log |
| X-101-FPN | DCN Cascade Mask | - | 1x | | | 47.5 | 40.9 | config | model / log |
| X-101-FPN | DCN Cascade Mask | GC(c3-c5, r16) | 1x | | | 48.0 | 41.3 | config | model / log |
| X-101-FPN | DCN Cascade Mask | GC(c3-c5, r4) | 1x | | | 47.9 | 41.1 | config | model / log |

Notes:

  • SyncBN is added in the backbone for all models in Table 2 (the second table above).
  • GC denotes that a Global Context (GC) block is inserted after the 1x1 conv of the backbone.
  • DCN denotes replacing the 3x3 conv with a 3x3 Deformable Convolution in the c3-c5 stages of the backbone.
  • r4 and r16 denote ratios of 4 and 16 in the GC block, respectively.
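
To give a feel for what the ratio controls, the short Python sketch below counts the weights in the GC block's transform. It assumes the bottleneck described in the GCNet paper, where two 1x1 convs map C channels to C / r and back. The helper name `gc_transform_params` is hypothetical, biases and normalization parameters are ignored, and 2048 is used as the channel count because that is the output width of the c5 stage in a standard ResNet-50.

```python
def gc_transform_params(channels: int, ratio: int) -> int:
    """Weights in the two 1x1 convs of the bottleneck transform.

    Assumes a hidden width of channels // ratio; biases and
    normalization parameters are not counted.
    """
    hidden = channels // ratio
    return channels * hidden + hidden * channels


channels = 2048  # c5 output width of a standard ResNet-50 (assumption)
for ratio in (4, 16):
    print(f"r{ratio}: {gc_transform_params(channels, ratio):,} weights")
# r4: 2,097,152 weights
# r16: 524,288 weights
```

Under these assumptions, r16 uses a quarter of the transform weights of r4, which is in line with the slightly lower memory reported for the r16 rows in the tables.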