Attentive Contexts for Object Detection

doi:10.1109/TMM.2016.2642789


Jianan Li¹, Yunchao Wei², Xiaodan Liang³, Jian Dong⁴, Tingfa Xu¹, Jiashi Feng⁴, Shuicheng Yan¹

Beijing Institute of Technology¹, Beijing Jiaotong University², Sun Yat-sen University³, National University of Singapore⁴

01 May 2017 - IEEE Transactions on Multimedia (IEEE) - Vol. 19, Iss. 5, pp. 944-954

TL;DR: Li et al. propose an attention to context convolution neural network (AC-CNN) for object detection, which consists of one attention-based global contextualized (AGC) subnetwork and one multi-scale local contextualized (MLC) subnetwork.

Abstract: Modern deep neural network-based object detection methods typically classify candidate proposals using their interior features. However, global and local surrounding contexts, which are believed to be valuable for object detection, are not yet fully exploited by existing methods. In this work, we take a step towards understanding what constitutes a robust practice for extracting and utilizing contextual information to facilitate object detection. Specifically, we consider the following two questions: “how to identify useful global contextual information for detecting a certain object?” and “how to exploit local context surrounding a proposal for better inferring its contents?” We provide preliminary answers to these questions by developing a novel attention to context convolution neural network (AC-CNN)-based object detection model. AC-CNN effectively incorporates global and local contextual information into the region-based CNN (e.g., Fast R-CNN and Faster R-CNN) detection framework and provides better object detection performance. It consists of one attention-based global contextualized (AGC) subnetwork and one multi-scale local contextualized (MLC) subnetwork. To capture global context, the AGC subnetwork recurrently generates an attention map for an input image through multiple stacked long short-term memory layers, highlighting useful global contextual locations. To capture surrounding local context, the MLC subnetwork exploits both the inside and outside contextual information of each specific proposal at multiple scales. The global and local contexts are then fused to make the final detection decision. Extensive experiments on PASCAL VOC 2007 and VOC 2012 demonstrate the superiority of the proposed AC-CNN over well-established baselines.
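The two kinds of context named in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation. The function names (`scale_box`, `multi_scale_boxes`, `attention_pool`), the scale factors, and the use of a single softmax in place of the stacked LSTM layers are all assumptions made for illustration. The first part rescales a proposal about its centre to obtain inside and outside context regions, and the second part pools a global feature from per-location attention scores.

```python
import math


def scale_box(box, scale, img_w, img_h):
    """Scale a box (x1, y1, x2, y2) about its centre and clip it to the image."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(img_w), cx + half_w),
        min(float(img_h), cy + half_h),
    )


def multi_scale_boxes(box, scales, img_w, img_h):
    """Local context: scale < 1 looks inside the proposal, scale > 1 around it."""
    return [scale_box(box, s, img_w, img_h) for s in scales]


def attention_pool(features, scores):
    """Global context: softmax the per-location scores, then take the
    attention-weighted sum of the per-location feature vectors.

    Assumption: the scores are given. In the paper they come from stacked
    LSTM layers, which this sketch does not model.
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled


# Hypothetical proposal in a 100x100 image, with illustrative scale factors.
regions = multi_scale_boxes((40, 40, 80, 80), (0.5, 1.0, 2.0), 100, 100)

# Two locations with equal scores receive equal attention weights.
weights, pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In a full detector, the features pooled from each rescaled region and the attention-pooled global feature would be combined before classification. The abstract states only that the two contexts are fused, so the fusion step is left out here.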
