Spatial Memory for Context Reasoning in Object Detection

doi:10.1109/ICCV.2017.440

Proceedings Article, Open Access

Xinlei Chen, +1 more

- pp 4106-4116

TLDR

The Spatial Memory Network (SMN) proposed in this paper assembles object instances back into a pseudo-image representation that is easy to feed into another ConvNet for object-object context reasoning.

Abstract:

Modeling instance-level context and object-object relationships is extremely challenging. It requires reasoning about bounding boxes of different classes, locations, etc. Above all, instance-level spatial reasoning inherently requires modeling conditional distributions on previous detections. Unfortunately, our current object detection systems do not have any memory to remember what to condition on! The state-of-the-art object detectors still detect all objects in parallel, followed by non-maximal suppression (NMS). While memory has been used for tasks such as captioning, these approaches mostly use image-level memory cells without capturing the spatial layout. On the other hand, modeling object-object relationships requires spatial reasoning: not only do we need a memory to store the spatial layout, but also an effective reasoning module to extract spatial patterns. This paper presents a conceptually simple yet powerful solution, the Spatial Memory Network (SMN), to model the instance-level context efficiently and effectively. Our spatial memory essentially assembles object instances back into a pseudo "image" representation that is easy to feed into another ConvNet for object-object context reasoning. This leads to a new sequential reasoning architecture where image and memory are processed in parallel to obtain detections, which in turn update the memory. We show our SMN direction is promising, as it provides a 2.2% improvement over the baseline Faster RCNN on the COCO dataset with VGG16.
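To make the idea of assembling instances back into a pseudo-image more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names, the `stride` of 16 pixels per memory cell, the per-instance feature vectors, and the element-wise max update are all hypothetical choices, since the abstract does not specify how the memory is written. The step where a detector reads both image and memory to produce the next detection is omitted.

```python
import numpy as np


def write_to_memory(memory, box, feature, stride=16):
    """Write one detection into the spatial memory (hypothetical update rule).

    memory:  (H, W, D) array, a coarse "pseudo-image" aligned with the input image
    box:     (x1, y1, x2, y2) in image pixel coordinates
    feature: (D,) vector describing the detected instance (assumed encoding)
    stride:  image pixels per memory cell (assumed value)
    """
    h, w, _ = memory.shape
    x1, y1, x2, y2 = box
    # Map the box from image coordinates to memory cells, clipped to the grid.
    c1 = min(max(int(np.floor(x1 / stride)), 0), w - 1)
    r1 = min(max(int(np.floor(y1 / stride)), 0), h - 1)
    c2 = min(max(int(np.ceil(x2 / stride)), c1 + 1), w)
    r2 = min(max(int(np.ceil(y2 / stride)), r1 + 1), h)
    # Assumption: element-wise max keeps earlier detections visible where boxes overlap.
    memory[r1:r2, c1:c2] = np.maximum(memory[r1:r2, c1:c2], feature)
    return memory


def assemble_memory(detections, image_size=(64, 64), dim=4, stride=16):
    """Fold detections into the memory one at a time.

    In a full sequential detector, each new detection would be predicted from
    both the image and the current memory; here the detections are simply given.
    """
    h, w = image_size[0] // stride, image_size[1] // stride
    memory = np.zeros((h, w, dim), dtype=np.float32)
    for box, feature in detections:
        memory = write_to_memory(memory, box, np.asarray(feature, dtype=np.float32), stride)
    return memory


# Two overlapping instances with toy one-hot "class" features.
detections = [
    ((0, 0, 32, 32), [1, 0, 0, 0]),
    ((16, 16, 64, 64), [0, 1, 0, 0]),
]
memory = assemble_memory(detections)
print(memory.shape)  # (4, 4, 4)
```

Because the result is an ordinary H x W x D array that preserves the spatial layout of the detections, it can be passed to a convolutional network like any other feature map, which is the property the abstract highlights.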

Citations

Journal Article

Object Detection Using Deep Learning, CNNs and Vision Transformers: A Review

Ayoub Benali Amjoud, +1 more

IEEE Access

TL;DR: This paper reviews state-of-the-art object detection algorithms and the concepts behind them, classifying them into three main groups: anchor-based, anchor-free, and transformer-based detectors.

Journal Article

Object detection based on knowledge graph network

Jianping Li, +4 more

- 08 Nov 2022 -

Applied Intelligence

Book Chapter

Boosting the Performance of Object Detection CNNs with Context-Based Anomaly Detection

Jan Blaha, +2 more

TL;DR: In this article, an autoencoder network is used to detect anomalous objects in images, i.e. objects that do not fit with the rest of the current observations in the scene.

Book Chapter

Video Activity Recognition Based on Objects Detection Using Recurrent Neural Networks

Mounir Boudmagh, +2 more

TL;DR: In this paper, an end-to-end multitask model that jointly learns object-action relationships was proposed to detect visual relationships between objects to recognize activities in videos of the MSR Daily Activity Dataset.

Posted Content

Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN

Hang Xu, +4 more

- 18 Feb 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: The authors propose a universal object detector called Universal-RCNN that incorporates graph transfer learning to propagate relevant semantic information across multiple datasets and reach semantic coherency.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.

Journal Article

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Book Chapter

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: A new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding by gathering images of complex everyday scenes containing common objects in their natural context.
