Open Access Proceedings Article

Spatial Memory for Context Reasoning in Object Detection

TLDR
The Spatial Memory Network (SMN) proposed in this paper assembles object instances back into a pseudo-image representation that can easily be fed into another ConvNet for object-object context reasoning.
Abstract
Modeling instance-level context and object-object relationships is extremely challenging. It requires reasoning about bounding boxes of different classes, locations, etc. Above all, instance-level spatial reasoning inherently requires modeling conditional distributions on previous detections. Unfortunately, our current object detection systems do not have any memory to remember what to condition on! State-of-the-art object detectors still detect all objects in parallel, followed by non-maximal suppression (NMS). While memory has been used for tasks such as captioning, those approaches mostly use image-level memory cells that do not capture the spatial layout. Modeling object-object relationships, on the other hand, requires spatial reasoning: not only do we need a memory to store the spatial layout, but also an effective reasoning module to extract spatial patterns. This paper presents a conceptually simple yet powerful solution, the Spatial Memory Network (SMN), to model instance-level context efficiently and effectively. Our spatial memory essentially assembles object instances back into a pseudo "image" representation that can easily be fed into another ConvNet for object-object context reasoning. This leads to a new sequential reasoning architecture in which image and memory are processed in parallel to obtain detections, which in turn update the memory. We show that the SMN direction is promising, as it provides a 2.2% improvement over the baseline Faster RCNN on the COCO dataset with VGG16.
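The idea of assembling detections back into a pseudo-image can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify how the memory is written, so the element-wise max update, the 8x8 grid size, and the names `write_to_memory` and `build_memory` are all assumptions made for illustration. The sketch covers only the memory-write step; in the architecture described above, a ConvNet would read the memory together with the image before each new detection.

```python
import numpy as np


def write_to_memory(memory, box, class_scores, image_size):
    """Write one detection into the spatial memory (hypothetical update rule).

    memory:       (H, W, C) array acting as a pseudo-image with C class channels
    box:          (x1, y1, x2, y2) in image pixel coordinates
    class_scores: length-C vector of class scores for this detection
    image_size:   (image_height, image_width) in pixels
    """
    mem_h, mem_w, _ = memory.shape
    img_h, img_w = image_size
    x1, y1, x2, y2 = box

    # Project the box from image coordinates onto the coarser memory grid.
    c1 = int(np.floor(x1 / img_w * mem_w))
    r1 = int(np.floor(y1 / img_h * mem_h))
    c2 = int(np.ceil(x2 / img_w * mem_w))
    r2 = int(np.ceil(y2 / img_h * mem_h))

    # Clamp to the grid and make sure the box covers at least one cell.
    c1 = min(max(c1, 0), mem_w - 1)
    r1 = min(max(r1, 0), mem_h - 1)
    c2 = min(max(c2, c1 + 1), mem_w)
    r2 = min(max(r2, r1 + 1), mem_h)

    scores = np.asarray(class_scores, dtype=memory.dtype)
    updated = memory.copy()
    # Assumed rule: keep the strongest score seen so far in each covered cell.
    updated[r1:r2, c1:c2, :] = np.maximum(updated[r1:r2, c1:c2, :], scores)
    return updated


def build_memory(detections, num_classes, image_size, memory_size=(8, 8)):
    """Write detections into the memory one at a time, in the given order."""
    memory = np.zeros((*memory_size, num_classes), dtype=np.float32)
    for box, scores in detections:
        # A full model would let a ConvNet read `memory` here, alongside the
        # image features, to condition the next detection on earlier ones.
        memory = write_to_memory(memory, box, scores, image_size)
    return memory


detections = [
    ((0, 0, 32, 32), [0.9, 0.1]),    # top-left quadrant, mostly class 0
    ((32, 32, 64, 64), [0.2, 0.8]),  # bottom-right quadrant, mostly class 1
]
memory = build_memory(detections, num_classes=2, image_size=(64, 64))
```

Because the result keeps a height x width x channels layout, it can be passed to ordinary convolution layers like any feature map, which is the property the abstract emphasizes.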

