Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Open Access, Posted Content

- 22 Jul 2014 -

arXiv: Computer Vision and Pattern Recog...

TLDR

This paper proposes a geocentric embedding for depth images that encodes, for each pixel, height above ground and angle with gravity in addition to horizontal disparity, with the aim of facilitating the use of perception in fields like robotics.

Abstract:

In this paper we study the problem of object detection for RGB-D images using semantically rich image and depth features. We propose a new geocentric embedding for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity. We demonstrate that this geocentric embedding works better than using raw depth images for learning feature representations with convolutional neural networks. Our final object detection system achieves an average precision of 37.3%, which is a 56% relative improvement over existing methods. We then focus on the task of instance segmentation where we label pixels belonging to object instances found by our detector. For this task, we propose a decision forest approach that classifies pixels in the detection window as foreground or background using a family of unary and binary tests that query shape and geocentric pose features. Finally, we use the output from our object detectors in an existing superpixel classification framework for semantic scene segmentation and achieve a 24% relative improvement over current state-of-the-art for the object categories that we study. We believe advances such as those represented in this paper will facilitate the use of perception in fields like robotics.
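
The three geocentric channels named in the abstract (horizontal disparity, height above ground, angle with gravity) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a pinhole camera with known intrinsics, a gravity direction that is already given in camera coordinates, the lowest observed point as a stand-in for the ground, and the per-pixel angle taken between an estimated surface normal and gravity. The function name `geocentric_encoding`, its parameters, and the default stereo baseline are hypothetical.

```python
import numpy as np

def geocentric_encoding(depth_m, fx, fy, cx, cy, gravity, baseline_m=0.075):
    """Return an (H, W, 3) array: disparity, height, angle with gravity (degrees).

    Assumes every depth value is valid and positive (no missing pixels).
    baseline_m is an assumed sensor baseline, not a value from the paper.
    """
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Back-project each pixel to a 3D point in camera coordinates (pinhole model).
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    points = np.stack([x, y, depth_m], axis=-1)

    # Unit vector pointing "down"; assumed known here rather than estimated.
    g = gravity / np.linalg.norm(gravity)

    # Channel 1: horizontal disparity, inversely proportional to depth.
    disparity = fx * baseline_m / depth_m

    # Channel 2: height along the "up" direction, measured from the lowest
    # observed point (a simple surrogate for the ground plane).
    up_coord = -(points @ g)
    height = up_coord - up_coord.min()

    # Channel 3: angle between the local surface normal and gravity.
    # Normals come from the cross product of the point map's image gradients.
    dp_du = np.gradient(points, axis=1)
    dp_dv = np.gradient(points, axis=0)
    normals = np.cross(dp_du, dp_dv)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True) + 1e-12
    cos_angle = np.clip(normals @ g, -1.0, 1.0)
    angle = np.degrees(np.arccos(cos_angle))

    return np.stack([disparity, height, angle], axis=-1)


# Usage: a fronto-parallel wall 2 m away, camera y-axis pointing down.
depth = np.full((4, 5), 2.0)
channels = geocentric_encoding(
    depth, fx=500.0, fy=500.0, cx=2.0, cy=1.5,
    gravity=np.array([0.0, 1.0, 0.0]),
)
```

For this synthetic wall the surface normal is perpendicular to gravity, so the angle channel is 90 degrees everywhere, and the height channel grows from the bottom image row to the top. Real depth maps contain holes and noise, so a practical version would need to mask invalid pixels and smooth the normals.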

Citations

Posted Content

Fully Convolutional Networks for Semantic Segmentation

Jonathan Long, +2 more

- 14 Nov 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: It is shown that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation.

Proceedings ArticleDOI

Adversarial Discriminative Domain Adaptation

Eric Tzeng, +3 more

TL;DR: Adversarial Discriminative Domain Adaptation (ADDA) combines discriminative modeling, untied weight sharing, and a generative adversarial network (GAN) loss.

Proceedings ArticleDOI

3D ShapeNets: A deep representation for volumetric shapes

Zhirong Wu, +6 more

TL;DR: This work proposes to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network, and shows that this 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.

Journal ArticleDOI

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, +8 more

- 11 Feb 2016 -

IEEE Transactions on Medical Imaging

TL;DR: Two specific computer-aided detection problems are studied: thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. The work achieves state-of-the-art performance on mediastinal LN detection and reports the first five-fold cross-validation classification results.

Journal ArticleDOI

Object Detection With Deep Learning: A Review

Zhong-Qiu Zhao, +3 more

- 28 Jan 2019 -

IEEE Transactions on Neural Networks

TL;DR: In this article, a review of deep learning-based object detection frameworks is provided, focusing on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.

References

Journal ArticleDOI

Random Forests

Leo Breiman

TL;DR: Internal estimates monitor error, strength, and correlation; they are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network achieves state-of-the-art ImageNet classification performance. The network consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Proceedings ArticleDOI

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects. When labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

Posted Content

Rich feature hierarchies for accurate object detection and semantic segmentation

Ross Girshick, +3 more

- 11 Nov 2013 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.