Data-Driven 3D Primitives for Single Image Understanding

doi:10.1109/ICCV.2013.421

Proceedings Article

- pp 3392-3399

TLDR

This work argues that the primitives used to infer 3D structure from an image should be both visually discriminative and geometrically informative, presents a technique for discovering such primitives, and demonstrates their utility by using them to infer 3D surface normals from a single image.

Abstract:

What primitives should we use to infer the rich 3D world behind an image? We argue that these primitives should be both visually discriminative and geometrically informative and we present a technique for discovering such primitives. We demonstrate the utility of our primitives by using them to infer 3D surface normals given a single image. Our technique substantially outperforms the state-of-the-art and shows improved cross-dataset performance.

Citations

Proceedings Article

Depth Map Prediction from a Single Image using a Multi-Scale Deep Network

David Eigen, +2 more

TL;DR: This paper employs two deep network stacks, one that makes a coarse global prediction based on the entire image and another that refines this prediction locally, and achieves state-of-the-art results on both NYU Depth and KITTI.

Proceedings Article

ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes

Angela Dai, +5 more

TL;DR: This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.

Proceedings Article

Unsupervised Visual Representation Learning by Context Prediction

Carl Doersch, +2 more

TL;DR: In this paper, the spatial context is used as a source of free and plentiful supervisory signal for training a rich visual representation, and the feature representation learned using this within-image context captures visual similarity across images.

Proceedings Article

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture

David Eigen, +1 more

TL;DR: This paper addresses three different computer vision tasks (depth prediction, surface normal estimation, and semantic labeling) using a single multiscale convolutional network architecture that adapts easily to each task with only small modifications.

Proceedings Article

SUN RGB-D: A RGB-D scene understanding benchmark suite

Shuran Song, +2 more

TL;DR: This paper introduces an RGB-D benchmark suite aimed at advancing the state of the art in all major scene understanding tasks, and presents a dataset that makes it possible to train data-hungry algorithms for scene understanding tasks, evaluate them using meaningful 3D metrics, avoid overfitting to a small testing set, and study cross-sensor bias.

References

Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Journal Article

Object Detection with Discriminatively Trained Part-Based Models

Pedro F. Felzenszwalb, +3 more

- 01 Sep 2010 -

IEEE Transactions on Pattern Analysis an...

TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.

Journal Article

SLIC Superpixels Compared to State-of-the-Art Superpixel Methods

Radhakrishna Achanta, +5 more

- 01 Nov 2012 -

IEEE Transactions on Pattern Analysis an...

TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.

Journal Article

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms

Daniel Scharstein, +2 more

- 09 Dec 2001 -

International Journal of Computer Vision

TL;DR: The authors present a stand-alone, flexible C++ implementation that enables the evaluation of individual algorithm components and can easily be extended to include new algorithms.

Related Papers (5)

Indoor segmentation and support inference from RGBD images

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture

ImageNet Classification with Deep Convolutional Neural Networks

Make3D: Learning 3D Scene Structure from a Single Still Image

Depth Map Prediction from a Single Image using a Multi-Scale Deep Network