Proceedings Article

Data-Driven 3D Primitives for Single Image Understanding

TLDR
This work argues that the primitives used to infer the 3D world behind an image should be both visually discriminative and geometrically informative, presents a technique for discovering such primitives, and demonstrates their utility by using them to infer 3D surface normals from a single image.
Abstract
What primitives should we use to infer the rich 3D world behind an image? We argue that these primitives should be both visually discriminative and geometrically informative and we present a technique for discovering such primitives. We demonstrate the utility of our primitives by using them to infer 3D surface normals given a single image. Our technique substantially outperforms the state-of-the-art and shows improved cross-dataset performance.


Citations
Proceedings Article

Depth Map Prediction from a Single Image using a Multi-Scale Deep Network

TL;DR: In this article, two deep network stacks are employed: one to make a coarse global prediction based on the entire image, and another to refine this prediction locally. The method achieves state-of-the-art results on both NYU Depth and KITTI.
Proceedings Article

ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes

TL;DR: This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.
Proceedings Article

Unsupervised Visual Representation Learning by Context Prediction

TL;DR: In this paper, the spatial context is used as a source of free and plentiful supervisory signal for training a rich visual representation, and the feature representation learned using this within-image context captures visual similarity across images.
Proceedings Article

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture

TL;DR: This paper addresses three different computer vision tasks using a single basic architecture: depth prediction, surface normal estimation, and semantic labeling using a multiscale convolutional network that is able to adapt easily to each task using only small modifications.
Proceedings Article

SUN RGB-D: A RGB-D scene understanding benchmark suite

TL;DR: This paper introduces an RGB-D benchmark suite with the goal of advancing the state of the art in all major scene understanding tasks, and presents a dataset that makes it possible to train data-hungry algorithms for scene-understanding tasks, evaluate them using meaningful 3D metrics, avoid overfitting to a small testing set, and study cross-sensor bias.
References
Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Proceedings Article

Histograms of oriented gradients for human detection

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Journal Article

Object Detection with Discriminatively Trained Part-Based Models

TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.
Journal Article

SLIC Superpixels Compared to State-of-the-Art Superpixel Methods

TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Journal Article

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms

TL;DR: This paper describes a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.