scispace - formally typeset
Search or ask a question

Showing papers by "Andrew Rabinovich published in 2016"


Posted Content
TL;DR: Two convolutional neural network architectures are presented for HomographyNet: a regression network which directly estimates the real-valued homography parameters, and a classification network which produces a distribution over quantized homographies.
Abstract: We present a deep convolutional neural network for estimating the relative homography between a pair of images. Our feed-forward network has 10 layers, takes two stacked grayscale images as input, and produces an 8 degree of freedom homography which can be used to map the pixels from the first image to the second. We present two convolutional neural network architectures for HomographyNet: a regression network which directly estimates the real-valued homography parameters, and a classification network which produces a distribution over quantized homographies. We use a 4-point homography parameterization which maps the four corners from one image into the second image. Our networks are trained in an end-to-end fashion using warped MS-COCO images. Our approach works without the need for separate local feature detection and transformation estimation stages. Our deep models are compared to a traditional homography estimator based on ORB features and we highlight the scenarios where HomographyNet outperforms the traditional technique. We also describe a variety of applications powered by deep homography estimation, thus showcasing the flexibility of a deep learning approach.

322 citations


Posted Content
TL;DR: This work proposes an end-to-end deep learning system to detect cuboids across many semantic categories, and localizes all 3D cuboids (box-like objects) with a 2D bounding box.
Abstract: We present a Deep Cuboid Detector which takes a consumer-quality RGB image of a cluttered scene and localizes all 3D cuboids (box-like objects). Contrary to classical approaches which fit a 3D model from low-level cues like corners, edges, and vanishing points, we propose an end-to-end deep learning system to detect cuboids across many semantic categories (e.g., ovens, shipping boxes, and furniture). We localize cuboids with a 2D bounding box, and simultaneously localize the cuboid's corners, effectively producing a 3D interpretation of box-like objects. We refine keypoints by pooling convolutional features iteratively, improving the baseline method significantly. Our deep learning cuboid detector is trained in an end-to-end fashion and is suitable for real-time applications in augmented reality (AR) and robotics.

29 citations


Patent
05 Dec 2016
TL;DR: In this article, a method of determining the pose of an image capture device using a data structure corresponding to the captured image was proposed. And the method was further compared with a plurality of known data structures to identify a most similar known data structure.
Abstract: A method of determining a pose of an image capture device includes capturing an image using an image capture device. The method also includes generating a data structure corresponding to the captured image. The method further includes comparing the data structure with a plurality of known data structures to identify a most similar known data structure. Moreover, the method includes reading metadata corresponding to the most similar known data structure to determine a pose of the image capture device.

12 citations