
Showing papers by "Gary Bradski published in 2011"


Proceedings ArticleDOI
06 Nov 2011
TL;DR: This paper proposes a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise, and demonstrates through experiments that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations.
Abstract: Feature matching is at the base of many computer vision problems, such as object recognition or structure from motion. Current methods rely on costly descriptors for detection and matching. In this paper, we propose a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise. We demonstrate through experiments that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations. The efficiency is tested on several real-world applications, including object detection and patch-tracking on a smart phone.
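The core of ORB's speed is that binary descriptors are compared with the Hamming distance (a bitwise XOR and popcount) rather than the Euclidean distance used for SIFT's floating-point vectors. A minimal illustrative sketch of that matching step, using toy 8-bit integers rather than real ORB output (in practice one would use OpenCV's `cv2.ORB_create` and a Hamming-distance `BFMatcher`):

```python
# Sketch only: brute-force matching of binary descriptors (as ORB/BRIEF
# produce) under Hamming distance. The toy descriptors below are made up;
# real ORB descriptors are 256-bit strings computed around keypoints.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def match(query: list, train: list) -> list:
    """Pair each query descriptor with its nearest train descriptor."""
    matches = []
    for qi, q in enumerate(query):
        best = min(range(len(train)), key=lambda ti: hamming(q, train[ti]))
        matches.append((qi, best))
    return matches

query = [0b10110010, 0b01001101]
train = [0b10110011, 0b11111111, 0b01001100]
print(match(query, train))  # -> [(0, 0), (1, 2)]
```

Because the distance is a single XOR plus a popcount, it maps to one or two machine instructions per descriptor word, which is where the two-orders-of-magnitude speedup over SIFT matching comes from.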

8,702 citations


Proceedings ArticleDOI
01 Nov 2011
TL;DR: The Clustered Viewpoint Feature Histogram (CVFH) is described and it is shown that it can be effectively used to recognize objects and 6DOF pose in real environments, dealing with partial occlusion, noise and different sensor attributes for training and recognition data.
Abstract: This paper focuses on developing a fast and accurate 3D feature for use in object recognition and pose estimation for rigid objects. More specifically, given a set of CAD models of different objects representing our knowledge of the world - obtained using high-precision scanners that deliver accurate and noiseless data - our goal is to identify and estimate their pose in a real scene obtained by a depth sensor like the Microsoft Kinect. Borrowing ideas from the Viewpoint Feature Histogram (VFH) due to its computational efficiency and recognition performance, we describe the Clustered Viewpoint Feature Histogram (CVFH) and the camera's roll histogram together with our recognition framework to show that it can be effectively used to recognize objects and 6DOF pose in real environments, dealing with partial occlusion, noise and different sensor attributes for training and recognition data. We show that CVFH outperforms VFH and present recognition results using the Microsoft Kinect Sensor on an object set of 44 objects.
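The viewpoint component shared by VFH and CVFH can be illustrated in a few lines: bin the angle between each surface normal and the direction from the point to the viewpoint. The bin count, toy points, and normals below are illustrative choices, not the descriptor's real layout; the actual feature is implemented in PCL (e.g. `pcl::CVFHEstimation`):

```python
# Simplified sketch of the viewpoint-direction histogram idea behind VFH/CVFH.
# Real VFH/CVFH also histogram pairwise normal angles and (for CVFH) operate
# on smooth-region clusters of the point cloud, which this sketch omits.

import math

def viewpoint_histogram(points, normals, viewpoint, bins=4):
    """Normalized histogram of angles between normals and the
    point-to-viewpoint direction, over [0, pi]."""
    hist = [0.0] * bins
    for p, n in zip(points, normals):
        v = [viewpoint[i] - p[i] for i in range(3)]   # point -> viewpoint
        vn = math.sqrt(sum(c * c for c in v)) or 1.0
        cosang = sum((v[i] / vn) * n[i] for i in range(3))
        cosang = max(-1.0, min(1.0, cosang))          # guard acos domain
        angle = math.acos(cosang)
        b = min(int(angle / math.pi * bins), bins - 1)
        hist[b] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

Because the histogram describes the whole view of the object at once, a single feature per cluster suffices, which is what makes the VFH family fast enough for real-time recognition.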

303 citations


Proceedings ArticleDOI
27 Jun 2011
TL;DR: A new method to model the spatial distribution of oriented local features on an object is presented, which is used to infer object pose given small sets of observed local features.
Abstract: The success of personal service robotics hinges upon reliable manipulation of everyday household objects, such as dishes, bottles, containers, and furniture. In order to accurately manipulate such objects, robots need to know objects’ full 6-DOF pose, which is made difficult by clutter and occlusions. Many household objects have regular structure that can be used to effectively guess object pose given an observation of just a small patch on the object. In this paper, we present a new method to model the spatial distribution of oriented local features on an object, which we use to infer object pose given small sets of observed local features. The orientation distribution for local features is given by a mixture of Binghams on the hypersphere of unit quaternions, while the local feature distribution for position given orientation is given by a locally-weighted (Quaternion kernel) likelihood. Experiments on 3D point cloud data of cluttered and uncluttered scenes generated from a structured light stereo image sensor validate our approach.
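A key ingredient of such orientation models is a similarity weight between unit quaternions that respects the double cover (q and -q encode the same rotation). The kernel below, exp(-(1 - |q1·q2|)/h), is a common choice sketched here for illustration; the bandwidth h and helper names are assumptions, and the paper's actual model (a mixture of Bingham distributions) is not reproduced here:

```python
# Hypothetical sketch of a locally-weighted quaternion kernel, in the spirit
# of the paper's position-given-orientation likelihood. Not the paper's
# actual Bingham-mixture formulation.

import math

def quat_dot(q1, q2):
    """Dot product of two quaternions given as 4-tuples (w, x, y, z)."""
    return sum(a * b for a, b in zip(q1, q2))

def quat_kernel(q1, q2, h=0.1):
    """Similarity weight in (0, 1]; |dot| handles the q ~ -q double cover."""
    return math.exp(-(1.0 - abs(quat_dot(q1, q2))) / h)

identity = (1.0, 0.0, 0.0, 0.0)
quat_kernel(identity, identity)                 # 1.0: identical rotations
quat_kernel(identity, (-1.0, 0.0, 0.0, 0.0))    # also 1.0: -q is the same rotation
quat_kernel(identity, (0.7071, 0.0, 0.0, 0.7071))  # small: a 90-degree rotation
```

Taking the absolute value of the dot product is what makes the weight well defined on the hypersphere of unit quaternions, the same manifold on which the Bingham mixture is defined.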

86 citations


Proceedings ArticleDOI
09 May 2011
TL;DR: This paper presents the implementation of an architecture that is able to combine a multitude of 2D/3D object recognition and pose estimation techniques in parallel as dynamically loadable plugins, ReIn (REcognition INfrastructure), and introduces two new classifiers designed for robot perception needs.
Abstract: A robust robot perception system intended to enable object manipulation needs to be able to accurately identify objects and their pose at high speeds. Since objects vary considerably in surface properties, rigidity and articulation, no single detector or object estimation method has been shown to provide reliable detection across object types to date. This indicates the need for an architecture that is able to quickly swap detectors, pose estimators, and filters, or to run them in parallel or in series and combine their results, preferably without any code modifications at all. In this paper, we present our implementation of such an infrastructure, ReIn (REcognition INfrastructure), to answer these needs. ReIn is able to combine a multitude of 2D/3D object recognition and pose estimation techniques in parallel as dynamically loadable plugins. It also provides an extremely efficient data passing architecture, and offers the possibility to change the parameters and initial settings of these techniques during their execution. In the course of this work we introduce two new classifiers designed for robot perception needs: BiGGPy (Binarized Gradient Grid Pyramids) for scalable 2D classification and VFH (Viewpoint Feature Histograms) for 3D classification and pose. We then show how these two classifiers can be easily combined using ReIn to solve object recognition and pose identification problems.
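The plugin pattern the abstract describes - detectors behind a common interface, registered at runtime and combined without code changes - can be sketched minimally. All class and method names below are invented for illustration; ReIn itself loads compiled plugins dynamically and moves data between them far more efficiently than this:

```python
# Minimal sketch of the detector-plugin idea behind an infrastructure like
# ReIn: a shared interface, a runtime registry, and merged results.

class Detector:
    """Common interface every detector plugin implements."""
    name = "base"
    def detect(self, image):
        raise NotImplementedError

class EdgeDetector(Detector):      # stand-in for a 2D classifier plugin
    name = "edges"
    def detect(self, image):
        return [("edge-object", 0.6)]

class ColorDetector(Detector):     # stand-in for another plugin
    name = "color"
    def detect(self, image):
        return [("red-object", 0.9)]

class Pipeline:
    def __init__(self):
        self._plugins = []
    def register(self, detector):
        """Swap detectors in and out without modifying pipeline code."""
        self._plugins.append(detector)
    def run(self, image):
        """Run every registered detector and merge, best score first."""
        results = []
        for d in self._plugins:
            results.extend(d.detect(image))
        return sorted(results, key=lambda r: -r[1])

pipeline = Pipeline()
pipeline.register(EdgeDetector())
pipeline.register(ColorDetector())
print(pipeline.run(image=None))  # -> [('red-object', 0.9), ('edge-object', 0.6)]
```

Keeping the interface small (here a single `detect` method) is what lets detectors, pose estimators, and filters be exchanged or run side by side without touching the surrounding code.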

55 citations


Proceedings ArticleDOI
16 May 2011
TL;DR: A novel method for solving the challenging problem of generating 3D models of generic object categories from just one single uncalibrated image, leveraging the algorithm proposed in [1], which enables a partial reconstruction of the object from a single view.
Abstract: We present a novel method for solving the challenging problem of generating 3D models of generic object categories from just one single uncalibrated image. Our method leverages the algorithm proposed in [1], which enables a partial reconstruction of the object from a single view. A full reconstruction is achieved in a subsequent object completion stage where modified or state-of-the-art 3D shape and texture completion techniques are used to recover the complete 3D model. We present results of our method on a number of images containing objects from five generic categories (mice, staplers, mugs, cars, and bicycles). We demonstrate (numerically and qualitatively) that our method produces convincing 3D models from a single image using minimal or no human intervention. Our technique is targeted to applications where users are interested in building virtual collections of 3D models of objects, and sharing such models in virtual environments such as Google 3D Warehouse or Second Life (secondlife.com).

11 citations