Showing papers by "Gary Bradski" published in 2012


Book Chapter
05 Nov 2012
TL;DR: A framework for automatic modeling, detection, and tracking of 3D objects with a Kinect that shows how to build the templates automatically from 3D models and how to estimate the 6 degrees-of-freedom pose accurately and in real time.
Abstract: We propose a framework for automatic modeling, detection, and tracking of 3D objects with a Kinect. The detection part is mainly based on the recent template-based LINEMOD approach [1] for object detection. We show how to build the templates automatically from 3D models, and how to estimate the 6 degrees-of-freedom pose accurately and in real time. The pose estimation and the color information allow us to check the detection hypotheses and improve the correct detection rate by 13% with respect to the original LINEMOD. These many improvements make our framework suitable for object manipulation in robotics applications. Moreover, we propose a new dataset made of 15 registered, 1100+ frame video sequences of 15 various objects for the evaluation of future competing methods.

1,114 citations


Proceedings Article
16 Jun 2012
TL;DR: This work proposes a codebook-free and annotation-free approach for fine-grained image categorization, and proposes a novel bagging-based algorithm to build a final classifier by aggregating a set of discriminative yet largely uncorrelated classifiers.
Abstract: Fine-grained categorization refers to the task of classifying objects that belong to the same basic-level class (e.g. different bird species) and share similar shape or visual appearances. Most of the state-of-the-art basic-level object classification algorithms have difficulties in this challenging problem. One reason for this can be attributed to the popular codebook-based image representation, often resulting in loss of subtle image information that is critical for fine-grained classification. Another way to address this problem is to introduce human annotations of object attributes or key points, a tedious process that is also difficult to generalize to new tasks. In this work, we propose a codebook-free and annotation-free approach for fine-grained image categorization. Instead of using vector-quantized codewords, we obtain an image representation by running a high-throughput template matching process using a large number of randomly generated image templates. We then propose a novel bagging-based algorithm to build a final classifier by aggregating a set of discriminative yet largely uncorrelated classifiers. Experimental results show that our method outperforms state-of-the-art classification approaches on the Caltech-UCSD Birds dataset.

170 citations
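
The two ideas in the abstract above (template-response features instead of codewords, then an aggregate of classifiers) can be illustrated with a small sketch. This is not the authors' algorithm: the helper names, the sum-of-squared-differences score, the nearest-centroid base learner, and the plain bootstrap voting are all assumptions chosen to keep the example short, and templates are assumed to be random crops of training images.

```python
import random

def crop(image, top, left, size):
    """Return a size x size patch of a 2D list-of-lists image."""
    return [row[left:left + size] for row in image[top:top + size]]

def match_score(image, template):
    """Best template response: the highest negative sum of squared differences."""
    size = len(template)
    best = float("-inf")
    for top in range(len(image) - size + 1):
        for left in range(len(image[0]) - size + 1):
            ssd = 0.0
            for dy in range(size):
                for dx in range(size):
                    diff = image[top + dy][left + dx] - template[dy][dx]
                    ssd += diff * diff
            best = max(best, -ssd)
    return best

def random_templates(images, count, size, rng):
    """Assumption: templates are random crops taken from the given images."""
    templates = []
    for _ in range(count):
        image = rng.choice(images)
        top = rng.randrange(len(image) - size + 1)
        left = rng.randrange(len(image[0]) - size + 1)
        templates.append(crop(image, top, left, size))
    return templates

def featurize(image, templates):
    """Describe an image by its best response to every template."""
    return [match_score(image, t) for t in templates]

def train_centroids(features, labels):
    """Base learner (an assumption): one mean feature vector per class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def predict_centroid(centroids, vec):
    """Pick the class whose centroid is closest in squared Euclidean distance."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, vec))
    return min(centroids, key=lambda lab: dist(centroids[lab]))

def train_bagged(features, labels, rounds, rng):
    """Plain bagging: one base learner per bootstrap resample of the data."""
    n = len(features)
    models = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_centroids([features[i] for i in idx],
                                      [labels[i] for i in idx]))
    return models

def predict_bagged(models, vec):
    """Majority vote over the bagged base learners."""
    votes = {}
    for model in models:
        label = predict_centroid(model, vec)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

A typical use would call `random_templates` once on the training images, run `featurize` on every image, and pass the resulting vectors to `train_bagged`. The exhaustive matching loop is written for clarity only; it does not reflect the high-throughput matching the abstract describes.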


Book Chapter
07 Oct 2012
TL;DR: Each step of the pipeline is shown, starting with the fast reconstruction of arbitrary 3D objects, followed by the automatic learning and the robust detection and pose estimation of the reconstructed objects in real time, which makes the framework suitable for object manipulation, e.g., in robotics applications.
Abstract: In this technical demonstration, we will show our framework of automatic modeling, detection, and tracking of arbitrary texture-less 3D objects with a Kinect. The detection is mainly based on the recent template-based LINEMOD approach [1], while the automatic template learning from reconstructed 3D models, the fast pose estimation, and the quick and robust false-positive removal are novel additions. In this demonstration, we will show each step of our pipeline, starting with the fast reconstruction of arbitrary 3D objects, followed by the automatic learning and the robust detection and pose estimation of the reconstructed objects in real time. As we will show, this makes our framework suitable for object manipulation, e.g., in robotics applications.

119 citations


Book
09 Jul 2012
TL;DR: Algorithms for segmentation, pose estimation, and recognition of transparent objects from a single RGB-D image from a Kinect sensor are proposed; the Kinect's weakness in perceiving transparent objects is exploited for their segmentation, and edge fitting is used for recognition and pose estimation.
Abstract: Recognizing and determining the 6DOF pose of transparent objects is necessary in order for robots to manipulate such objects. However, it is a challenging problem for computer vision. We propose new algorithms for segmentation, pose estimation and recognition of transparent objects from a single RGB-D image from a Kinect sensor. Kinect's weakness in the perception of transparent objects is exploited in their segmentation. Following segmentation, edge fitting is used for recognition and pose estimation. A 3D model of the object is created automatically during training and it is required for pose estimation and recognition. The algorithm is evaluated in different conditions of a domestic environment within the framework of a robotic grasping pipeline where it demonstrates high grasping success rates compared to the state-of-the-art results. The method does not currently handle occlusions or overlapping transparent objects, but it is robust against non-transparent clutter.

79 citations


Patent
Gary Bradski
03 Jan 2012
TL;DR: A non-binary affinity measure between any two data points may be determined for supervised classifiers, including tree, kernel-based, nearest neighbor-based, and neural network classifiers.
Abstract: A non-binary affinity measure between any two data points for a supervised classifier may be determined. For example, affinity measures may be determined for tree, kernel-based, nearest neighbor-based and neural network supervised classifiers. By providing non-binary affinity measures using supervised classifiers, more information may be provided for clustering, analyzing and, particularly, for visualizing the results of data mining.

8 citations
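
To make the idea of a non-binary affinity from a tree classifier concrete, here is a minimal sketch. It does not reproduce the patent's method. It swaps in the well-known proximity used with bagged tree ensembles: the affinity of two points is the fraction of trees that route both to the same leaf. The one-split trees (stumps), the bootstrap training, and all function names are assumptions made for illustration.

```python
import random

def minority_count(labels):
    """Number of labels in a leaf that disagree with the leaf's majority label."""
    return len(labels) - max((labels.count(v) for v in set(labels)), default=0)

def train_stump(points, labels, rng):
    """Fit a one-split tree on a bootstrap sample; returns (feature, threshold)."""
    n = len(points)
    sample = [rng.randrange(n) for _ in range(n)]  # bootstrap indices
    best = None  # (errors, feature, threshold)
    for feature in range(len(points[0])):
        for i in sample:
            threshold = points[i][feature]
            left = [labels[j] for j in sample if points[j][feature] <= threshold]
            right = [labels[j] for j in sample if points[j][feature] > threshold]
            errors = minority_count(left) + minority_count(right)
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    return best[1], best[2]

def leaf_of(stump, point):
    """Leaf index (0 = left, 1 = right) that a point falls into."""
    feature, threshold = stump
    return 0 if point[feature] <= threshold else 1

def train_ensemble(points, labels, size, rng):
    """Train `size` stumps, each on its own bootstrap resample."""
    return [train_stump(points, labels, rng) for _ in range(size)]

def affinity(stumps, a, b):
    """Fraction of stumps placing a and b in the same leaf, a value in [0, 1]."""
    shared = sum(leaf_of(s, a) == leaf_of(s, b) for s in stumps)
    return shared / len(stumps)
```

Because the value is a fraction and not a same-class yes/no answer, it can be arranged into a pairwise matrix and passed to a clustering or visualization routine, which is the kind of use the abstract mentions.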