Author

Gary Bradski

Other affiliations: Intel, Stanford University, Google
Bio: Gary Bradski is an academic researcher from Willow Garage. The author has contributed to research in topics including Pose and Object (computer science). The author has an h-index of 41 and has co-authored 82 publications receiving 23,763 citations. Previous affiliations of Gary Bradski include Intel and Stanford University.


Papers
Proceedings ArticleDOI
Gary Bradski, V. Pisarevsky
15 Jun 2000
TL;DR: Intel's Microcomputer Research Lab has been developing a highly optimized Computer Vision Library (CVLib) that automatically detects processor type and loads the appropriate MMX™ technology assembly-tuned module for that processor.
Abstract: Intel's Microcomputer Research Lab has been developing a highly optimized Computer Vision Library (CVLib) that automatically detects processor type and loads the appropriate MMX™ technology assembly-tuned module for that processor. MMX-optimized functions are from 2 to 8 times faster than optimized C functions. We will be demonstrating various algorithms supported by CVLib and handing out CDs containing the library.

113 citations
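The load-time dispatch described above is the core mechanism: detect the processor's capabilities once, then route every call to the fastest available module, falling back to portable code otherwise. A minimal Python sketch of that pattern follows; it is my own illustration, not CVLib code, and the feature detection and `threshold` routine are hypothetical stand-ins.

```python
# Sketch of capability-based dispatch: pick an implementation once at load time.

def _detect_cpu_features():
    """Return the set of SIMD feature flags reported by the OS (Linux-only here)."""
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

def _threshold_generic(pixels, t):
    # Portable fallback, analogous to the "optimized C" path in the paper.
    return [255 if p > t else 0 for p in pixels]

def _threshold_simd(pixels, t):
    # Stand-in for an assembly-tuned module; here it just reuses the generic code.
    return _threshold_generic(pixels, t)

# Choose the implementation once, as CVLib did when loading its tuned module.
_FEATURES = _detect_cpu_features()
threshold = _threshold_simd if "sse2" in _FEATURES else _threshold_generic

if __name__ == "__main__":
    print(threshold([10, 130, 200, 40], 100))  # -> [0, 255, 255, 0]
```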

Posted Content
TL;DR: Kornia is composed of a set of modules containing operators that can be inserted inside neural networks to train models to perform image transformations, camera calibration, epipolar geometry, and low level image processing techniques, such as filtering and edge detection that operate directly on high dimensional tensor representations.
Abstract: This work presents Kornia, an open source computer vision library built upon a set of differentiable routines and modules that aims to solve generic computer vision problems. The package uses PyTorch as its main backend, not only for efficiency but also to take advantage of the reverse auto-differentiation engine to define and compute the gradient of complex functions. Inspired by OpenCV, Kornia is composed of a set of modules containing operators that can be integrated into neural networks to train models to perform a wide range of operations including image transformations, camera calibration, epipolar geometry, and low level image processing techniques, such as filtering and edge detection that operate directly on high dimensional tensor representations on graphical processing units, generating faster systems. Examples of classical vision problems implemented using our framework are provided including a benchmark comparing to existing vision libraries.

106 citations
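Since Kornia is an open-source package, a short example makes "differentiable classical operators" concrete. The sketch below assumes a recent `kornia` and `torch` install (exact module paths can vary between versions): it blurs a batched image tensor, takes Sobel edges, and backpropagates through both operators to the input.

```python
import torch
import kornia

# Batched image tensor in B x C x H x W layout, tracked by autograd.
image = torch.rand(1, 1, 64, 64, requires_grad=True)

# Classical operators expressed as differentiable tensor ops.
blurred = kornia.filters.gaussian_blur2d(image, kernel_size=(5, 5), sigma=(1.5, 1.5))
edges = kornia.filters.sobel(blurred)   # differentiable edge map

# Gradients flow from a scalar loss back to the input image.
loss = edges.mean()
loss.backward()
print(edges.shape, image.grad.abs().sum())
```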

Proceedings Article
06 Jan 2007
TL;DR: This paper presents a novel method for identifying and tracking objects in multiresolution digital video of partially cluttered environments and uses a learned "attentive" interest map on a low resolution data stream to direct a high resolution "fovea".
Abstract: Human object recognition in a physical 3-d environment is still far superior to that of any robotic vision system. We believe that one reason (out of many) for this--one that has not heretofore been significantly exploited in the artificial vision literature--is that humans use a fovea to fixate on, or near an object, thus obtaining a very high resolution image of the object and rendering it easy to recognize. In this paper, we present a novel method for identifying and tracking objects in multiresolution digital video of partially cluttered environments. Our method is motivated by biological vision systems and uses a learned "attentive" interest map on a low resolution data stream to direct a high resolution "fovea." Objects that are recognized in the fovea can then be tracked using peripheral vision. Because object recognition is run only on a small foveal image, our system achieves performance in real-time object recognition and tracking that is well beyond simpler systems.

87 citations
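A rough sketch of the coarse-to-fine idea, not the authors' system: score a low-resolution frame with an interest map, then crop a high-resolution "fovea" around the peak for recognition. All names and sizes below are illustrative assumptions.

```python
import numpy as np

def fovea_crop(frame_hi, interest_map, fovea=64):
    """frame_hi: HxW high-res image; interest_map: hxw low-res saliency scores."""
    h_lo, w_lo = interest_map.shape
    H, W = frame_hi.shape[:2]
    # Peak of the interest map in low-resolution coordinates.
    y_lo, x_lo = np.unravel_index(np.argmax(interest_map), interest_map.shape)
    # Map the peak back to high-resolution coordinates.
    y = int((y_lo + 0.5) * H / h_lo)
    x = int((x_lo + 0.5) * W / w_lo)
    # Clamp so the fovea window stays inside the frame.
    y0 = min(max(y - fovea // 2, 0), H - fovea)
    x0 = min(max(x - fovea // 2, 0), W - fovea)
    return frame_hi[y0:y0 + fovea, x0:x0 + fovea]

frame = np.random.rand(480, 640)
interest = np.random.rand(30, 40)      # stand-in for the learned "attentive" map
patch = fovea_crop(frame, interest)
print(patch.shape)                     # (64, 64) region passed to the recognizer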

Proceedings ArticleDOI
27 Jun 2011
TL;DR: A new method to model the spatial distribution of oriented local features on an object is presented, which is used to infer object pose given small sets of observed local features.
Abstract: The success of personal service robotics hinges upon reliable manipulation of everyday household objects, such as dishes, bottles, containers, and furniture. In order to accurately manipulate such objects, robots need to know objects’ full 6-DOF pose, which is made difficult by clutter and occlusions. Many household objects have regular structure that can be used to effectively guess object pose given an observation of just a small patch on the object. In this paper, we present a new method to model the spatial distribution of oriented local features on an object, which we use to infer object pose given small sets of observed local features. The orientation distribution for local features is given by a mixture of Binghams on the hypersphere of unit quaternions, while the local feature distribution for position given orientation is given by a locally-weighted (Quaternion kernel) likelihood. Experiments on 3D point cloud data of cluttered and uncluttered scenes generated from a structured light stereo image sensor validate our approach.

86 citations
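For intuition about the orientation model, here is a small numpy sketch of evaluating an unnormalized mixture of Bingham densities over unit quaternions. It is my own illustration of the distribution named in the abstract, not the paper's code; normalizing constants and the position model are omitted.

```python
import numpy as np

def bingham_unnorm(q, V, lam):
    """q: unit quaternion (4,); V: 4x3 orthonormal directions; lam: 3 concentrations (<= 0)."""
    return np.exp(np.sum(lam * (V.T @ q) ** 2))

def bingham_mixture_unnorm(q, components, weights):
    return sum(w * bingham_unnorm(q, V, lam)
               for w, (V, lam) in zip(weights, components))

rng = np.random.default_rng(0)

def random_component():
    # Random orthonormal basis of R^4; keep 3 columns as the non-mode directions.
    Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
    return Q[:, 1:], np.array([-10.0, -10.0, -2.0])

components = [random_component(), random_component()]
weights = [0.6, 0.4]

q = rng.normal(size=4)
q /= np.linalg.norm(q)                 # a random unit quaternion
print(bingham_mixture_unnorm(q, components, weights))
```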

Patent
14 Mar 2014
TL;DR: In this patent, a robotic manipulator may identify characteristics of a physical object within a physical environment and then determine potential grasp points on the physical object corresponding to points at which a gripper attached to the robotic manipulator is operable to grip the object.
Abstract: Example embodiments may relate to methods and systems for selecting a grasp point on an object. In particular, a robotic manipulator may identify characteristics of a physical object within a physical environment. Based on the identified characteristics, the robotic manipulator may determine potential grasp points on the physical object corresponding to points at which a gripper attached to the robotic manipulator is operable to grip the physical object. Subsequently, the robotic manipulator may determine a motion path for the gripper to follow in order to move the physical object to a drop-off location for the physical object and then select a grasp point, from the potential grasp points, based on the determined motion path. After selecting the grasp point, the robotic manipulator may grip the physical object at the selected grasp point with the gripper and move the physical object through the determined motion path to the drop-off location.

84 citations
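The claimed flow reduces to: enumerate candidate grasp points, score each by the motion path the gripper would have to execute to the drop-off location, and pick the best. A toy Python sketch of that selection step follows; the straight-line path cost is a stand-in assumption, not the patent's planner.

```python
import math

def path_cost(grasp_point, drop_off):
    """Stand-in for motion planning: straight-line distance from grasp to drop-off."""
    return math.dist(grasp_point, drop_off)

def select_grasp(candidates, drop_off):
    """candidates: (x, y, z) grasp points deemed reachable by the gripper."""
    return min(candidates, key=lambda g: path_cost(g, drop_off))

candidates = [(0.20, 0.10, 0.50), (0.25, 0.05, 0.45), (0.18, 0.12, 0.55)]
drop_off = (0.60, -0.30, 0.40)
print(select_grasp(candidates, drop_off))  # grasp point with the cheapest path
```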


Cited by
Journal ArticleDOI
Jeffrey Dean, Sanjay Ghemawat
06 Dec 2004
TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.

20,309 citations
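The programming model itself fits in a few lines. Below is a single-process word-count sketch of the map and reduce interfaces the abstract describes; the real system partitions the input, schedules the phases across thousands of machines, and handles failures, none of which is modeled here.

```python
from collections import defaultdict

def map_fn(document):
    # Emit an intermediate (word, 1) pair for every word in the document.
    for word in document.split():
        yield word, 1

def reduce_fn(word, counts):
    # Merge all intermediate values that share the same key.
    return word, sum(counts)

def map_reduce(documents, map_fn, reduce_fn):
    intermediate = defaultdict(list)
    for doc in documents:                          # "map" phase
        for key, value in map_fn(doc):
            intermediate[key].append(value)
    return dict(reduce_fn(k, v) for k, v in intermediate.items())  # "reduce" phase

docs = ["the quick brown fox", "the lazy dog", "the fox"]
print(map_reduce(docs, map_fn, reduce_fn))         # {'the': 3, 'fox': 2, ...}
```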

Journal ArticleDOI
Jeffrey Dean, Sanjay Ghemawat
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.

17,663 citations

Book
23 May 2011
TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.
Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.

17,433 citations
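As a concrete instance of the method the review surveys, here is a short numpy sketch of ADMM applied to the lasso: an x-update that solves a ridge-like linear system, a z-update by soft-thresholding, and a dual update. The parameter choices (rho, lam, iteration count) are illustrative.

```python
import numpy as np

def soft_threshold(v, k):
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def lasso_admm(A, b, lam, rho=1.0, iters=200):
    """Minimize (1/2)||Ax - b||^2 + lam * ||x||_1 via ADMM."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    L = np.linalg.cholesky(AtA + rho * np.eye(n))   # factor once, reuse every iteration
    for _ in range(iters):
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        z = soft_threshold(x + u, lam / rho)
        u = u + x - z
    return z

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.0, 0.5]
b = A @ x_true + 0.01 * rng.normal(size=50)
print(np.round(lasso_admm(A, b, lam=1.0), 2))       # sparse estimate, first entries nonzero
```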

Journal ArticleDOI
TL;DR: It is proved the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks - discontinuity-preserving smoothing and image segmentation - are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.

11,727 citations
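The recursive procedure whose convergence the paper proves is easy to state: repeatedly move a point to the kernel-weighted mean of its neighbors until it stops moving. A small numpy sketch with a Gaussian kernel follows; the bandwidth and data are illustrative, not from the paper.

```python
import numpy as np

def mean_shift_point(x, data, bandwidth=1.0, iters=50, tol=1e-6):
    """Iterate the mean shift update from x until it converges to a nearby mode."""
    for _ in range(iters):
        # Gaussian kernel weights for all samples relative to the current point.
        w = np.exp(-0.5 * np.sum((data - x) ** 2, axis=1) / bandwidth ** 2)
        x_new = (w[:, None] * data).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:      # stationary point reached
            break
        x = x_new
    return x

rng = np.random.default_rng(0)
data = np.vstack([rng.normal([0, 0], 0.3, size=(100, 2)),
                  rng.normal([3, 3], 0.3, size=(100, 2))])
print(mean_shift_point(np.array([2.5, 2.6]), data))  # converges near the (3, 3) mode
```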

Proceedings ArticleDOI
16 Jun 2012
TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.
Abstract: Today, visual recognition systems are still rarely employed in robotics applications. Perhaps one of the main reasons for this is the lack of demanding benchmarks that mimic such scenarios. In this paper, we take advantage of our autonomous driving platform to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection. Our recording platform is equipped with four high resolution video cameras, a Velodyne laser scanner and a state-of-the-art localization system. Our benchmarks comprise 389 stereo and optical flow image pairs, stereo visual odometry sequences of 39.2 km length, and more than 200k 3D object annotations captured in cluttered scenarios (up to 15 cars and 30 pedestrians are visible per image). Results from state-of-the-art algorithms reveal that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world. Our goal is to reduce this bias by providing challenging benchmarks with novel difficulties to the computer vision community. Our benchmarks are available online at: www.cvlibs.net/datasets/kitti

11,283 citations