scispace - formally typeset
Search or ask a question
Author

David G. Lowe

Bio: David G. Lowe is an academic researcher from University of British Columbia. The author has contributed to research in topics: Cognitive neuroscience of visual object recognition & Feature (computer vision). The author has an hindex of 52, co-authored 108 publications receiving 83353 citations. Previous affiliations of David G. Lowe include Courant Institute of Mathematical Sciences & Google.


Papers
More filters
Book
05 Sep 2015
TL;DR: It will be shown that the effective application of the viewpoint consistency constraint, in conjunction with bottom-up image description based upon principles of perceptual organization, can lead to robust three-dimensional object recognition from single gray-scale images.
Abstract: The viewpoint consistency constraint requires that the locations of all object features in an image must be consistent with projection from a single viewpoint. The application of this constraint is central to the problem of achieving robust recognition, since it allows the spatial information in an image to be compared with prior knowledge of an object's shape to the full degree of available image resolution. In addition, the constraint greatly reduces the size of the search space during model-based matching by allowing a few initial matches to provide tight constraints for the locations of other model features. Unfortunately, while simple to state, this constraint has seldom been effectively applied in model-based computer vision systems. This paper reviews the history of attempts to make use of the viewpoint consistency constraint and then describes a number of new techniques for applying it to the process of model-based recognition. A method is presented for probabilistically evaluating new potential matches to extend and refine an initial viewpoint estimate. This evaluation allows the model-based verification process to proceed without the expense of backtracking or search. It will be shown that the effective application of the viewpoint consistency constraint, in conjunction with bottom-up image description based upon principles of perceptual organization, can lead to robust three-dimensional object recognition from single gray-scale images.

214 citations

Journal ArticleDOI
TL;DR: An intelligent system that attempts to perform robust object recognition in a realistic scenario, where a mobile robot moving through an environment must use the images collected from its camera directly to recognise objects.

198 citations

Proceedings ArticleDOI
16 Jun 2012
TL;DR: Local Naive Bayes Nearest Neighbor (Local NBNN) as discussed by the authors is an improvement to the NBNN image classification algorithm that increases classification accuracy and improves its ability to scale to large numbers of object classes.
Abstract: We present Local Naive Bayes Nearest Neighbor, an improvement to the NBNN image classification algorithm that increases classification accuracy and improves its ability to scale to large numbers of object classes. The key observation is that only the classes represented in the local neighborhood of a descriptor contribute significantly and reliably to their posterior probability estimates. Instead of maintaining a separate search structure for each class's training descriptors, we merge all of the reference data together into one search structure, allowing quick identification of a descriptor's local neighborhood. We show an increase in classification accuracy when we ignore adjustments to the more distant classes and show that the run time grows with the log of the number of classes rather than linearly in the number of classes as did the original. Local NBNN gives a 100 times speed-up over the original NBNN on the Caltech 256 dataset. We also provide the first head-to-head comparison of NBNN against spatial pyramid methods using a common set of input features. We show that local NBNN outperforms all previous NBNN based methods and the original spatial pyramid model. However, we find that local NBNN, while competitive with, does not beat state-of-the-art spatial pyramid methods that use local soft assignment and max-pooling.

198 citations

Book ChapterDOI
TL;DR: This chapter describes a system for constructing 3D metric models from multiple images taken with an uncalibrated handheld camera, recognizing these models in new images, and precisely solving for object pose.
Abstract: Many applications of 3D object recognition, such as augmented reality or robotic manipulation, require an accurate solution for the 3D pose of the recognized objects. This is best accomplished by building a metrically accurate 3D model of the object and all its feature locations, and then fitting this model to features detected in new images. In this chapter, we describe a system for constructing 3D metric models from multiple images taken with an uncalibrated handheld camera, recognizing these models in new images, and precisely solving for object pose. This is demonstrated in an augmented reality application where objects must be recognized, tracked, and superimposed on new images taken from arbitrary viewpoints without perceptible jitter. This approach not only provides for accurate pose, but also allows for integration of features from multiple training images into a single model that provides for more reliable recognition.

196 citations

Journal ArticleDOI
TL;DR: In Sossi's formulation of the Fourier transform method of optical multilayer design the refractive-index profile is derived for an inhomogeneous layer of infinite extent having the desired spectral transmittance.
Abstract: In Sossi's formulation of the Fourier transform method of optical multilayer design the refractive-index profile is derived for an inhomogeneous layer of infinite extent having the desired spectral transmittance. This layer is then approximated by a finite system of discrete homogeneous layers. Because it does not make any assumptions about the refractive indices, thicknesses, or number of layers, it is the most powerful analytical method proposed so far. The method has been programmed for a computer and combined with other numerical design procedures. With the program it is possible to design filters with almost any desired transmittance characteristics using realistic refractive indices.

192 citations


Cited by
More filters
Proceedings ArticleDOI
Jia Deng1, Wei Dong1, Richard Socher1, Li-Jia Li1, Kai Li1, Li Fei-Fei1 
20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Abstract: The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called “ImageNet”, a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.

49,639 citations

Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

Journal ArticleDOI
TL;DR: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis that facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system.
Abstract: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.

43,540 citations

Proceedings ArticleDOI
20 Jun 2005
TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Abstract: We study the question of feature sets for robust visual object recognition; adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.

31,952 citations

Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations