Multiclass Object Recognition with Sparse, Localized Features

doi:10.1109/CVPR.2006.200

Proceedings Article

J. Mutch, +1 more

- Vol. 1, pp 11-18

TLDR

Applies a biologically inspired model of visual object recognition to the multiclass object categorization problem; the model modifies that of Serre, Wolf, and Poggio and demonstrates the value of retaining some position and scale information above the intermediate feature level.

Abstract:

We apply a biologically inspired model of visual object recognition to the multiclass object categorization problem. Our model modifies that of Serre, Wolf, and Poggio. As in that work, we first apply Gabor filters at all positions and scales; feature complexity and position/scale invariance are then built up by alternating template matching and max pooling operations. We refine the approach in several biologically plausible ways, using simple versions of sparsification and lateral inhibition. We demonstrate the value of retaining some position and scale information above the intermediate feature level. Using feature selection we arrive at a model that performs better with fewer features. Our final model is tested on the Caltech 101 object categories and the UIUC car localization task, in both cases achieving state-of-the-art performance. The results strengthen the case for using this class of model in computer vision.
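
The pipeline the abstract describes (Gabor filtering, then alternating max pooling and template matching) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it works at a single scale only, omits sparsification, lateral inhibition, and feature selection, and every function name and parameter value (filter size, wavelength, pooling width, RBF width) is an assumption chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gabor_kernel(size, theta, wavelength, sigma, gamma=0.3):
    """Real-valued Gabor filter, normalized to zero mean and unit norm."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr**2 + (gamma * yr) ** 2) / (2 * sigma**2))
    g = g * np.cos(2 * np.pi * xr / wavelength)
    g -= g.mean()
    return g / np.linalg.norm(g)


def s1_layer(image, n_orient=4, size=11):
    """Gabor responses at every valid position; shape (n_orient, H', W')."""
    windows = sliding_window_view(image, (size, size))
    thetas = np.arange(n_orient) * np.pi / n_orient
    # Wavelength and sigma are illustrative values, not taken from the text.
    kernels = np.stack(
        [gabor_kernel(size, t, wavelength=5.6, sigma=4.5) for t in thetas]
    )
    return np.abs(np.einsum("ijkl,okl->oij", windows, kernels))


def c1_layer(s1, pool=4):
    """Max pooling over non-overlapping pool x pool position blocks."""
    o, h, w = s1.shape
    h, w = h - h % pool, w - w % pool
    blocks = s1[:, :h, :w].reshape(o, h // pool, pool, w // pool, pool)
    return blocks.max(axis=(2, 4))


def s2_response(c1, template, sigma=1.0):
    """Template matching: Gaussian RBF of the distance to each C1 patch."""
    patches = sliding_window_view(c1, template.shape)[0]
    d2 = ((patches - template) ** 2).sum(axis=(2, 3, 4))
    return np.exp(-d2 / (2 * sigma**2))


def c2_feature(s2):
    """Global max pooling: one position-invariant value per template."""
    return s2.max()


# Usage: a template cut from an image's own C1 layer, matched back to it.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
c1 = c1_layer(s1_layer(image))
template = c1[:, 1:4, 1:4].copy()
feature = c2_feature(s2_response(c1, template))
```

The final global max discards all position information. The abstract's point about retaining some position and scale information would correspond to restricting that max to a neighborhood around where the template was originally sampled, which this sketch does not attempt.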

Citations

Proceedings Article

Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

Honglak Lee, +3 more

TL;DR: The convolutional deep belief network is presented, a hierarchical generative model that scales to realistic image sizes, is translation-invariant, and supports efficient bottom-up and top-down probabilistic inference.

Proceedings Article

What is the best multi-stage architecture for object recognition?

Kevin Jarrett, +3 more

TL;DR: It is shown that using non-linearities that include rectification and local contrast normalization is the single most important ingredient for good accuracy on object recognition benchmarks and that two stages of feature extraction yield better accuracy than one.

Proceedings Article

Convolutional networks and applications in vision

Yann LeCun, +2 more

TL;DR: New unsupervised learning algorithms and new non-linear stages that allow ConvNets to be trained with very few labeled samples are described, with applications including visual object recognition and vision-based navigation for off-road mobile robots.

Journal Article

Robust Object Recognition with Cortex-Like Mechanisms

Thomas Serre, +4 more

- 01 Mar 2007 -

IEEE Transactions on Pattern Analysis an...

TL;DR: A hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation is described.

Proceedings Article

Representing shape with a spatial pyramid kernel

Anna Bosch, +2 more

TL;DR: This work introduces a descriptor that represents local image shape and its spatial layout, together with a spatial pyramid kernel that is designed so that the shape correspondence between two images can be measured by the distance between their descriptors using the kernel.

References

Journal Article

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

TL;DR: Gradient-based learning with multilayer networks can synthesize a complex decision surface that classifies high-dimensional patterns such as handwritten characters, and graph transformer networks (GTNs) are proposed for globally training multi-module recognition systems.

Proceedings Article

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Svetlana Lazebnik, +2 more

TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.

Journal Article

Emergence of simple-cell receptive field properties by learning a sparse code for natural images

Bruno A. Olshausen, +2 more

- 13 Jun 1996 -

Nature

TL;DR: It is shown that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex.

Proceedings Article

Visual categorization with bags of keypoints

Gabriela Csurka

TL;DR: This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches and is shown to be simple, computationally efficient, and intrinsically invariant.

Journal Article

Neocognitron: A Self Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position

Kunihiko Fukushima

- 01 Jan 1980 -

Biological Cybernetics

TL;DR: A neural network model for a mechanism of visual pattern recognition that is self-organized by “learning without a teacher” and acquires the ability to recognize stimulus patterns based on the geometrical similarity of their shapes without being affected by their positions.

Related Papers (5)

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

Neocognitron: A Self Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position

Kunihiko Fukushima

- 01 Jan 1980 -

Biological Cybernetics

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Histograms of oriented gradients for human detection

Gradient-based learning applied to document recognition
